[v1,04/20] perf jevents: Add tsx metric group for Intel models

Message ID 20240229001806.4158429-5-irogers@google.com
State New
Headers
Series Python generated Intel metrics |

Commit Message

Ian Rogers Feb. 29, 2024, 12:17 a.m. UTC
  Allow duplicated metric to be dropped from json files.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)
  

Comments

Liang, Kan Feb. 29, 2024, 9:15 p.m. UTC | #1
On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> Allow duplicated metric to be dropped from json files.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
> 
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index 20c25d142f24..1096accea2aa 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -7,6 +7,7 @@ import argparse
>  import json
>  import math
>  import os
> +from typing import Optional
>  
>  parser = argparse.ArgumentParser(description="Intel perf json generator")
>  parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
>      ])
>  
>  
> +def Tsx() -> Optional[MetricGroup]:
> +    if args.model not in [
> +        'alderlake',
> +        'cascadelakex',
> +        'icelake',
> +        'icelakex',
> +        'rocketlake',
> +        'sapphirerapids',
> +        'skylake',
> +        'skylakex',
> +        'tigerlake',> +    ]:

Can we get ride of the model list? Otherwise, we have to keep updating
the list.

> +        return None
> +> +    pmu = "cpu_core" if args.model == "alderlake" else "cpu"

Is it possible to change the check to the existence of the "cpu" PMU
here? has_pmu("cpu") ? "cpu" : "cpu_core"

> +    cycles = Event('cycles')
> +    cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> +    transaction_start = Event(f'{pmu}/tx\-start/')
> +    cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> +    metrics = [
> +        Metric('tsx_transactional_cycles',
> +                      'Percentage of cycles within a transaction region.',
> +                      Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> +                      '100%'),
> +        Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> +                      Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> +                                    has_event(cycles_in_tx),
> +                                    0),
> +                      '100%'),
> +        Metric('tsx_cycles_per_transaction',
> +                      'Number of cycles within a transaction divided by the number of transactions.',
> +                      Select(cycles_in_tx / transaction_start,
> +                                    has_event(cycles_in_tx),
> +                                    0),
> +                      "cycles / transaction"),
> +    ]
> +    if args.model != 'sapphirerapids':

Add the "tsx_cycles_per_elision" metric only if
has_event(f'{pmu}/el\-start/')?

Thanks,
Kan

> +        elision_start = Event(f'{pmu}/el\-start/')
> +        metrics += [
> +            Metric('tsx_cycles_per_elision',
> +                          'Number of cycles within a transaction divided by the number of elisions.',
> +                          Select(cycles_in_tx / elision_start,
> +                                        has_event(elision_start),
> +                                        0),
> +                          "cycles / elision"),
> +        ]
> +    return MetricGroup('transaction', metrics)
> +
> +
>  all_metrics = MetricGroup("", [
>      Idle(),
>      Rapl(),
>      Smi(),
> +    Tsx(),
>  ])
>  
>  if args.metricgroups:
  
Ian Rogers March 1, 2024, 1:01 a.m. UTC | #2
On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > Allow duplicated metric to be dropped from json files.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
> >  1 file changed, 51 insertions(+)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index 20c25d142f24..1096accea2aa 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -7,6 +7,7 @@ import argparse
> >  import json
> >  import math
> >  import os
> > +from typing import Optional
> >
> >  parser = argparse.ArgumentParser(description="Intel perf json generator")
> >  parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> > @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
> >      ])
> >
> >
> > +def Tsx() -> Optional[MetricGroup]:
> > +    if args.model not in [
> > +        'alderlake',
> > +        'cascadelakex',
> > +        'icelake',
> > +        'icelakex',
> > +        'rocketlake',
> > +        'sapphirerapids',
> > +        'skylake',
> > +        'skylakex',
> > +        'tigerlake',> +    ]:
>
> Can we get ride of the model list? Otherwise, we have to keep updating
> the list.

Do we expect the list to update? :-) The issue is the events are in
sysfs and not the json. If we added the tsx events to json then this
list wouldn't be necessary, but it also would mean the events would be
present in "perf list" even when TSX is disabled.

> > +        return None
> > +> +    pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>
> Is it possible to change the check to the existence of the "cpu" PMU
> here? has_pmu("cpu") ? "cpu" : "cpu_core"

The "Unit" on "cpu" events in json always just blank. On hybrid it is
either "cpu_core" or "cpu_atom", so I can make this something like:

pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"

which would be a build time test.


> > +    cycles = Event('cycles')
> > +    cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> > +    transaction_start = Event(f'{pmu}/tx\-start/')
> > +    cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> > +    metrics = [
> > +        Metric('tsx_transactional_cycles',
> > +                      'Percentage of cycles within a transaction region.',
> > +                      Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> > +                      '100%'),
> > +        Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> > +                      Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> > +                                    has_event(cycles_in_tx),
> > +                                    0),
> > +                      '100%'),
> > +        Metric('tsx_cycles_per_transaction',
> > +                      'Number of cycles within a transaction divided by the number of transactions.',
> > +                      Select(cycles_in_tx / transaction_start,
> > +                                    has_event(cycles_in_tx),
> > +                                    0),
> > +                      "cycles / transaction"),
> > +    ]
> > +    if args.model != 'sapphirerapids':
>
> Add the "tsx_cycles_per_elision" metric only if
> has_event(f'{pmu}/el\-start/')?

It's a sysfs event, so this wouldn't work :-(

Thanks,
Ian

> Thanks,
> Kan
>
> > +        elision_start = Event(f'{pmu}/el\-start/')
> > +        metrics += [
> > +            Metric('tsx_cycles_per_elision',
> > +                          'Number of cycles within a transaction divided by the number of elisions.',
> > +                          Select(cycles_in_tx / elision_start,
> > +                                        has_event(elision_start),
> > +                                        0),
> > +                          "cycles / elision"),
> > +        ]
> > +    return MetricGroup('transaction', metrics)
> > +
> > +
> >  all_metrics = MetricGroup("", [
> >      Idle(),
> >      Rapl(),
> >      Smi(),
> > +    Tsx(),
> >  ])
> >
> >  if args.metricgroups:
  
Liang, Kan March 1, 2024, 2:52 p.m. UTC | #3
On 2024-02-29 8:01 p.m., Ian Rogers wrote:
> On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>>
>>
>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>> Allow duplicated metric to be dropped from json files.
>>>
>>> Signed-off-by: Ian Rogers <irogers@google.com>
>>> ---
>>>  tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
>>>  1 file changed, 51 insertions(+)
>>>
>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>> index 20c25d142f24..1096accea2aa 100755
>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>> @@ -7,6 +7,7 @@ import argparse
>>>  import json
>>>  import math
>>>  import os
>>> +from typing import Optional
>>>
>>>  parser = argparse.ArgumentParser(description="Intel perf json generator")
>>>  parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
>>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
>>>      ])
>>>
>>>
>>> +def Tsx() -> Optional[MetricGroup]:
>>> +    if args.model not in [
>>> +        'alderlake',
>>> +        'cascadelakex',
>>> +        'icelake',
>>> +        'icelakex',
>>> +        'rocketlake',
>>> +        'sapphirerapids',
>>> +        'skylake',
>>> +        'skylakex',
>>> +        'tigerlake',> +    ]:
>>
>> Can we get ride of the model list? Otherwise, we have to keep updating
>> the list.
> 
> Do we expect the list to update? :-) 

Yes, at least for the meteorlake and graniterapids. They should be the
same as alderlake and sapphirerapids. I'm not sure about the future
platforms.

Maybe we can have a if args.model in list here to include all the
non-hybrid models which doesn't support TSX. I think the list should not
be changed shortly.

> The issue is the events are in
> sysfs and not the json. If we added the tsx events to json then this
> list wouldn't be necessary, but it also would mean the events would be
> present in "perf list" even when TSX is disabled.

I think there may an alternative way, to check the RTM events, e.g.,
RTM_RETIRED.START event. We only need to generate the metrics for the
platform which supports the RTM_RETIRED.START event.


> 
>>> +        return None
>>> +> +    pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>>
>> Is it possible to change the check to the existence of the "cpu" PMU
>> here? has_pmu("cpu") ? "cpu" : "cpu_core"
> 
> The "Unit" on "cpu" events in json always just blank. On hybrid it is
> either "cpu_core" or "cpu_atom", so I can make this something like:
> 
> pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
> 
> which would be a build time test.

Yes, I think using the "Unit" is good enough.

> 
> 
>>> +    cycles = Event('cycles')
>>> +    cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>>> +    transaction_start = Event(f'{pmu}/tx\-start/')
>>> +    cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>>> +    metrics = [
>>> +        Metric('tsx_transactional_cycles',
>>> +                      'Percentage of cycles within a transaction region.',
>>> +                      Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>>> +                      '100%'),
>>> +        Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
>>> +                      Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>>> +                                    has_event(cycles_in_tx),
>>> +                                    0),
>>> +                      '100%'),
>>> +        Metric('tsx_cycles_per_transaction',
>>> +                      'Number of cycles within a transaction divided by the number of transactions.',
>>> +                      Select(cycles_in_tx / transaction_start,
>>> +                                    has_event(cycles_in_tx),
>>> +                                    0),
>>> +                      "cycles / transaction"),
>>> +    ]
>>> +    if args.model != 'sapphirerapids':
>>
>> Add the "tsx_cycles_per_elision" metric only if
>> has_event(f'{pmu}/el\-start/')?
> 
> It's a sysfs event, so this wouldn't work :-(

The below is the definition of el-start in the kernel.
EVENT_ATTR_STR(el-start,	el_start,	"event=0xc8,umask=0x1");

The corresponding event in the event list should be HLE_RETIRED.START
      "EventCode": "0xC8",
      "UMask": "0x01",
      "EventName": "HLE_RETIRED.START",

I think we may check the HLE_RETIRED.START instead. If the
HLE_RETIRED.START doesn't exist, I don't see a reason why the
tsx_cycles_per_elision should be supported.

Again, in the virtualization world, it's possible that the
HLE_RETIRED.START exists in the event list but el_start isn't available
in the sysfs. I think it has to be specially handle in the test as well.

Thanks,
Kan

> 
> Thanks,
> Ian
> 
>> Thanks,
>> Kan
>>
>>> +        elision_start = Event(f'{pmu}/el\-start/')
>>> +        metrics += [
>>> +            Metric('tsx_cycles_per_elision',
>>> +                          'Number of cycles within a transaction divided by the number of elisions.',
>>> +                          Select(cycles_in_tx / elision_start,
>>> +                                        has_event(elision_start),
>>> +                                        0),
>>> +                          "cycles / elision"),
>>> +        ]
>>> +    return MetricGroup('transaction', metrics)
>>> +
>>> +
>>>  all_metrics = MetricGroup("", [
>>>      Idle(),
>>>      Rapl(),
>>>      Smi(),
>>> +    Tsx(),
>>>  ])
>>>
>>>  if args.metricgroups:
>
  
Ian Rogers March 1, 2024, 4:37 p.m. UTC | #4
On Fri, Mar 1, 2024 at 6:52 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
>
> On 2024-02-29 8:01 p.m., Ian Rogers wrote:
> > On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
> >>
> >>
> >>
> >> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> >>> Allow duplicated metric to be dropped from json files.
> >>>
> >>> Signed-off-by: Ian Rogers <irogers@google.com>
> >>> ---
> >>>  tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
> >>>  1 file changed, 51 insertions(+)
> >>>
> >>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> >>> index 20c25d142f24..1096accea2aa 100755
> >>> --- a/tools/perf/pmu-events/intel_metrics.py
> >>> +++ b/tools/perf/pmu-events/intel_metrics.py
> >>> @@ -7,6 +7,7 @@ import argparse
> >>>  import json
> >>>  import math
> >>>  import os
> >>> +from typing import Optional
> >>>
> >>>  parser = argparse.ArgumentParser(description="Intel perf json generator")
> >>>  parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> >>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
> >>>      ])
> >>>
> >>>
> >>> +def Tsx() -> Optional[MetricGroup]:
> >>> +    if args.model not in [
> >>> +        'alderlake',
> >>> +        'cascadelakex',
> >>> +        'icelake',
> >>> +        'icelakex',
> >>> +        'rocketlake',
> >>> +        'sapphirerapids',
> >>> +        'skylake',
> >>> +        'skylakex',
> >>> +        'tigerlake',> +    ]:
> >>
> >> Can we get ride of the model list? Otherwise, we have to keep updating
> >> the list.
> >
> > Do we expect the list to update? :-)
>
> Yes, at least for the meteorlake and graniterapids. They should be the
> same as alderlake and sapphirerapids. I'm not sure about the future
> platforms.
>
> Maybe we can have a if args.model in list here to include all the
> non-hybrid models which doesn't support TSX. I think the list should not
> be changed shortly.
>
> > The issue is the events are in
> > sysfs and not the json. If we added the tsx events to json then this
> > list wouldn't be necessary, but it also would mean the events would be
> > present in "perf list" even when TSX is disabled.
>
> I think there may an alternative way, to check the RTM events, e.g.,
> RTM_RETIRED.START event. We only need to generate the metrics for the
> platform which supports the RTM_RETIRED.START event.
>
>
> >
> >>> +        return None
> >>> +> +    pmu = "cpu_core" if args.model == "alderlake" else "cpu"
> >>
> >> Is it possible to change the check to the existence of the "cpu" PMU
> >> here? has_pmu("cpu") ? "cpu" : "cpu_core"
> >
> > The "Unit" on "cpu" events in json always just blank. On hybrid it is
> > either "cpu_core" or "cpu_atom", so I can make this something like:
> >
> > pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
> >
> > which would be a build time test.
>
> Yes, I think using the "Unit" is good enough.
>
> >
> >
> >>> +    cycles = Event('cycles')
> >>> +    cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> >>> +    transaction_start = Event(f'{pmu}/tx\-start/')
> >>> +    cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> >>> +    metrics = [
> >>> +        Metric('tsx_transactional_cycles',
> >>> +                      'Percentage of cycles within a transaction region.',
> >>> +                      Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> >>> +                      '100%'),
> >>> +        Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> >>> +                      Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> >>> +                                    has_event(cycles_in_tx),
> >>> +                                    0),
> >>> +                      '100%'),
> >>> +        Metric('tsx_cycles_per_transaction',
> >>> +                      'Number of cycles within a transaction divided by the number of transactions.',
> >>> +                      Select(cycles_in_tx / transaction_start,
> >>> +                                    has_event(cycles_in_tx),
> >>> +                                    0),
> >>> +                      "cycles / transaction"),
> >>> +    ]
> >>> +    if args.model != 'sapphirerapids':
> >>
> >> Add the "tsx_cycles_per_elision" metric only if
> >> has_event(f'{pmu}/el\-start/')?
> >
> > It's a sysfs event, so this wouldn't work :-(
>
> The below is the definition of el-start in the kernel.
> EVENT_ATTR_STR(el-start,        el_start,       "event=0xc8,umask=0x1");
>
> The corresponding event in the event list should be HLE_RETIRED.START
>       "EventCode": "0xC8",
>       "UMask": "0x01",
>       "EventName": "HLE_RETIRED.START",
>
> I think we may check the HLE_RETIRED.START instead. If the
> HLE_RETIRED.START doesn't exist, I don't see a reason why the
> tsx_cycles_per_elision should be supported.
>
> Again, in the virtualization world, it's possible that the
> HLE_RETIRED.START exists in the event list but el_start isn't available
> in the sysfs. I think it has to be specially handle in the test as well.

So we keep the has_event test on the sysfs event to handle the
virtualization and disabled case. We use  HLE_RETIRED.START to detect
whether the model supports TSX. Should the event be the sysfs or json
version? i.e.

        "MetricExpr": "(cycles\\-t / el\\-start if
has_event(el\\-start) else 0)",

or

        "MetricExpr": "(cycles\\-t / HLE_RETIRED.START if
has_event(el\\-start) else 0)",

I think I favor the former for some consistency with the has_event.

Using HLE_RETIRED.START means the set of TSX models goes from:
        'alderlake',
        'cascadelakex',
        'icelake',
        'icelakex',
        'rocketlake',
        'sapphirerapids',
        'skylake',
        'skylakex',
        'tigerlake',

To:
   broadwell
   broadwellde
   broadwellx
   cascadelakex
   haswell
   haswellx
   icelake
   rocketlake
   skylake
   skylakex

Using RTM_RETIRED.START it goes to:
   broadwell
   broadwellde
   broadwellx
   cascadelakex
   emeraldrapids
   graniterapids
   haswell
   haswellx
   icelake
   icelakex
   rocketlake
   sapphirerapids
   skylake
   skylakex
   tigerlake

So I'm not sure it is working equivalently to what we have today,
which may be good or bad. Here is what I think the code should look
like:

def Tsx() -> Optional[MetricGroup]:
  pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
  cycles = Event('cycles')
  cycles_in_tx = Event(f'{pmu}/cycles\-t/')
  cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
  try:
    # Test if the tsx event is present in the json, prefer the
    # sysfs version so that we can detect its presence at runtime.
    transaction_start = Event("RTM_RETIRED.START")
    transaction_start = Event(f'{pmu}/tx\-start/')
  except:
    return None

  elision_start = None
  try:
    # Elision start isn't supported by all models, but we'll not
    # generate the tsx_cycles_per_elision metric in that
    # case. Again, prefer the sysfs encoding of the event.
    elision_start = Event("HLE_RETIRED.START")
    elision_start = Event(f'{pmu}/el\-start/')
  except:
    pass

  return MetricGroup('transaction', [
      Metric('tsx_transactional_cycles',
             'Percentage of cycles within a transaction region.',
             Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
             '100%'),
      Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted
transactions.',
             Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
                    has_event(cycles_in_tx),
                    0),
             '100%'),
      Metric('tsx_cycles_per_transaction',
             'Number of cycles within a transaction divided by the
number of transactions.',
             Select(cycles_in_tx / transaction_start,
                    has_event(cycles_in_tx),
                    0),
             "cycles / transaction"),
      Metric('tsx_cycles_per_elision',
             'Number of cycles within a transaction divided by the
number of elisions.',
             Select(cycles_in_tx / elision_start,
                    has_event(elision_start),
                    0),
             "cycles / elision") if elision_start else None,
  ], description="Breakdown of transactional memory statistics")

Wdyt?

Thanks,
Ian

> Thanks,
> Kan
>
> >
> > Thanks,
> > Ian
> >
> >> Thanks,
> >> Kan
> >>
> >>> +        elision_start = Event(f'{pmu}/el\-start/')
> >>> +        metrics += [
> >>> +            Metric('tsx_cycles_per_elision',
> >>> +                          'Number of cycles within a transaction divided by the number of elisions.',
> >>> +                          Select(cycles_in_tx / elision_start,
> >>> +                                        has_event(elision_start),
> >>> +                                        0),
> >>> +                          "cycles / elision"),
> >>> +        ]
> >>> +    return MetricGroup('transaction', metrics)
> >>> +
> >>> +
> >>>  all_metrics = MetricGroup("", [
> >>>      Idle(),
> >>>      Rapl(),
> >>>      Smi(),
> >>> +    Tsx(),
> >>>  ])
> >>>
> >>>  if args.metricgroups:
> >
  
Liang, Kan March 1, 2024, 5:26 p.m. UTC | #5
On 2024-03-01 11:37 a.m., Ian Rogers wrote:
> On Fri, Mar 1, 2024 at 6:52 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>>
>>
>> On 2024-02-29 8:01 p.m., Ian Rogers wrote:
>>> On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>>>> Allow duplicated metric to be dropped from json files.
>>>>>
>>>>> Signed-off-by: Ian Rogers <irogers@google.com>
>>>>> ---
>>>>>  tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
>>>>>  1 file changed, 51 insertions(+)
>>>>>
>>>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>>>> index 20c25d142f24..1096accea2aa 100755
>>>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>>>> @@ -7,6 +7,7 @@ import argparse
>>>>>  import json
>>>>>  import math
>>>>>  import os
>>>>> +from typing import Optional
>>>>>
>>>>>  parser = argparse.ArgumentParser(description="Intel perf json generator")
>>>>>  parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
>>>>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
>>>>>      ])
>>>>>
>>>>>
>>>>> +def Tsx() -> Optional[MetricGroup]:
>>>>> +    if args.model not in [
>>>>> +        'alderlake',
>>>>> +        'cascadelakex',
>>>>> +        'icelake',
>>>>> +        'icelakex',
>>>>> +        'rocketlake',
>>>>> +        'sapphirerapids',
>>>>> +        'skylake',
>>>>> +        'skylakex',
>>>>> +        'tigerlake',> +    ]:
>>>>
>>>> Can we get ride of the model list? Otherwise, we have to keep updating
>>>> the list.
>>>
>>> Do we expect the list to update? :-)
>>
>> Yes, at least for the meteorlake and graniterapids. They should be the
>> same as alderlake and sapphirerapids. I'm not sure about the future
>> platforms.
>>
>> Maybe we can have a if args.model in list here to include all the
>> non-hybrid models which doesn't support TSX. I think the list should not
>> be changed shortly.
>>
>>> The issue is the events are in
>>> sysfs and not the json. If we added the tsx events to json then this
>>> list wouldn't be necessary, but it also would mean the events would be
>>> present in "perf list" even when TSX is disabled.
>>
>> I think there may an alternative way, to check the RTM events, e.g.,
>> RTM_RETIRED.START event. We only need to generate the metrics for the
>> platform which supports the RTM_RETIRED.START event.
>>
>>
>>>
>>>>> +        return None
>>>>> +> +    pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>>>>
>>>> Is it possible to change the check to the existence of the "cpu" PMU
>>>> here? has_pmu("cpu") ? "cpu" : "cpu_core"
>>>
>>> The "Unit" on "cpu" events in json always just blank. On hybrid it is
>>> either "cpu_core" or "cpu_atom", so I can make this something like:
>>>
>>> pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
>>>
>>> which would be a build time test.
>>
>> Yes, I think using the "Unit" is good enough.
>>
>>>
>>>
>>>>> +    cycles = Event('cycles')
>>>>> +    cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>>>>> +    transaction_start = Event(f'{pmu}/tx\-start/')
>>>>> +    cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>>>>> +    metrics = [
>>>>> +        Metric('tsx_transactional_cycles',
>>>>> +                      'Percentage of cycles within a transaction region.',
>>>>> +                      Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>>>>> +                      '100%'),
>>>>> +        Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
>>>>> +                      Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>>>>> +                                    has_event(cycles_in_tx),
>>>>> +                                    0),
>>>>> +                      '100%'),
>>>>> +        Metric('tsx_cycles_per_transaction',
>>>>> +                      'Number of cycles within a transaction divided by the number of transactions.',
>>>>> +                      Select(cycles_in_tx / transaction_start,
>>>>> +                                    has_event(cycles_in_tx),
>>>>> +                                    0),
>>>>> +                      "cycles / transaction"),
>>>>> +    ]
>>>>> +    if args.model != 'sapphirerapids':
>>>>
>>>> Add the "tsx_cycles_per_elision" metric only if
>>>> has_event(f'{pmu}/el\-start/')?
>>>
>>> It's a sysfs event, so this wouldn't work :-(
>>
>> The below is the definition of el-start in the kernel.
>> EVENT_ATTR_STR(el-start,        el_start,       "event=0xc8,umask=0x1");
>>
>> The corresponding event in the event list should be HLE_RETIRED.START
>>       "EventCode": "0xC8",
>>       "UMask": "0x01",
>>       "EventName": "HLE_RETIRED.START",
>>
>> I think we may check the HLE_RETIRED.START instead. If the
>> HLE_RETIRED.START doesn't exist, I don't see a reason why the
>> tsx_cycles_per_elision should be supported.
>>
>> Again, in the virtualization world, it's possible that the
>> HLE_RETIRED.START exists in the event list but el_start isn't available
>> in the sysfs. I think it has to be specially handle in the test as well.
> 
> So we keep the has_event test on the sysfs event to handle the
> virtualization and disabled case. We use  HLE_RETIRED.START to detect
> whether the model supports TSX.

Yes. I think the JSON event always keeps the latest status of an event.
If an event is deprecated someday, I don't think there is a reason to
keep any metrics including the event. So we should use it to check
whether to generate a metrics.

The sysfs event tells if the current kernel support the event. It should
be used to check whether a metrics should be used/enabled.

> Should the event be the sysfs or json
> version? i.e.
> 
>         "MetricExpr": "(cycles\\-t / el\\-start if
> has_event(el\\-start) else 0)",
> 
> or
> 
>         "MetricExpr": "(cycles\\-t / HLE_RETIRED.START if
> has_event(el\\-start) else 0)",
> 
> I think I favor the former for some consistency with the has_event.
>

Agree, the former looks good to me too.


> Using HLE_RETIRED.START means the set of TSX models goes from:
>         'alderlake',
>         'cascadelakex',
>         'icelake',
>         'icelakex',
>         'rocketlake',
>         'sapphirerapids',
>         'skylake',
>         'skylakex',
>         'tigerlake',
> 
> To:
>    broadwell
>    broadwellde
>    broadwellx
>    cascadelakex
>    haswell
>    haswellx
>    icelake
>    rocketlake
>    skylake
>    skylakex
> 
> Using RTM_RETIRED.START it goes to:
>    broadwell
>    broadwellde
>    broadwellx
>    cascadelakex
>    emeraldrapids
>    graniterapids
>    haswell
>    haswellx
>    icelake
>    icelakex
>    rocketlake
>    sapphirerapids
>    skylake
>    skylakex
>    tigerlake
> 
> So I'm not sure it is working equivalently to what we have today,
> which may be good or bad. Here is what I think the code should look
> like:

Yes, there should be some changes. But I think the changes should be good.

For icelakex, the HLE_RETIRED.START has been deprecated. I don't see a
reason why should perf keep the tsx_cycles_per_elision metric.

For alderlake, TSX is deprecated. The perf should drop the related
metrics as well.
https://edc.intel.com/content/www/us/en/design/ipla/software-development-platforms/client/platforms/alder-lake-desktop/12th-generation-intel-core-processors-datasheet-volume-1-of-2/001/deprecated-technologies/

> 
> def Tsx() -> Optional[MetricGroup]:
>   pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
>   cycles = Event('cycles')
>   cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>   cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>   try:
>     # Test if the tsx event is present in the json, prefer the
>     # sysfs version so that we can detect its presence at runtime.
>     transaction_start = Event("RTM_RETIRED.START")
>     transaction_start = Event(f'{pmu}/tx\-start/')
>   except:
>     return None
> 
>   elision_start = None
>   try:
>     # Elision start isn't supported by all models, but we'll not
>     # generate the tsx_cycles_per_elision metric in that
>     # case. Again, prefer the sysfs encoding of the event.
>     elision_start = Event("HLE_RETIRED.START")
>     elision_start = Event(f'{pmu}/el\-start/')
>   except:
>     pass
> 
>   return MetricGroup('transaction', [
>       Metric('tsx_transactional_cycles',
>              'Percentage of cycles within a transaction region.',
>              Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>              '100%'),
>       Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted
> transactions.',
>              Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>                     has_event(cycles_in_tx),
>                     0),
>              '100%'),
>       Metric('tsx_cycles_per_transaction',
>              'Number of cycles within a transaction divided by the
> number of transactions.',
>              Select(cycles_in_tx / transaction_start,
>                     has_event(cycles_in_tx),
>                     0),
>              "cycles / transaction"),
>       Metric('tsx_cycles_per_elision',
>              'Number of cycles within a transaction divided by the
> number of elisions.',
>              Select(cycles_in_tx / elision_start,
>                     has_event(elision_start),
>                     0),
>              "cycles / elision") if elision_start else None,
>   ], description="Breakdown of transactional memory statistics")
> 
> Wdyt?

Looks good to me.

Thanks,
Kan
> 
> Thanks,
> Ian
> 
>> Thanks,
>> Kan
>>
>>>
>>> Thanks,
>>> Ian
>>>
>>>> Thanks,
>>>> Kan
>>>>
>>>>> +        elision_start = Event(f'{pmu}/el\-start/')
>>>>> +        metrics += [
>>>>> +            Metric('tsx_cycles_per_elision',
>>>>> +                          'Number of cycles within a transaction divided by the number of elisions.',
>>>>> +                          Select(cycles_in_tx / elision_start,
>>>>> +                                        has_event(elision_start),
>>>>> +                                        0),
>>>>> +                          "cycles / elision"),
>>>>> +        ]
>>>>> +    return MetricGroup('transaction', metrics)
>>>>> +
>>>>> +
>>>>>  all_metrics = MetricGroup("", [
>>>>>      Idle(),
>>>>>      Rapl(),
>>>>>      Smi(),
>>>>> +    Tsx(),
>>>>>  ])
>>>>>
>>>>>  if args.metricgroups:
>>>
  

Patch

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 20c25d142f24..1096accea2aa 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -7,6 +7,7 @@  import argparse
 import json
 import math
 import os
+from typing import Optional
 
 parser = argparse.ArgumentParser(description="Intel perf json generator")
 parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
@@ -77,10 +78,60 @@  def Smi() -> MetricGroup:
     ])
 
 
+def Tsx() -> Optional[MetricGroup]:
+    if args.model not in [
+        'alderlake',
+        'cascadelakex',
+        'icelake',
+        'icelakex',
+        'rocketlake',
+        'sapphirerapids',
+        'skylake',
+        'skylakex',
+        'tigerlake',
+    ]:
+        return None
+
+    pmu = "cpu_core" if args.model == "alderlake" else "cpu"
+    cycles = Event('cycles')
+    cycles_in_tx = Event(f'{pmu}/cycles\-t/')
+    transaction_start = Event(f'{pmu}/tx\-start/')
+    cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
+    metrics = [
+        Metric('tsx_transactional_cycles',
+                      'Percentage of cycles within a transaction region.',
+                      Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
+                      '100%'),
+        Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
+                      Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
+                                    has_event(cycles_in_tx),
+                                    0),
+                      '100%'),
+        Metric('tsx_cycles_per_transaction',
+                      'Number of cycles within a transaction divided by the number of transactions.',
+                      Select(cycles_in_tx / transaction_start,
+                                    has_event(cycles_in_tx),
+                                    0),
+                      "cycles / transaction"),
+    ]
+    if args.model != 'sapphirerapids':
+        elision_start = Event(f'{pmu}/el\-start/')
+        metrics += [
+            Metric('tsx_cycles_per_elision',
+                          'Number of cycles within a transaction divided by the number of elisions.',
+                          Select(cycles_in_tx / elision_start,
+                                        has_event(elision_start),
+                                        0),
+                          "cycles / elision"),
+        ]
+    return MetricGroup('transaction', metrics)
+
+
 all_metrics = MetricGroup("", [
     Idle(),
     Rapl(),
     Smi(),
+    Tsx(),
 ])
 
 if args.metricgroups: