Message ID | 1672745976-2800146-2-git-send-email-renyu.zj@linux.alibaba.com |
---|---|
State | New |
Headers |
From: Jing Zhang <renyu.zj@linux.alibaba.com>
To: John Garry <john.g.garry@oracle.com>, Ian Rogers <irogers@google.com>, Xing Zhengjun <zhengjun.xing@linux.intel.com>, Will Deacon <will@kernel.org>, James Clark <james.clark@arm.com>, Mike Leach <mike.leach@linaro.org>, Leo Yan <leo.yan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>, Andrew Kilroy <andrew.kilroy@arm.com>, Shuai Xue <xueshuai@linux.alibaba.com>, Zhuo Song <zhuo.song@linux.alibaba.com>, Jing Zhang <renyu.zj@linux.alibaba.com>
Subject: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
Date: Tue, 3 Jan 2023 19:39:31 +0800
Message-Id: <1672745976-2800146-2-git-send-email-renyu.zj@linux.alibaba.com>
In-Reply-To: <1672745976-2800146-1-git-send-email-renyu.zj@linux.alibaba.com>
References: <1672745976-2800146-1-git-send-email-renyu.zj@linux.alibaba.com>
List-ID: <linux-kernel.vger.kernel.org> |
Series | Add metrics for neoverse-n2 |
Commit Message
Jing Zhang
Jan. 3, 2023, 11:39 a.m. UTC
The formula of topdown L1 on neoverse-n2 is from ARM sbsa7.0 platform
design document [0], D37-38.

However, due to the wrong count of stall_slot and stall_slot_frontend on
neoverse-n2, the real stall_slot and real stall_slot_frontend need to
subtract cpu_cycles, so correct the expression of topdown metrics.
Reference from ARM neoverse-n2 errata notice [1], D117.

Since neoverse-n2 does not yet support topdown L2, metricgroups such as
Cache, TLB, Branch, InstructionsMix, and PEutilization will be added to
further analysis of performance bottlenecks in the following patches.
Reference from ARM PMU guide [2][3].

[0] https://documentation-service.arm.com/static/60250c7395978b529036da86?token=
[1] https://documentation-service.arm.com/static/636a66a64e6cf12278ad89cb?token=
[2] https://documentation-service.arm.com/static/628f8fa3dfaf015c2b76eae8?token=
[3] https://documentation-service.arm.com/static/62cfe21e31ea212bb6627393?token=

Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Acked-by: Ian Rogers <irogers@google.com>
---
 .../arch/arm64/arm/neoverse-n2/metrics.json | 30 ++++++++++++++++++++++
 1 file changed, 30 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
Comments
On 03/01/2023 11:39, Jing Zhang wrote:
> The formula of topdown L1 on neoverse-n2 is from ARM sbsa7.0 platform
> design document [0], D37-38.

I think that I mentioned this before - if these metrics are coming from an sbsa doc, then they are standard. As such, we can make them "arch std events" and put them in a common json such as sbsa.json, so that other cores may reuse them.

You don't strictly have to do this now, but it would be better.

Thanks,
John

>
> However, due to the wrong count of stall_slot and stall_slot_frontend on
> neoverse-n2, the real stall_slot and real stall_slot_frontend need to
> subtract cpu_cycles, so correct the expression of topdown metrics.
> Reference from ARM neoverse-n2 errata notice [1], D117.
>
> Since neoverse-n2 does not yet support topdown L2, metricgroups such as
> Cache, TLB, Branch, InstructionsMix, and PEutilization will be added to
> further analysis of performance bottlenecks in the following patches.
> Reference from ARM PMU guide [2][3].
On 2023/1/3 at 7:52 PM, John Garry wrote:
> On 03/01/2023 11:39, Jing Zhang wrote:
>> The formula of topdown L1 on neoverse-n2 is from ARM sbsa7.0 platform
>> design document [0], D37-38.
>
> I think that I mentioned this before - if these metrics are coming from an sbsa doc, then they are standard. As such, we can make them "arch std events" and put them in a common json such as sbsa.json, so that other cores may reuse them.
>
> You don't strictly have to do this now, but it would be better.
>

Hi John,

I would really like to do this, but as discussed earlier, slot is different on each architecture. If I do not specify the value of the slot in sbsa.json, then in the json file of n2/v1, I need to overwrite each topdown "MetricExpr". In other words, the metrics placed in the sbsa.json file only reuse "BriefDescription", "MetricGroup" and "ScaleUnit". So I'm not sure if it's acceptable?

In addition, James mentioned that if the units, names and group names of different architectures are not unified, it will become complicated. Perhaps we could do it later.

Thanks,
Jing
On 04/01/2023 05:05, Jing Zhang wrote:
>
>
> On 2023/1/3 at 7:52 PM, John Garry wrote:
>> On 03/01/2023 11:39, Jing Zhang wrote:
>>> The formula of topdown L1 on neoverse-n2 is from ARM sbsa7.0 platform
>>> design document [0], D37-38.
>>
>> I think that I mentioned this before - if these metrics are coming from an sbsa doc, then they are standard. As such, we can make them "arch std events" and put them in a common json such as sbsa.json, so that other cores may reuse them.
>>
>> You don't strictly have to do this now, but it would be better.
>>
>
> Hi John,

Hi Jing,

>
> I would really like to do this, but as discussed earlier, slot is different on each architecture.
> If I do not specify the value of the slot in sbsa.json, then in the json file of n2/v1, I need to
> overwrite each topdown "MetricExpr". In other words, the metrics placed in the sbsa.json file only
> reuse "BriefDescription", "MetricGroup" and "ScaleUnit". So I'm not sure if it's acceptable?

I don't see a lot of value in that really.

However, for this value of slot, isn't this discoverable from a system register per core? Quoting the sbsa:
"The IMPLEMENTATION DEFINED constant SLOTS is discoverable from the system register PMMIR_EL1.SLOTS."

Did you consider how this could be used?

>
> In addition, James mentioned that if the units, names and group names of different architectures
> are not unified, it will become complicated.
>

Thanks,
John
On 2023/1/5 at 1:26 AM, John Garry wrote:
> On 04/01/2023 05:05, Jing Zhang wrote:
>>
>>
>> On 2023/1/3 at 7:52 PM, John Garry wrote:
>>> On 03/01/2023 11:39, Jing Zhang wrote:
>>>> The formula of topdown L1 on neoverse-n2 is from ARM sbsa7.0 platform
>>>> design document [0], D37-38.
>>>
>>> I think that I mentioned this before - if these metrics are coming from an sbsa doc, then they are standard. As such, we can make them "arch std events" and put them in a common json such as sbsa.json, so that other cores may reuse them.
>>>
>>> You don't strictly have to do this now, but it would be better.
>>>
>>
>> Hi John,
>
> Hi Jing,
>
>>
>> I would really like to do this, but as discussed earlier, slot is different on each architecture.
>> If I do not specify the value of the slot in sbsa.json, then in the json file of n2/v1, I need to
>> overwrite each topdown "MetricExpr". In other words, the metrics placed in the sbsa.json file only
>> reuse "BriefDescription", "MetricGroup" and "ScaleUnit". So I'm not sure if it's acceptable?
>
> I don't see a lot of value in that really.
>
> However, for this value of slot, isn't this discoverable from a system register per core? Quoting the sbsa:
> "The IMPLEMENTATION DEFINED constant SLOTS is discoverable from the system register PMMIR_EL1.SLOTS."
>
> Did you consider how this could be used?
>

This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read in /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the slots value that was read? As I currently understand it, parameters in MetricExpr only support events and constants.
On 05/01/2023 10:05, Jing Zhang wrote:
>> However, for this value of slot, isn't this discoverable from a system register per core? Quoting the sbsa: "The IMPLEMENTATION DEFINED constant SLOTS is discoverable from the system register PMMIR_EL1.SLOTS." Did you consider how this could be used?
>>
>
> This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read in
> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the
> slots value that was read? As I currently understand it, parameters in MetricExpr only support events and constants.
>

Maybe during runtime we could create a pseudo metric/event for SLOT. This metric would be created during init, and it always just returns the value which was read from PMMIR_EL1.

I'm not sure how well that would play with trying to resolve metrics when building generated pmu-events.c, but I don't think it's all too difficult to achieve.

Have you actually read this value for the n2 core? Does it look correct?

Thanks,
John
On 2023/1/5 at 6:13 PM, John Garry wrote:
> Maybe during runtime we could create a pseudo metric/event for SLOT. This metric would be created during init, and it always just returns the value which was read from PMMIR_EL1.
>
> I'm not sure how well that would play with trying to resolve metrics when building generated pmu-events.c, but I don't think it's all too difficult to achieve.
>

I'll try it in the v7 patch. I want to release the v6 patch first, to correct a mistake I made. :)

> Have you actually read this value for the n2 core? Does it look correct?

Yes, I read it on n2 and it has a value of 5, which is correct. If the STALL_SLOT event is not implemented, PMMIR_EL1.SLOTS might read as zero.
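The lookup described above can be sketched in a few lines of Python. This is only an illustration, not perf code: the helper names `read_slots` and `slots_per_pmu` are made up here, the sysfs root is assumed to be /sys/bus/event_source/devices, and the on-disk number format (decimal or 0x-prefixed hex) is an assumption. A value of zero is treated as "not usable", following the remark that PMMIR_EL1.SLOTS may read as zero.

```python
import glob
from pathlib import Path
from typing import Dict, Optional


def read_slots(caps_file: Path) -> Optional[int]:
    """Return the slots value from one caps/slots file, or None if unusable."""
    try:
        text = caps_file.read_text().strip()
    except OSError:
        return None  # file missing or unreadable
    try:
        value = int(text, 0)  # base 0 accepts "5" as well as "0x00000005"
    except ValueError:
        return None  # not a plain integer
    # Zero cannot be used as a divisor in a metric, so report it as absent.
    return value if value > 0 else None


def slots_per_pmu(root: str = "/sys/bus/event_source/devices") -> Dict[str, Optional[int]]:
    """Map each armv8_pmuv3_* PMU directory name to its slots value."""
    pattern = f"{root}/armv8_pmuv3_*/caps/slots"
    return {
        Path(path).parent.parent.name: read_slots(Path(path))
        for path in sorted(glob.glob(pattern))
    }


if __name__ == "__main__":
    # Prints nothing on machines without an armv8_pmuv3_* PMU.
    for pmu, slots in slots_per_pmu().items():
        print(pmu, slots)
```

Returning None rather than 0 forces the caller to decide what to do when the value is unavailable, instead of silently dividing by zero.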
On Thu, Jan 5, 2023 at 2:13 AM John Garry <john.g.garry@oracle.com> wrote:
>
> On 05/01/2023 10:05, Jing Zhang wrote:
> >> However, for this value of slot, isn't this discoverable from a system register per core? Quoting the sbsa: "The IMPLEMENTATION DEFINED constant SLOTS is discoverable from the system register PMMIR_EL1.SLOTS." Did you consider how this could be used?
> >>
> >
> > This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read in
> > /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the
> > slots value that was read? As I currently understand it, parameters in MetricExpr only support events and constants.
> >
>
> Maybe during runtime we could create a pseudo metric/event for SLOT.

For Intel we do this by just having a different constant for each
architecture. It is fairly easy to add a new "literal", so you could
add a #slots in expr__get_literal:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
Populating it would be the challenge :-)

Thanks,
Ian

> This metric would be created during init, and it always just returns the
> value which was read from PMMIR_EL1.
>
> I'm not sure how well that would play with trying to resolve metrics
> when building generated pmu-events.c, but I don't think it's all too
> difficult to achieve.
>
> Have you actually read this value for the n2 core? Does it look correct?
>
> Thanks,
> John
On 05/01/2023 21:13, Ian Rogers wrote:
>>> This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read in
>>> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the
>>> slots value that was read? As I currently understand it, parameters in MetricExpr only support events and constants.
>>>
>> Maybe during runtime we could create a pseudo metric/event for SLOT.
> For Intel we do this by just having a different constant for each
> architecture. It is fairly easy to add a new "literal", so you could
> add a #slots in expr__get_literal:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
> Populating it would be the challenge 😄

Thanks for the pointer. I think that the challenge in populating it really comes down to whether we would really want to make this generic.

I suppose that for arm64 we could have a method which accesses this PMMIR_EL1 register, while for other archs we could have a weak function which just returns NAN. If other archs want to use this key expr, they can add their own method.

Out of curiosity, do you know if x86 has such a capability to get this slot info from HW?

Thanks,
John
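To make the "arch method plus weak default returning NAN" idea concrete, here is a rough Python model of it. It does not reproduce expr__get_literal or any other perf function; `SLOTS_PROVIDERS`, `register_slots_provider` and `get_literal` are hypothetical names, and the arm64 provider below simply returns the value 5 reported for n2 in this thread instead of reading the register.

```python
import math
from typing import Callable, Dict

# Hypothetical registry: architecture name -> function that yields "#slots".
# An architecture with no entry gets NaN, which plays the role of the
# weak default function in the C proposal.
SLOTS_PROVIDERS: Dict[str, Callable[[], float]] = {}


def register_slots_provider(arch: str, provider: Callable[[], float]) -> None:
    """Install an architecture-specific way to obtain the slots value."""
    SLOTS_PROVIDERS[arch] = provider


def get_literal(name: str, arch: str) -> float:
    """Resolve a '#'-prefixed literal for the given architecture."""
    if name == "#slots":
        provider = SLOTS_PROVIDERS.get(arch)
        return provider() if provider is not None else math.nan
    raise KeyError(f"unknown literal: {name}")


# Illustrative provider only: a real one would read PMMIR_EL1.SLOTS
# (for example through the caps/slots sysfs file).
register_slots_provider("arm64", lambda: 5.0)

if __name__ == "__main__":
    print(get_literal("#slots", "arm64"))
    print(get_literal("#slots", "x86_64"))
```

Because NaN propagates through arithmetic, a metric that uses `#slots` on an architecture without a provider would evaluate to NaN rather than to a misleading number.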
On 2023/1/6 at 6:14 PM, John Garry wrote:
> On 05/01/2023 21:13, Ian Rogers wrote:
>>>> This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read in
>>>> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the
>>>> slots value that was read? As I currently understand it, parameters in MetricExpr only support events and constants.
>>>>
>>> Maybe during runtime we could create a pseudo metric/event for SLOT.
>> For Intel we do this by just having a different constant for each
>> architecture. It is fairly easy to add a new "literal", so you could
>> add a #slots in expr__get_literal:
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
>> Populating it would be the challenge 😄

Yes! I was thinking the same as you. I found this method from the SMT_on variable in icl_metrics.json, then I tried it and it worked, so excited!

>
> Thanks for the pointer. I think that the challenge in populating it really comes down to whether we would really want to make this generic.
>
> I suppose that for arm64 we could have a method which accesses this PMMIR_EL1 register, while for other archs we could have a weak function which just returns NAN. If other archs want to use this key expr, they can add their own method.
>

Now I have to use this method, because I just found out that neoverse-n2 has been changed to neoverse-n2-v2, merging n2 and v2. The slots value of n2 is 5, and that of v2 is 8. I will release the v6 patch and put the metric in the sbsa.json file.

The metrics in sbsa.json are only applicable to arm64, so even if x86 cannot get the slots value, there will be no conflict.

> Out of curiosity, do you know if x86 has such a capability to get this slot info from HW?
>
On 06/01/2023 10:14, John Garry wrote:
> On 05/01/2023 21:13, Ian Rogers wrote:
>>>> This may be a feasible idea. The value of slots comes from the
>>>> register PMMIR_EL1, which I can read in
>>>> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I
>>>> replace the slots in MetricExpr with the
>>>> slots value that was read? As I currently understand it, parameters in
>>>> MetricExpr only support events and constants.
>>>>
>>> Maybe during runtime we could create a pseudo metric/event for SLOT.
>> For Intel we do this by just having a different constant for each
>> architecture. It is fairly easy to add a new "literal", so you could
>> add a #slots in expr__get_literal:
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
>> Populating it would be the challenge 😄
>
> Thanks for the pointer. I think that the challenge in populating it
> really comes down to whether we would really want to make this generic.
>
> I suppose that for arm64 we could have a method which accesses this
> PMMIR_EL1 register, while for other archs we could have a weak function
> which just returns NAN. If other archs want to use this key expr, they
> can add their own method.
>

I wonder if it would be worthwhile and even more generic to add some sort of accessor construct for files that contain an int. It could also have support for a default value when the file doesn't exist. For example:

  "MetricExpr": "ITLB / {file://<pmu>/caps/slots(5)}"

It gets a bit fiddly because you might want to support absolute paths and paths relative to whatever PMU is being used. But it could prevent having to add some custom identifier and glue code for every possible file that just has an integer in it.

It also wouldn't be possible to support the case where the file has bitfields in it that need to be extracted, so maybe we shouldn't do it.

James
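As a thought experiment, the proposed `{file://<pmu>/caps/slots(5)}` construct could be expanded by a small pre-processing step like the Python sketch below. Nothing like this exists in perf as far as this thread shows: the function name `resolve_file_refs`, the regular expression, and the rule that `<pmu>` is replaced by a caller-supplied PMU directory are all assumptions made for illustration. Bitfield extraction, which was raised as a limitation, is deliberately not handled.

```python
import re
from pathlib import Path

# Matches "{file://<path>(<default>)}" as written in the example expression.
FILE_REF = re.compile(r"\{file://(?P<path>[^(){}]+)\((?P<default>[^(){}]+)\)\}")


def resolve_file_refs(expr: str, pmu_dir: str) -> str:
    """Replace every file reference in expr with the integer it points to.

    "<pmu>" in the path is substituted with pmu_dir; any other path is
    used as written, which covers the absolute-path case.
    """

    def _substitute(match: "re.Match[str]") -> str:
        path = Path(match.group("path").replace("<pmu>", pmu_dir, 1))
        try:
            value = int(path.read_text().strip(), 0)
        except (OSError, ValueError):
            # Missing, unreadable or non-integer file: use the default.
            value = int(match.group("default"), 0)
        return str(value)

    return FILE_REF.sub(_substitute, expr)


if __name__ == "__main__":
    expr = "ITLB / {file://<pmu>/caps/slots(5)}"
    # With a PMU directory that does not exist, the default of 5 is used.
    print(resolve_file_refs(expr, "/nonexistent/pmu"))
```

Expanding the reference to a plain number before parsing keeps the expression grammar unchanged, at the cost of resolving files once up front rather than at evaluation time.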
On Mon, Jan 9, 2023 at 7:35 AM James Clark <james.clark@arm.com> wrote:
>
>
>
> On 06/01/2023 10:14, John Garry wrote:
> > On 05/01/2023 21:13, Ian Rogers wrote:
> >>>> This may be a feasible idea. The value of slots comes from the
> >>>> register PMMIR_EL1, which I can read in
> >>>> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I
> >>>> replace the slots in MetricExpr with the
> >>>> slots value that was read? As I currently understand it, parameters in
> >>>> MetricExpr only support events and constants.
> >>>>
> >>> Maybe during runtime we could create a pseudo metric/event for SLOT.
> >> For Intel we do this by just having a different constant for each
> >> architecture. It is fairly easy to add a new "literal", so you could
> >> add a #slots in expr__get_literal:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
> >> Populating it would be the challenge 😄
> >
> > Thanks for the pointer. I think that the challenge in populating it
> > really comes down to whether we would really want to make this generic.
> >
> > I suppose that for arm64 we could have a method which accesses this
> > PMMIR_EL1 register, while for other archs we could have a weak function
> > which just returns NAN. If other archs want to use this key expr, they
> > can add their own method.
> >
>
> I wonder if it would be worthwhile and even more generic to add some
> sort of int containing file accessor construct. It could also have
> support for a default value when the file doesn't exist. For example:
>
>   "MetricExpr": "ITLB / {file://<pmu>/caps/slots(5)}"
>
> It gets a bit fiddly because you might want to support absolute paths
> and paths relative to whatever PMU is being used. But it could prevent
> having to add some custom identifier and glue code for every possible
> file that just has an integer in it.
>
> It also wouldn't be possible to support the case where the file has
> bitfields in it that need to be extracted, so maybe we shouldn't do it.
>
> James

Thanks James, I think there are many opportunities to improve the metrics. One step in this direction is:
https://lore.kernel.org/lkml/20221221223420.2157113-1-irogers@google.com/
(which is looking for reviews :-D ).

Some areas we could improve include:
- the expression code has support for longs but I don't believe any metrics use it.
- the modulus is weird and again unused.
- I think divide (/) should behave like d_ratio as aborting parsing is next to useless.
- events like Intel's msr/tsc/ don't have to be programmed on every CPU/hyperthread and doing so is quite wasteful.
- we may have a read but no write counter, so being able to read a sibling CPUs/socket's read counter may inform about writes. This isn't currently expressible as metrics compute based on whatever the aggregation mode is, you can't get a particular count.
- perf stat record/report don't work/compute metrics, but just provide counters.
- the json format should resemble sysfs rather than being a flat list, metrics and events in the list should be separated.
- metrics use / as divide and so @ is used in /'s place for event modifiers. BPF events use / as a directory separator.

For the filesystems reading I think it is a good idea. I'd like to make it so that things like #num_dies become tool events and remove the notion of literals. Perhaps we can make reading a file something that is an event.

The current event parsing logic is overly complex, for example the handling of '-' which has some legacy PMU separation properties. A proposal mentioned at LPC was to have a new event parsing library that doesn't carry legacy baggage. We can make metric code use this as the metrics encode the events. If the new library fails parsing the code can fall back on the existing parser. I'd like it if the event parsing logic more closely resembled the sysfs style. I'd like it if we could have events in sysfs, built into the tool (but with a layout resembling sysfs) and also allow events, etc. to be added by having say a zip of a sysfs directory/file structure.

I'm hoping libraries like metric.py:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/pmu-events/metric.py
can be used by vendors, so that it is easy to update vendor generated json if/when the format changes.

Thanks,
Ian
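The point about divide behaving like d_ratio can be illustrated with a tiny Python model. It assumes, as the name and the remark suggest, that d_ratio yields 0 when the denominator is 0 rather than failing; that behaviour is an assumption here, not something confirmed in this thread, and the function below is not perf's implementation.

```python
def d_ratio(numerator: float, denominator: float) -> float:
    """Divide, but yield 0.0 instead of failing when the denominator is zero."""
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # A counter that never ticked (e.g. op_spec == 0) no longer aborts the
    # whole metric; the ratio simply reads as zero.
    print(d_ratio(800, 1000))
    print(d_ratio(800, 0))
```

With plain division the second call would raise ZeroDivisionError in Python, which is the analogue of the "aborting parsing" behaviour the list item objects to.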
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
new file mode 100644
index 0000000..c126f1bc
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
@@ -0,0 +1,30 @@
+[
+    {
+        "MetricExpr": "(stall_slot_frontend - cpu_cycles) / (5 * cpu_cycles)",
+        "BriefDescription": "Frontend bound L1 topdown metric",
+        "MetricGroup": "TopdownL1",
+        "MetricName": "frontend_bound",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "(1 - op_retired / op_spec) * (1 - (stall_slot - cpu_cycles) / (5 * cpu_cycles))",
+        "BriefDescription": "Bad speculation L1 topdown metric",
+        "MetricGroup": "TopdownL1",
+        "MetricName": "bad_speculation",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "(op_retired / op_spec) * (1 - (stall_slot - cpu_cycles) / (5 * cpu_cycles))",
+        "BriefDescription": "Retiring L1 topdown metric",
+        "MetricGroup": "TopdownL1",
+        "MetricName": "retiring",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "stall_slot_backend / (5 * cpu_cycles)",
+        "BriefDescription": "Backend Bound L1 topdown metric",
+        "MetricGroup": "TopdownL1",
+        "MetricName": "backend_bound",
+        "ScaleUnit": "100%"
+    }
+]
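For readers who want to check the arithmetic of the four expressions in the patch, the following Python sketch evaluates them from raw counter values. The function name `topdown_l1` and the counter values in the usage part are invented for illustration and are not measurements. The `slots` parameter defaults to 5, the constant hard-coded in the patch; the subtraction of `cpu_cycles` from `stall_slot` and `stall_slot_frontend` is the erratum correction described in the commit message.

```python
from typing import Dict


def topdown_l1(stall_slot: float, stall_slot_frontend: float,
               stall_slot_backend: float, cpu_cycles: float,
               op_retired: float, op_spec: float,
               slots: int = 5) -> Dict[str, float]:
    """Evaluate the four TopdownL1 MetricExpr formulas from the patch."""
    total_slots = slots * cpu_cycles
    # Corrected counts: both counters over-count by cpu_cycles on neoverse-n2.
    real_stall_slot = stall_slot - cpu_cycles
    real_stall_slot_frontend = stall_slot_frontend - cpu_cycles
    not_stalled = 1 - real_stall_slot / total_slots
    retired_ratio = op_retired / op_spec
    return {
        "frontend_bound": real_stall_slot_frontend / total_slots,
        "bad_speculation": (1 - retired_ratio) * not_stalled,
        "retiring": retired_ratio * not_stalled,
        "backend_bound": stall_slot_backend / total_slots,
    }


if __name__ == "__main__":
    # Made-up counter values, chosen so that the raw stall_slot count equals
    # the sum of the raw frontend and backend stall counts.
    metrics = topdown_l1(stall_slot=2500, stall_slot_frontend=1500,
                         stall_slot_backend=1000, cpu_cycles=1000,
                         op_retired=800, op_spec=1000)
    for name, value in metrics.items():
        print(f"{name}: {value:.2%}")
```

In this made-up case the four fractions add up to 1, because the sum of the four expressions is 1 + (stall_slot_frontend + stall_slot_backend - stall_slot) / (slots * cpu_cycles), and the example picks counts where that last term is zero.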