Message ID | 20231113112507.917107-4-james.clark@arm.com |
---|---|
State | New |
Headers |
From: James Clark <james.clark@arm.com>
To: linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org, suzuki.poulose@arm.com, will@kernel.org, mark.rutland@arm.com
Cc: James Clark <james.clark@arm.com>, Catalin Marinas <catalin.marinas@arm.com>, Jonathan Corbet <corbet@lwn.net>, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature
Date: Mon, 13 Nov 2023 11:25:06 +0000
Message-Id: <20231113112507.917107-4-james.clark@arm.com>
In-Reply-To: <20231113112507.917107-1-james.clark@arm.com>
References: <20231113112507.917107-1-james.clark@arm.com>
List-ID: <linux-kernel.vger.kernel.org> |
Series |
arm64: perf: Add support for event counting threshold
|
|
Commit Message
James Clark
Nov. 13, 2023, 11:25 a.m. UTC
Add documentation for the new Perf event open parameters and
the threshold_max capability file.
Signed-off-by: James Clark <james.clark@arm.com>
---
Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
1 file changed, 56 insertions(+)
Comments
On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
> [...]
> +How-to
> +------
> +
> +The threshold, threshold_compare and threshold_count values can be
> +provided per event:
> +
> +.. code-block:: sh
> +
> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/

Can you please explain this a bit more?

I guess the first event counts the stall_slot PMU event if it's
greater than or equal to 2. And as threshold_count is not set,
it'd count the stall_slot as is. E.g. it counts 3 when it sees 3.

OTOH, dtlb_walk will count 1 if it sees an event less than 10.
Is my understanding correct?

> +
> +And the following comparison values are supported:
> +
> +.. code-block::
> +
> +  0: Not-equal
> +  1: Equals
> +  2: Greater-than-or-equal
> +  3: Less-than

So the above values are for threshold_compare, right?
It'd be nice if it's more explicit.

Similarly, it'd be helpful to have a description for the
threshold and threshold_count fields.

Thanks,
Namhyung

> [...]
On 20/11/2023 21:31, Namhyung Kim wrote:
> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>> [...]
>> +The threshold, threshold_compare and threshold_count values can be
>> +provided per event:
>> +
>> +.. code-block:: sh
>> +
>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
>
> Can you please explain this a bit more?
>
> I guess the first event counts the stall_slot PMU event if it's
> greater than or equal to 2. And as threshold_count is not set,
> it'd count the stall_slot as is. E.g. it counts 3 when it sees 3.
>
> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
> Is my understanding correct?

That is correct. The behavior is described in the paragraph above.
But I agree that it would be really helpful if we explained with the
example above.

>> +
>> +And the following comparison values are supported:
>> +
>> +.. code-block::
>> +
>> +  0: Not-equal
>> +  1: Equals
>> +  2: Greater-than-or-equal
>> +  3: Less-than
>
> So the above values are for threshold_compare, right?
> It'd be nice if it's more explicit.
>
> Similarly, it'd be helpful to have a description for the
> threshold and threshold_count fields.

Agreed.

Suzuki

>
> Thanks,
> Namhyung
>
>> [...]
On 11/21/23 03:01, Namhyung Kim wrote:
> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>> [...]
>> +The threshold, threshold_compare and threshold_count values can be
>> +provided per event:
>> +
>> +.. code-block:: sh
>> +
>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
> Can you please explain this a bit more?
>
> I guess the first event counts the stall_slot PMU event if it's
> greater than or equal to 2. And as threshold_count is not set,
> it'd count the stall_slot as is. E.g. it counts 3 when it sees 3.

Hence without 'threshold_count' being set, the other two config requests
will not have an effect, is that correct?

>
> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
> Is my understanding correct?

'Equals' and 'Greater-than-or-equal' make sense and are intuitive. Just
wondering what will happen for 'Not-equal' and 'Less-than' - when would
the counter count in such cases?

  0: Not-equal
  1: Equals
  2: Greater-than-or-equal
  3: Less-than
On 21/11/2023 10:33, Suzuki K Poulose wrote:
> On 20/11/2023 21:31, Namhyung Kim wrote:
>> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>>> [...]
>>> +The threshold, threshold_compare and threshold_count values can be
>>> +provided per event:
>>> +
>>> +.. code-block:: sh
>>> +
>>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
>>
>> Can you please explain this a bit more?
>>
>> I guess the first event counts the stall_slot PMU event if it's
>> greater than or equal to 2. And as threshold_count is not set,
>> it'd count the stall_slot as is. E.g. it counts 3 when it sees 3.
>>
>> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
>> Is my understanding correct?
>
> That is correct. The behavior is described in the paragraph above.
> But I agree that it would be really helpful if we explained with the
> example above.
>

Yeah I can add a description of how the example behaves.

>>
>>> +
>>> +And the following comparison values are supported:
>>> +
>>> +.. code-block::
>>> +
>>> +  0: Not-equal
>>> +  1: Equals
>>> +  2: Greater-than-or-equal
>>> +  3: Less-than
>>
>> So the above values are for threshold_compare, right?
>> It'd be nice if it's more explicit.

Yep I agree, I can label this with threshold_compare.

>>
>> Similarly, it'd be helpful to have a description for the
>> threshold and threshold_count fields.
>
> Agreed.
>
> Suzuki
>

Yeah I'll add explicit descriptions for each field. Thanks for the
review.

>>
>> Thanks,
>> Namhyung
>>
>>> [...]
On 23/11/2023 05:50, Anshuman Khandual wrote:
>
> On 11/21/23 03:01, Namhyung Kim wrote:
>> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>>> [...]
>>> +The threshold, threshold_compare and threshold_count values can be
>>> +provided per event:
>>> +
>>> +.. code-block:: sh
>>> +
>>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
>> Can you please explain this a bit more?
>>
>> I guess the first event counts the stall_slot PMU event if it's
>> greater than or equal to 2. And as threshold_count is not set,
>> it'd count the stall_slot as is. E.g. it counts 3 when it sees 3.
>
> Hence without 'threshold_count' being set, the other two config requests
> will not have an effect, is that correct?

Yeah I can mention this. It's implied because 0 is the default value of
config fields, and 0 is a valid value for the compare and count fields,
so threshold=0 has to be the way to disable it. But I can mention it
explicitly.

>
>>
>> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
>> Is my understanding correct?
>
> 'Equals' and 'Greater-than-or-equal' make sense and are intuitive. Just
> wondering what will happen for 'Not-equal' and 'Less-than' - when would
> the counter count in such cases?
>
>   0: Not-equal
>   1: Equals
>   2: Greater-than-or-equal
>   3: Less-than
>

They would count when the event count is not equal to, or less than, the
threshold value on any cycle, respectively. Probably going into more
detail would start to reproduce what's in the reference manual. All the
pseudocode is in there which describes how it works.

As for use cases, I'm not really sure. It probably wasn't any effort to
add into the hardware with a single not gate, and something could have
been missed if it wasn't added. You might be able to do things like
count the inverse of something without having to open another event to
subtract from to find what the inverse would be.
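
The counting rules discussed in this reply can be sketched as a small Python model. This is only an illustration under stated assumptions, not the hardware or kernel implementation: the names `Compare`, `counter_increment` and `count` are hypothetical, the per-cycle event trace is made up, and the authoritative behaviour remains the pseudocode in the Arm reference manual referred to above.

```python
from enum import IntEnum


class Compare(IntEnum):
    """threshold_compare values as listed in the patch."""
    NOT_EQUAL = 0
    EQUALS = 1
    GREATER_OR_EQUAL = 2
    LESS_THAN = 3


def counter_increment(events, threshold, compare, threshold_count=False):
    """Model how much the counter advances on one processor cycle.

    `events` is the amount the counter would have advanced on this cycle
    without thresholding (hypothetical input for this sketch).
    """
    if compare == Compare.NOT_EQUAL:
        met = events != threshold
    elif compare == Compare.EQUALS:
        met = events == threshold
    elif compare == Compare.GREATER_OR_EQUAL:
        met = events >= threshold
    else:  # Compare.LESS_THAN
        met = events < threshold

    if not met:
        return 0
    # With threshold_count the counter advances by 1 per qualifying cycle,
    # otherwise by the number of events seen on that cycle.
    return 1 if threshold_count else events


def count(trace, threshold, compare, threshold_count=False):
    """Sum the modelled increments over a list of per-cycle event counts."""
    return sum(
        counter_increment(e, threshold, compare, threshold_count) for e in trace
    )
```

For a made-up trace of `[0, 1, 2, 3, 12]` events per cycle, the model gives 17 for `threshold=2,threshold_compare=2` (2 + 3 + 12) and 4 for `threshold=10,threshold_compare=3,threshold_count` (four cycles below 10, where in this model the zero-event cycle also satisfies Less-than). With `threshold=0,threshold_compare=0` and no `threshold_count`, the result equals the plain sum of the trace, which matches the remark that threshold=0 is the way to disable the feature.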
On 23/11/2023 15:45, James Clark wrote:
>
> On 23/11/2023 05:50, Anshuman Khandual wrote:
>>
>> On 11/21/23 03:01, Namhyung Kim wrote:
>>> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>>>> [...]
>>>> +The threshold, threshold_compare and threshold_count values can be
>>>> +provided per event:
>>>> +
>>>> +.. code-block:: sh
>>>> +
>>>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>>>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
>>> Can you please explain this a bit more?
>>>
>>> I guess the first event counts the stall_slot PMU event if it's
>>> greater than or equal to 2. And as threshold_count is not set,
>>> it'd count the stall_slot as is. E.g. it counts 3 when it sees 3.
>>
>> Hence without 'threshold_count' being set, the other two config requests
>> will not have an effect, is that correct?
>
> Yeah I can mention this. It's implied because 0 is the default value of
> config fields, and 0 is a valid value for the compare and count fields,
> so threshold=0 has to be the way to disable it. But I can mention it
> explicitly.
>

To avoid any confusion, I thought you meant threshold here instead of
threshold_count. But I replied in more detail about the same issue on
patch 2.
diff --git a/Documentation/arch/arm64/perf.rst b/Documentation/arch/arm64/perf.rst
index 1f87b57c2332..36b8111a710d 100644
--- a/Documentation/arch/arm64/perf.rst
+++ b/Documentation/arch/arm64/perf.rst
@@ -164,3 +164,59 @@ and should be used to mask the upper bits as needed.
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
 .. _tools/lib/perf/tests/test-evsel.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
+
+Event Counting Threshold
+==========================================
+
+Overview
+--------
+
+FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
+events whose count meets a specified threshold condition. For example if
+threshold_compare is set to 2 ('Greater than or equal'), and the
+threshold is set to 2, then the PMU counter will now only increment by
+when an event would have previously incremented the PMU counter by 2 or
+more on a single processor cycle.
+
+To increment by 1 after passing the threshold condition instead of the
+number of events on that cycle, add the 'threshold_count' option to the
+commandline.
+
+How-to
+------
+
+The threshold, threshold_compare and threshold_count values can be
+provided per event:
+
+.. code-block:: sh
+
+  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
+            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
+
+And the following comparison values are supported:
+
+.. code-block::
+
+  0: Not-equal
+  1: Equals
+  2: Greater-than-or-equal
+  3: Less-than
+
+The maximum supported threshold value can be read from the caps of each
+PMU, for example:
+
+.. code-block:: sh
+
+  cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
+
+  0x000000ff
+
+If a value higher than this is given, then it will be silently clamped
+to the maximum. The highest possible maximum is 4095, as the config
+field for threshold is limited to 12 bits, and the Perf tool will refuse
+to parse higher values.
+
+If the PMU doesn't support FEAT_PMUv3_TH, then threshold_max will read
+0, and both threshold and threshold_compare will be silently ignored.
+threshold_max will also read as 0 on aarch32 guests, even if the host
+is running on hardware with the feature.
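
The threshold_max handling described at the end of the patch can be sketched from user space as follows. This is a hedged illustration, not code from the series: the helper names `parse_threshold_max`, `read_threshold_max` and `effective_threshold` are hypothetical, the file is assumed to contain a single hexadecimal value as in the example output above, and the clamping is modelled here only to show what the text says the kernel does silently.

```python
from pathlib import Path

# The patch text says the threshold config field is limited to 12 bits.
THRESHOLD_FIELD_BITS = 12
THRESHOLD_FIELD_MAX = (1 << THRESHOLD_FIELD_BITS) - 1  # 4095


def parse_threshold_max(text):
    """Parse the caps file contents, e.g. '0x000000ff' (assumed to be hex)."""
    return int(text.strip(), 16)


def read_threshold_max(pmu="armv8_pmuv3"):
    """Read threshold_max for a PMU; the path follows the example in the patch."""
    path = Path("/sys/bus/event_source/devices") / pmu / "caps" / "threshold_max"
    return parse_threshold_max(path.read_text())


def effective_threshold(requested, threshold_max):
    """Model which threshold would take effect for a requested value.

    Raises ValueError above 4095, mirroring the statement that the Perf
    tool refuses to parse such values. Returns None when threshold_max is
    0, i.e. the feature is unsupported and the threshold is ignored.
    """
    if not 0 <= requested <= THRESHOLD_FIELD_MAX:
        raise ValueError(f"threshold must be in 0..{THRESHOLD_FIELD_MAX}")
    if threshold_max == 0:
        return None
    # Values above the PMU's maximum are described as silently clamped.
    return min(requested, threshold_max)
```

Checking the value up front in a script avoids being surprised by the silent clamp: with the example maximum of 0xff, a requested threshold of 300 would take effect as 255 in this model.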