Message ID | 20230330164556.31533-6-Jonathan.Cameron@huawei.com |
---|---|
State | New |
Headers |
From: Jonathan Cameron <Jonathan.Cameron@huawei.com> To: Liang Kan <kan.liang@linux.intel.com>, <linux-cxl@vger.kernel.org>, <peterz@infradead.org> CC: <mingo@redhat.com>, <acme@kernel.org>, <mark.rutland@arm.com>, <will@kernel.org>, <dan.j.williams@intel.com>, <linuxarm@huawei.com>, <linux-perf-users@vger.kernel.org>, <linux-kernel@vger.kernel.org>, Davidlohr Bueso <dave@stgolabs.net>, Dave Jiang <dave.jiang@intel.com> Subject: [PATCH v4 5/5] docs: perf: Minimal introduction the the CXL PMU device and driver Date: Thu, 30 Mar 2023 17:45:56 +0100 Message-ID: <20230330164556.31533-6-Jonathan.Cameron@huawei.com> In-Reply-To: <20230330164556.31533-1-Jonathan.Cameron@huawei.com> References: <20230330164556.31533-1-Jonathan.Cameron@huawei.com> List-ID: <linux-kernel.vger.kernel.org> |
Series |
CXL 3.0 Performance Monitoring Unit support
|
|
Commit Message
Jonathan Cameron
March 30, 2023, 4:45 p.m. UTC
Very basic introduction to the device and the current driver support
provided. I expect to expand on this in future versions of this patch
set.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

--
v4: No change
---
 Documentation/admin-guide/perf/cxl.rst   | 65 ++++++++++++++++++++++++
 Documentation/admin-guide/perf/index.rst |  1 +
 2 files changed, 66 insertions(+)
Comments
On 2023-03-30 12:45 p.m., Jonathan Cameron wrote:
> Very basic introduction to the device and the current driver support
> provided. I expect to expand on this in future versions of this patch
> set.
>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> --
> v4: No change
> ---
>  Documentation/admin-guide/perf/cxl.rst   | 65 ++++++++++++++++++++++++
>  Documentation/admin-guide/perf/index.rst |  1 +
>  2 files changed, 66 insertions(+)
>
> diff --git a/Documentation/admin-guide/perf/cxl.rst b/Documentation/admin-guide/perf/cxl.rst
> new file mode 100644
> index 000000000000..46235dff4b21
> --- /dev/null
> +++ b/Documentation/admin-guide/perf/cxl.rst
> @@ -0,0 +1,65 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +======================================
> +CXL Performance Monitoring Unit (CPMU)
> +======================================
> +
> +The CXL rev 3.0 specification provides a definition of CXL Performance
> +Monitoring Unit in section 13.2: Performance Monitoring.
> +
> +CXL components (e.g. Root Port, Switch Upstream Port, End Point) may have
> +any number of CPMU instances. CPMU capabilities are fully discoverable from
> +the devices. The specification provides event definitions for all CXL protocol
> +message types and a set of additional events for things commonly counted on
> +CXL devices (e.g. DRAM events).
> +
> +CPMU driver
> +===========
> +
> +The CPMU driver register a perf PMU with the name cpmu<id> on the CXL bus.
> +
> +  /sys/bus/cxl/device/cpmu<id>
> +
> +The associated PMU is registered as
> +
> +  /sys/bus/event_sources/devices/cpmu<id>
> +
> +In common with other CXL bus devices, the id has no specific meaning and the
> +relationship to specific CXL device should be established via the device parent
> +of the device on the CXL bus.
> +
> +PMU driver provides description of available events and filter options in sysfs.
> +
> +The "format" directory describes all formats of the config (event vendor id,
> +group id and mask) config1 (threshold, filter enables) and config2 (filter
> +parameters) fields of the perf_event_attr structure. The "events" directory
> +describes all documented events show in perf list.
> +
> +The events shown in perf list are the most fine grained events with a single
> +bit of the event mask set. More general events may be enable by setting
> +multiple mask bits in config. For example, all Device to Host Read Requests
> +may be captured on a single counter by setting the bits for all of
> +
> +* d2h_req_rdcurr
> +* d2h_req_rdown
> +* d2h_req_rdshared
> +* d2h_req_rdany
> +* d2h_req_rdownnodata
> +
> +Example of usage::
> +
> +  $#perf list
> +  cpmu0/clock_ticks/                        [Kernel PMU event]
> +  cpmu0/d2h_req_itomwr/                     [Kernel PMU event]
> +  cpmu0/d2h_req_rdany/                      [Kernel PMU event]
> +  cpmu0/d2h_req_rdcurr/                     [Kernel PMU event]
> +  -----------------------------------------------------------
> +
> +  $# perf stat -e cpmu0/clock_ticks/ -e cpmu0/d2h_req_itowrm/
> +
> +Vendor specific events may also be available and if so can be used via
> +
> +  $# perf stat -e cpmu0/vid=VID,gid=GID,mask=MASK/
> +
> +The driver does not support sampling. So "perf record" and attaching to
> +a task are unsupported.

The PMU only supports system-wide counting. That's the reason it doesn't
support per-task profiling. Not because of missing sampling.

Thanks,
Kan

> diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
> index 9de64a40adab..f60be04e4e33 100644
> --- a/Documentation/admin-guide/perf/index.rst
> +++ b/Documentation/admin-guide/perf/index.rst
> @@ -21,3 +21,4 @@ Performance monitor support
>     alibaba_pmu
>     nvidia-pmu
>     meson-ddr-pmu
> +   cxl
On Mon, 3 Apr 2023 13:45:52 -0400
"Liang, Kan" <kan.liang@linux.intel.com> wrote:

> On 2023-03-30 12:45 p.m., Jonathan Cameron wrote:
> > Very basic introduction to the device and the current driver support
> > provided. I expect to expand on this in future versions of this patch
> > set.
> >
> > Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >
> > --
> > v4: No change
> > ---
> >  Documentation/admin-guide/perf/cxl.rst   | 65 ++++++++++++++++++++++++
> >  Documentation/admin-guide/perf/index.rst |  1 +
> >  2 files changed, 66 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/perf/cxl.rst b/Documentation/admin-guide/perf/cxl.rst
> > new file mode 100644
> > index 000000000000..46235dff4b21
> > --- /dev/null
> > +++ b/Documentation/admin-guide/perf/cxl.rst
> > @@ -0,0 +1,65 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +======================================
> > +CXL Performance Monitoring Unit (CPMU)
> > +======================================
> > +
> > +The CXL rev 3.0 specification provides a definition of CXL Performance
> > +Monitoring Unit in section 13.2: Performance Monitoring.
> > +
> > +CXL components (e.g. Root Port, Switch Upstream Port, End Point) may have
> > +any number of CPMU instances. CPMU capabilities are fully discoverable from
> > +the devices. The specification provides event definitions for all CXL protocol
> > +message types and a set of additional events for things commonly counted on
> > +CXL devices (e.g. DRAM events).
> > +
> > +CPMU driver
> > +===========
> > +
> > +The CPMU driver register a perf PMU with the name cpmu<id> on the CXL bus.
> > +
> > +  /sys/bus/cxl/device/cpmu<id>
> > +
> > +The associated PMU is registered as
> > +
> > +  /sys/bus/event_sources/devices/cpmu<id>
> > +
> > +In common with other CXL bus devices, the id has no specific meaning and the
> > +relationship to specific CXL device should be established via the device parent
> > +of the device on the CXL bus.
> > +
> > +PMU driver provides description of available events and filter options in sysfs.
> > +
> > +The "format" directory describes all formats of the config (event vendor id,
> > +group id and mask) config1 (threshold, filter enables) and config2 (filter
> > +parameters) fields of the perf_event_attr structure. The "events" directory
> > +describes all documented events show in perf list.
> > +
> > +The events shown in perf list are the most fine grained events with a single
> > +bit of the event mask set. More general events may be enable by setting
> > +multiple mask bits in config. For example, all Device to Host Read Requests
> > +may be captured on a single counter by setting the bits for all of
> > +
> > +* d2h_req_rdcurr
> > +* d2h_req_rdown
> > +* d2h_req_rdshared
> > +* d2h_req_rdany
> > +* d2h_req_rdownnodata
> > +
> > +Example of usage::
> > +
> > +  $#perf list
> > +  cpmu0/clock_ticks/                        [Kernel PMU event]
> > +  cpmu0/d2h_req_itomwr/                     [Kernel PMU event]
> > +  cpmu0/d2h_req_rdany/                      [Kernel PMU event]
> > +  cpmu0/d2h_req_rdcurr/                     [Kernel PMU event]
> > +  -----------------------------------------------------------
> > +
> > +  $# perf stat -e cpmu0/clock_ticks/ -e cpmu0/d2h_req_itowrm/
> > +
> > +Vendor specific events may also be available and if so can be used via
> > +
> > +  $# perf stat -e cpmu0/vid=VID,gid=GID,mask=MASK/
> > +
> > +The driver does not support sampling. So "perf record" and attaching to
> > +a task are unsupported.
>
> The PMU only supports system-wide counting. That's the reason it doesn't
> support per-task profiling. Not because of missing sampling.

Ah. I've managed to fuse two different conditions. I'll break them apart
for v5.

Thanks,

Jonathan

>
> Thanks,
> Kan
> > diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
> > index 9de64a40adab..f60be04e4e33 100644
> > --- a/Documentation/admin-guide/perf/index.rst
> > +++ b/Documentation/admin-guide/perf/index.rst
> > @@ -21,3 +21,4 @@ Performance monitor support
> >     alibaba_pmu
> >     nvidia-pmu
> >     meson-ddr-pmu
> > +   cxl
>
Jonathan Cameron wrote:
> Very basic introduction to the device and the current driver support
> provided. I expect to expand on this in future versions of this patch
> set.
>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> --
> v4: No change
> ---
>  Documentation/admin-guide/perf/cxl.rst   | 65 ++++++++++++++++++++++++
>  Documentation/admin-guide/perf/index.rst |  1 +
>  2 files changed, 66 insertions(+)
>
> diff --git a/Documentation/admin-guide/perf/cxl.rst b/Documentation/admin-guide/perf/cxl.rst
> new file mode 100644
> index 000000000000..46235dff4b21
> --- /dev/null
> +++ b/Documentation/admin-guide/perf/cxl.rst
> @@ -0,0 +1,65 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +======================================
> +CXL Performance Monitoring Unit (CPMU)
> +======================================
> +
> +The CXL rev 3.0 specification provides a definition of CXL Performance
> +Monitoring Unit in section 13.2: Performance Monitoring.
> +
> +CXL components (e.g. Root Port, Switch Upstream Port, End Point) may have
> +any number of CPMU instances. CPMU capabilities are fully discoverable from
> +the devices. The specification provides event definitions for all CXL protocol
> +message types and a set of additional events for things commonly counted on
> +CXL devices (e.g. DRAM events).
> +
> +CPMU driver
> +===========
> +
> +The CPMU driver register a perf PMU with the name cpmu<id> on the CXL bus.

s/register/registers/

> +
> +  /sys/bus/cxl/device/cpmu<id>
> +
> +The associated PMU is registered as
> +
> +  /sys/bus/event_sources/devices/cpmu<id>
> +
> +In common with other CXL bus devices, the id has no specific meaning and the
> +relationship to specific CXL device should be established via the device parent
> +of the device on the CXL bus.

So I went to go add some text about how to identify PMUs in a persistent
manner from one boot to the next. For CXL memdevs this is done by the
'serial' attribute which is always stable regardless of the device init
order. That's harder to get to from the pmu device because it may be
associated with a device that does not have a memdev.

I think it's also going to be frustrating for userspace to see
randomized pmu ids across devices since that probing will happen in
parallel. So how about:

1/ Add serial as an attribute for each PMU to export

2/ Change the device name format to be "pmuX.Y" where X can just reuse
the memdev id for endpoints and be another value for switches, and Y is
guaranteed to be 0-based and in hardware discovery order.

...with that, someone can write a udev script that can persistently
identify PMU[Y] on device[serial] each boot.

That also cleans up a /sys/bus/cxl/devices listing to make it clear
which pmu instances belong together.

> +
> +PMU driver provides description of available events and filter options in sysfs.
> +
> +The "format" directory describes all formats of the config (event vendor id,
> +group id and mask) config1 (threshold, filter enables) and config2 (filter
> +parameters) fields of the perf_event_attr structure. The "events" directory
> +describes all documented events show in perf list.
> +
> +The events shown in perf list are the most fine grained events with a single
> +bit of the event mask set. More general events may be enable by setting
> +multiple mask bits in config. For example, all Device to Host Read Requests
> +may be captured on a single counter by setting the bits for all of
> +
> +* d2h_req_rdcurr
> +* d2h_req_rdown
> +* d2h_req_rdshared
> +* d2h_req_rdany
> +* d2h_req_rdownnodata
> +
> +Example of usage::
> +
> +  $#perf list
> +  cpmu0/clock_ticks/                        [Kernel PMU event]
> +  cpmu0/d2h_req_itomwr/                     [Kernel PMU event]
> +  cpmu0/d2h_req_rdany/                      [Kernel PMU event]
> +  cpmu0/d2h_req_rdcurr/                     [Kernel PMU event]
> +  -----------------------------------------------------------
> +
> +  $# perf stat -e cpmu0/clock_ticks/ -e cpmu0/d2h_req_itowrm/

Ah here's the examples I was looking for in the last patch, nice.

> +
> +Vendor specific events may also be available and if so can be used via
> +
> +  $# perf stat -e cpmu0/vid=VID,gid=GID,mask=MASK/
> +
> +The driver does not support sampling. So "perf record" and attaching to
> +a task are unsupported.

Is this a common restriction for CPU-external pmus, or do you see
sampling support required to get this upstream?
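The parent-based association described in the quoted documentation can be sketched in Python as below. This is only an illustration, not part of the patch: the helper name `cpmu_parent` is hypothetical, and the default directory `/sys/bus/cxl/devices` is an assumption taken from the listing path mentioned in this review (the quoted text spells it `/sys/bus/cxl/device`).

```python
import os


def cpmu_parent(cpmu_name, bus_dir="/sys/bus/cxl/devices"):
    """Return the resolved sysfs path of the device that parents a CPMU.

    Hypothetical helper: bus_dir is an assumed location. Entries under a
    sysfs bus directory are symlinks into the device tree, so resolving
    the link and taking the containing directory yields the parent.
    """
    link = os.path.join(bus_dir, cpmu_name)
    real = os.path.realpath(link)
    return os.path.dirname(real)
```

A caller would then inspect the returned directory (for example its name, which for a PCI parent is the bus address) to work out which device a given instance belongs to.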
>
> > +
> > +  /sys/bus/cxl/device/cpmu<id>
> > +
> > +The associated PMU is registered as
> > +
> > +  /sys/bus/event_sources/devices/cpmu<id>
> > +
> > +In common with other CXL bus devices, the id has no specific meaning and the
> > +relationship to specific CXL device should be established via the device parent
> > +of the device on the CXL bus.
>
> So I went to go add some text about how to identify PMUs in a persistent
> manner from one boot to the next. For CXL memdevs this is done by the
> 'serial' attribute which is always stable regardless of the device init
> order. That's harder to get to from the pmu device because it may be
> associated with a device that does not have a memdev.
>
> I think it's also going to be frustrating for userspace to see
> randomized pmu ids across devices since that probing will happen in
> parallel. So how about:

Solving this in general for perf PMU drivers was what the parent device
thing was about. There is an argument that enabling any other path to
get to this association is both unnecessary and just possibly unwise.

The nice advantage of just using an IDA and relying on parentage for the
association was that I could avoid naming questions for all the other
places these might turn up in a CXL topology. The lazy / efficient
option ;)

You can now see exactly which PCI device a given instance is associated
with. Custom ABI is going to be harder for anyone to use than that.

I suppose we can potentially enable both paths - but it's not quite as
straightforward as you suggest.

>
> 1/ Add serial as an attribute for each PMU to export

Where does it come from? We only have one source of serial number per
device. That's nowhere near enough to work out where a PMU is.

> 2/ Change the device name format to be "pmuX.Y" where X can just reuse

Could use something a little more detailed on the cxl bus, but the one
registered and used to address this in bus/event_sources needs to be cxl
specific, so a cxl_ prefix is needed I think.

Given we need to namespace what the ids refer to, I'm currently going
with

pmu_memX.Y
pmu_dspX.Y.Z
pmu_uspX.Y

on the cxl bus and

cxl_pmu_memX.Y
cxl_pmu_dspX.Y.Z
cxl_pmu_uspX.Y

on the event sources bus. (Z is needed because dsps are indexed from 0
for each usp.)

We can figure out what to do about other CXL EPs later and for now at
least there is no way to hang a CPMU instance off a host bridge (nothing
in CEDT to tell you where to find it).

I've had a fun day hacking PMUs onto the other emulated CXL devices to
test this. There is a can of worms I'll avoid for this series by just
sticking to type3 device PMUs for now. I have no idea yet how we handle
the interrupts safely for ports, as those interrupts are under the
control of the pcie port driver, not the CXL dport one. At some point
I'll send out an RFC about that if no one gets to it before me.

For now I've hacked portdrv to always allocate max vectors and am
ignoring the lovely back traces due to things getting torn down in the
wrong order on shutdown. For upstream ports I've hacked portdrv to
pretend it knows there is something to handle. As a starting point I
think we'll need to teach portdrv enough about CXL to be able to tell if
it should provide interrupt services.

Hence I'll keep the code to register the other PMUs for a future patch
set and just make sure the code is structured to enable that in this
series.

> the memdev id for endpoints and be another value for switches, and Y is
> guaranteed to be 0-based and in hardware discovery order.

Also need to change registration order as PMUs were registered before
the memdev, but that's easy enough to do.

>
> ...with that, someone can write a udev script that can persistently
> identify PMU[Y] on device[serial] each boot.
>
> That also cleans up a /sys/bus/cxl/devices listing to make it clear
> which pmu instances belong together.
>
> > +
> > +PMU driver provides description of available events and filter options in sysfs.
> > +
> > +The "format" directory describes all formats of the config (event vendor id,
> > +group id and mask) config1 (threshold, filter enables) and config2 (filter
> > +parameters) fields of the perf_event_attr structure. The "events" directory
> > +describes all documented events show in perf list.
> > +
> > +The events shown in perf list are the most fine grained events with a single
> > +bit of the event mask set. More general events may be enable by setting
> > +multiple mask bits in config. For example, all Device to Host Read Requests
> > +may be captured on a single counter by setting the bits for all of
> > +
> > +* d2h_req_rdcurr
> > +* d2h_req_rdown
> > +* d2h_req_rdshared
> > +* d2h_req_rdany
> > +* d2h_req_rdownnodata
> > +
> > +Example of usage::
> > +
> > +  $#perf list
> > +  cpmu0/clock_ticks/                        [Kernel PMU event]
> > +  cpmu0/d2h_req_itomwr/                     [Kernel PMU event]
> > +  cpmu0/d2h_req_rdany/                      [Kernel PMU event]
> > +  cpmu0/d2h_req_rdcurr/                     [Kernel PMU event]
> > +  -----------------------------------------------------------
> > +
> > +  $# perf stat -e cpmu0/clock_ticks/ -e cpmu0/d2h_req_itowrm/
>
> Ah here's the examples I was looking for in the last patch, nice.
>
> > +
> > +Vendor specific events may also be available and if so can be used via
> > +
> > +  $# perf stat -e cpmu0/vid=VID,gid=GID,mask=MASK/
> > +
> > +The driver does not support sampling. So "perf record" and attaching to
> > +a task are unsupported.
>
> Is this a common restriction for CPU-external pmus, or do you see
> sampling support required to get this upstream?

It's a common restriction. Whilst we could potentially implement
sampling based on the presence of a suitable clock_ticks event, I don't
see it as a requirement initially.

Jonathan
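As a rough illustration of how userspace might consume the naming scheme floated in this reply (pmu_memX.Y, pmu_dspX.Y.Z, pmu_uspX.Y, optionally with a cxl_ prefix), here is a minimal Python sketch. The scheme was only a proposal at this point in the thread, the function name `parse_pmu_name` is hypothetical, and the sketch deliberately returns the numeric ids without assigning a meaning to each position.

```python
import re

# Assumed name shapes, taken from the proposal in this thread only.
_NAME_RE = re.compile(r"^(?:cxl_)?pmu_(mem|usp|dsp)(\d+(?:\.\d+)*)$")
# mem and usp names carry two ids (X.Y); dsp names carry three (X.Y.Z).
_EXPECTED_IDS = {"mem": 2, "usp": 2, "dsp": 3}


def parse_pmu_name(name):
    """Split a proposed-style PMU name into (kind, ids), or return None."""
    m = _NAME_RE.match(name)
    if m is None:
        return None
    kind = m.group(1)
    ids = tuple(int(part) for part in m.group(2).split("."))
    if len(ids) != _EXPECTED_IDS[kind]:
        return None
    return kind, ids
```

For instance, `parse_pmu_name("cxl_pmu_dsp1.0.2")` gives the kind `"dsp"` and the three ids, while an old-style name such as `cpmu0` is rejected.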
diff --git a/Documentation/admin-guide/perf/cxl.rst b/Documentation/admin-guide/perf/cxl.rst
new file mode 100644
index 000000000000..46235dff4b21
--- /dev/null
+++ b/Documentation/admin-guide/perf/cxl.rst
@@ -0,0 +1,65 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================
+CXL Performance Monitoring Unit (CPMU)
+======================================
+
+The CXL rev 3.0 specification provides a definition of CXL Performance
+Monitoring Unit in section 13.2: Performance Monitoring.
+
+CXL components (e.g. Root Port, Switch Upstream Port, End Point) may have
+any number of CPMU instances. CPMU capabilities are fully discoverable from
+the devices. The specification provides event definitions for all CXL protocol
+message types and a set of additional events for things commonly counted on
+CXL devices (e.g. DRAM events).
+
+CPMU driver
+===========
+
+The CPMU driver register a perf PMU with the name cpmu<id> on the CXL bus.
+
+  /sys/bus/cxl/device/cpmu<id>
+
+The associated PMU is registered as
+
+  /sys/bus/event_sources/devices/cpmu<id>
+
+In common with other CXL bus devices, the id has no specific meaning and the
+relationship to specific CXL device should be established via the device parent
+of the device on the CXL bus.
+
+PMU driver provides description of available events and filter options in sysfs.
+
+The "format" directory describes all formats of the config (event vendor id,
+group id and mask) config1 (threshold, filter enables) and config2 (filter
+parameters) fields of the perf_event_attr structure. The "events" directory
+describes all documented events show in perf list.
+
+The events shown in perf list are the most fine grained events with a single
+bit of the event mask set. More general events may be enable by setting
+multiple mask bits in config. For example, all Device to Host Read Requests
+may be captured on a single counter by setting the bits for all of
+
+* d2h_req_rdcurr
+* d2h_req_rdown
+* d2h_req_rdshared
+* d2h_req_rdany
+* d2h_req_rdownnodata
+
+Example of usage::
+
+  $#perf list
+  cpmu0/clock_ticks/                        [Kernel PMU event]
+  cpmu0/d2h_req_itomwr/                     [Kernel PMU event]
+  cpmu0/d2h_req_rdany/                      [Kernel PMU event]
+  cpmu0/d2h_req_rdcurr/                     [Kernel PMU event]
+  -----------------------------------------------------------
+
+  $# perf stat -e cpmu0/clock_ticks/ -e cpmu0/d2h_req_itowrm/
+
+Vendor specific events may also be available and if so can be used via
+
+  $# perf stat -e cpmu0/vid=VID,gid=GID,mask=MASK/
+
+The driver does not support sampling. So "perf record" and attaching to
+a task are unsupported.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 9de64a40adab..f60be04e4e33 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -21,3 +21,4 @@ Performance monitor support
    alibaba_pmu
    nvidia-pmu
    meson-ddr-pmu
+   cxl
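
The documentation text in this patch says a more general event can be counted by setting several mask bits in config. A minimal Python sketch of that idea follows. It is not taken from the driver: the function name `combined_event` is hypothetical, and the vendor id, group id and per-event mask values passed in by a caller are placeholders that would have to be read from the PMU's sysfs "events" and "format" directories.

```python
def combined_event(pmu, vid, gid, masks):
    """Build a perf event string that ORs several single-bit event masks.

    Hypothetical helper: vid, gid and masks are caller-supplied values,
    not constants known to this sketch.
    """
    mask = 0
    for m in masks:
        mask |= m  # each fine-grained event contributes one mask bit
    return f"{pmu}/vid={vid:#x},gid={gid:#x},mask={mask:#x}/"
```

The resulting string follows the `cpmu0/vid=VID,gid=GID,mask=MASK/` form shown in the patch and could be passed to `perf stat -e`, assuming the PMU accepts raw vid/gid/mask terms for the events in question.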