Message ID | 20231017013235.27831-2-xueshuai@linux.alibaba.com |
---|---|
State | New |
Headers | From: Shuai Xue <xueshuai@linux.alibaba.com> To: chengyou@linux.alibaba.com, kaishen@linux.alibaba.com, helgaas@kernel.org, yangyicong@huawei.com, will@kernel.org, Jonathan.Cameron@huawei.com, baolin.wang@linux.alibaba.com, robin.murphy@arm.com Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, rdunlap@infradead.org, mark.rutland@arm.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com, renyu.zj@linux.alibaba.com Subject: [PATCH v8 1/4] docs: perf: Add description for Synopsys DesignWare PCIe PMU driver Date: Tue, 17 Oct 2023 09:32:32 +0800 Message-Id: <20231017013235.27831-2-xueshuai@linux.alibaba.com> In-Reply-To: <20231017013235.27831-1-xueshuai@linux.alibaba.com> References: <20231017013235.27831-1-xueshuai@linux.alibaba.com> List-ID: <linux-kernel.vger.kernel.org> |
Series | drivers/perf: add Synopsys DesignWare PCIe PMU driver support |
Commit Message
Shuai Xue
Oct. 17, 2023, 1:32 a.m. UTC
Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe
controller which implements which implements PMU for performance and
functional debugging to facilitate system maintenance.

Document it to provide guidance on how to use it.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 .../admin-guide/perf/dwc_pcie_pmu.rst         | 94 +++++++++++++++++++
 Documentation/admin-guide/perf/index.rst      |  1 +
 2 files changed, 95 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst
Comments
On Tue, 17 Oct 2023 09:32:32 +0800
Shuai Xue <xueshuai@linux.alibaba.com> wrote:

> Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe
> controller which implements which implements PMU for performance and
> functional debugging to facilitate system maintenance.
>
> Document it to provide guidance on how to use it.
>
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

A few minor things inline and one question that I'd like a comment on
for my understanding at least! (why not multiply the counter by 16 and
make the maths simpler?)

With those tidied up,
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks,

Jonathan

> ---
>  .../admin-guide/perf/dwc_pcie_pmu.rst         | 94 +++++++++++++++++++
>  Documentation/admin-guide/perf/index.rst      |  1 +
>  2 files changed, 95 insertions(+)
>  create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst
>
> diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst
> new file mode 100644
> index 000000000000..eac1b6f36450
> --- /dev/null
> +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst
> @@ -0,0 +1,94 @@
> +======================================================================
> +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU)
> +======================================================================
> +
> +DesignWare Cores (DWC) PCIe PMU
> +===============================
> +
> +The PMU is a PCIe configuration space register block provided by each PCIe Root
> +Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
> +injection, and Statistics).
> +
> +As the name indicates, the RAS DES capability supports system level
> +debugging, AER error injection, and collection of statistics. To facilitate
> +collection of statistics, Synopsys DesignWare Cores PCIe controller
> +provides the following two features:
> +
> +- one 64-bit counter for Time Based Analysis (RX/TX data throughput and
> +  time spent in each low-power LTSSM state) and
> +- one 32-bit counter for Event Counting (error and non-error events for
> +  a specified lane)
> +
> +Note: There is no interrupt for counter overflow.
> +
> +Time Based Analysis
> +-------------------
> +
> +Using this feature you can obtain information regarding RX/TX data
> +throughput and time spent in each low-power LTSSM state by the controller.
> +The PMU measures data in two categories:
> +
> +- Group#0: Percentage of time the controller stays in LTSSM states.
> +- Group#1: Amount of data processed (Units of 16 bytes).
> +
> +Lane Event counters
> +-------------------
> +
> +Using this feature you can obtain Error and Non-Error information in
> +specific lane by the controller. The PMU event is select by:
> +
> +- Group i
> +- Event j within the Group i
> +- and Lane k
The and here is a little confusing. I'd rework as
The PMU event is selected by all of:
- Group i
- Event j within the Group i
- Lane k

> +
> +Some of the event only exist for specific configurations.

events

> +
> +DesignWare Cores (DWC) PCIe PMU Driver
> +=======================================
> +
> +This driver adds PMU devices for each PCIe Root Port named based on the BDF of
> +the Root Port. For example,
> +
> +    30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
> +
> +the PMU device name for this Root Port is dwc_rootport_3018.
> +
> +The DWC PCIe PMU driver registers a perf PMU driver, which provides
> +description of available events and configuration options in sysfs, see
> +/sys/bus/event_source/devices/dwc_rootport_{bdf}.
> +
> +The "format" directory describes format of the config fields of the
> +perf_event_attr structure. The "events" directory provides configuration
> +templates for all documented events. For example,
> +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1".
> +
> +The "perf list" command shall list the available events from sysfs, e.g.::
> +
> +    $# perf list | grep dwc_rootport
> +    <...>
> +    dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
> +    <...>
> +    dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event]
> +
> +Time Based Analysis Event Usage
> +-------------------------------
> +
> +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes)::
> +
> +    $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
> +
> +The average RX/TX bandwidth can be calculated using the following formula:
> +
> +    PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window
> +    PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window

Silly question (sorry I didn't raise it earlier) but can we make the interface
more intuitive by just multiplying the counter value at point of read by 16?

> +
> +Lane Event Usage
> +-------------------------------
> +
> +Each lane has the same event set and to avoid generating a list of hundreds
> +of events, the user need to specify the lane ID explicitly, e.g.::
> +
> +    $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/
> +
> +The driver does not support sampling, therefore "perf record" will not
> +work. Per-task (without "-a") perf sessions are not supported.
> diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
> index f60be04e4e33..6bc7739fddb5 100644
> --- a/Documentation/admin-guide/perf/index.rst
> +++ b/Documentation/admin-guide/perf/index.rst
> @@ -19,6 +19,7 @@ Performance monitor support
>     arm_dsu_pmu
>     thunderx2-pmu
>     alibaba_pmu
> +   dwc_pcie_pmu
>     nvidia-pmu
>     meson-ddr-pmu
>     cxl
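For readers following the bandwidth formula quoted in the patch (counter value times 16 bytes, divided by the measurement window), the arithmetic can be sketched as below. This is only an illustration under the v8 behaviour, where the raw count is in units of 16 bytes; the function name and parameters are hypothetical and do not come from the driver or from perf.

```python
def pcie_bandwidth_bytes_per_sec(counter_value: int,
                                 window_seconds: float,
                                 unit_bytes: int = 16) -> float:
    """Average bandwidth from a Group#1 counter reading.

    counter_value  -- raw count reported by perf stat (hypothetical input)
    window_seconds -- length of the measurement window in seconds
    unit_bytes     -- bytes per counter unit; 16 per the quoted documentation
    """
    if window_seconds <= 0:
        raise ValueError("measurement window must be positive")
    # PCIe Bandwidth = PCIE_DATA * 16B / Measure_Time_Window
    return counter_value * unit_bytes / window_seconds


if __name__ == "__main__":
    # Example: 1,000,000 units counted over a 2 second window.
    print(pcie_bandwidth_bytes_per_sec(1_000_000, 2.0))
```

If the driver ends up scaling the value at read time, as asked in the review, `unit_bytes=1` would be the matching setting.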
On 2023/10/17 17:16, Jonathan Cameron wrote:
> On Tue, 17 Oct 2023 09:32:32 +0800
> Shuai Xue <xueshuai@linux.alibaba.com> wrote:
>
>> Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe
>> controller which implements which implements PMU for performance and
>> functional debugging to facilitate system maintenance.
>>
>> Document it to provide guidance on how to use it.
>>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>
> A few minor things inline and one question that I'd like a comment on
> for my understanding at least! (why not multiply the counter by 16 and
> make the maths simpler?)
>
> With those tidied up,
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>

Thank you for providing prompt feedback and valuable comments to me.
(please also see my replies inline)

Best Regards,
Shuai

...

>> +Lane Event counters
>> +-------------------
>> +
>> +Using this feature you can obtain Error and Non-Error information in
>> +specific lane by the controller. The PMU event is select by:
>> +
>> +- Group i
>> +- Event j within the Group i
>> +- and Lane k
> The and here is a little confusing. I'd rework as
> The PMU event is selected by all of:
> - Group i
> - Event j within the Group i
> - Lane k

Will rework it in next version.

>
>> +
>> +Some of the event only exist for specific configurations.
>
> events

Sorry for typo, will fix it.

...

>> +The average RX/TX bandwidth can be calculated using the following formula:
>> +
>> +    PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window
>> +    PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window
>
> Silly question (sorry I didn't raise it earlier) but can we make the interface
> more intuitive by just multiplying the counter value at point of read by 16?

Really a good suggestion, and it is very convenient for end perf users.
But the unit of 16 is only applied to group#1 as described in Time Based Analysis
section. So I prefer to left the unit part to end users.
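The quoted documentation names the PMU after the Root Port's BDF and gives one example: `30:03.0` becomes `dwc_rootport_3018`. That example is consistent with the usual 16-bit PCI encoding (bus in the upper byte, device in bits 7:3, function in bits 2:0) printed as lowercase hex. The sketch below assumes that encoding and that formatting; the helper name is hypothetical, and the exact format string used by the driver is not shown in this thread.

```python
def dwc_pmu_name(bdf: str) -> str:
    """Derive the expected PMU name from an lspci-style BDF string.

    Accepts "bus:dev.fn" with an optional leading PCI domain
    ("0000:30:03.0"); the domain is ignored here, which is an assumption.
    """
    bus_str, devfn_str = bdf.split(":")[-2:]
    dev_str, fn_str = devfn_str.split(".")
    bus, dev, fn = int(bus_str, 16), int(dev_str, 16), int(fn_str, 16)
    # Standard PCI encoding: bus[15:8] | device[7:3] | function[2:0]
    devid = (bus << 8) | (dev << 3) | fn
    return f"dwc_rootport_{devid:x}"


if __name__ == "__main__":
    # Matches the example in the quoted documentation.
    print(dwc_pmu_name("30:03.0"))
```

The resulting name is what would be passed to `perf stat -e` and what would appear under `/sys/bus/event_source/devices/`.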
On 2023/10/17 9:32, Shuai Xue wrote:
> Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe
> controller which implements which implements PMU for performance and
> functional debugging to facilitate system maintenance.

Double "which implements"?

>
> Document it to provide guidance on how to use it.
>
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Others look good to me.

Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>

...
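The documentation under review notes that the lane event counter is 32 bits wide and that there is no overflow interrupt. One common way to cope with that, shown here purely as an illustration and not as what this driver does, is to sample the counter periodically and take the difference modulo 2^32, which stays correct as long as the counter wraps at most once between two samples. All names below are hypothetical.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def wrapped_delta(previous: int, current: int) -> int:
    """Events counted between two raw 32-bit samples.

    Correct only if the hardware counter wrapped at most once
    between the two reads.
    """
    return (current - previous) & COUNTER_MASK


def accumulate(samples: list[int]) -> int:
    """Sum the deltas over a sequence of raw samples into a wide total."""
    total = 0
    for prev, cur in zip(samples, samples[1:]):
        total += wrapped_delta(prev, cur)
    return total


if __name__ == "__main__":
    # The counter wraps between the second and third sample.
    print(accumulate([0xFFFFFF00, 0xFFFFFFF0, 0x00000010]))
```

The sampling interval has to be shorter than the fastest possible wrap time for the monitored event, otherwise counts are silently lost.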
On Wed, 18 Oct 2023 09:19:51 +0800
Shuai Xue <xueshuai@linux.alibaba.com> wrote:

> On 2023/10/17 17:16, Jonathan Cameron wrote:

...

> >> +The average RX/TX bandwidth can be calculated using the following formula:
> >> +
> >> +    PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window
> >> +    PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window
> >
> > Silly question (sorry I didn't raise it earlier) but can we make the interface
> > more intuitive by just multiplying the counter value at point of read by 16?
>
> Really a good suggestion, and it is very convenient for end perf users.
> But the unit of 16 is only applied to group#1 as described in Time Based Analysis
> section.

How hard would it be to just apply it to those events?
Userspace doesn't care what the hardware does underneath - it just wants to get
moderately intuitive data back. Having the end user deal with this oddity + even
the need to document it seems to me to be unnecessary burden given how simple it
is (I assume) to remove the oddity.

>
> So I prefer to left the unit part to end users.
>
On 2023/10/19 15:35, Yicong Yang wrote:
> On 2023/10/17 9:32, Shuai Xue wrote:
>> Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe
>> controller which implements which implements PMU for performance and
>> functional debugging to facilitate system maintenance.
>
> Double "which implements"?

Sorry for the typo, will fix it.

>
>>
>> Document it to provide guidance on how to use it.
>>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>
> Others look good to me.
>
> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
>

Thank you for valuable comments :)

Best Regards
Shuai

...
On 2023/10/19 19:06, Jonathan Cameron wrote:
...
>>>> +
>>>> +The DWC PCIe PMU driver registers a perf PMU driver, which provides
>>>> +description of available events and configuration options in sysfs, see
>>>> +/sys/bus/event_source/devices/dwc_rootport_{bdf}.
>>>> +
>>>> +The "format" directory describes format of the config fields of the
>>>> +perf_event_attr structure. The "events" directory provides configuration
>>>> +templates for all documented events. For example,
>>>> +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1".
>>>> +
>>>> +The "perf list" command shall list the available events from sysfs, e.g.::
>>>> +
>>>> +    $# perf list | grep dwc_rootport
>>>> +    <...>
>>>> +    dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
>>>> +    <...>
>>>> +    dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event]
>>>> +
>>>> +Time Based Analysis Event Usage
>>>> +-------------------------------
>>>> +
>>>> +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes)::
>>>> +
>>>> +    $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
>>>> +
>>>> +The average RX/TX bandwidth can be calculated using the following formula:
>>>> +
>>>> +    PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window
>>>> +    PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window
>>>
>>> Silly question (sorry I didn't raise it earlier) but can we make the interface
>>> more intuitive by just multiplying the counter value at point of read by 16?
>>
>> Really a good suggestion, and it is very convenient for end perf users.
>> But the unit of 16 is only applied to group#1 as described in Time Based Analysis
>> section.
>
> How hard would it be to just apply it to those events?
> Userspace doesn't care what the hardware does underneath - it just wants to get
> moderately intuitive data back. Having the end user deal with this oddity + even
> the need to document it seems to me to be unnecessary burden given how simple it
> is (I assume) to remove the oddity.

Ok. Talked me into it :) I will multiply the counter value at point of read by 16
for group#1 events.

Thank you.

Best Regards,
Shuai
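
For reference, the arithmetic under discussion can be sketched in a few lines
of Python. This is only an illustration of the documented formula
(count * 16B / Measure_Time_Window) for Group#1 events; the function names and
the sample numbers are hypothetical and are not part of the driver or of perf.

```python
# Hypothetical helpers illustrating the documented formula; not driver code.
UNIT_BYTES = 16  # Group#1 events count in units of 16 bytes, per the documentation


def counter_to_bytes(raw_count: int, unit_bytes: int = UNIT_BYTES) -> int:
    """Scale a raw Group#1 counter value to bytes."""
    if raw_count < 0:
        raise ValueError("counter value must be non-negative")
    return raw_count * unit_bytes


def average_bandwidth(raw_count: int, window_seconds: float) -> float:
    """Average bandwidth in bytes per second over the measurement window."""
    if window_seconds <= 0:
        raise ValueError("measurement window must be positive")
    return counter_to_bytes(raw_count) / window_seconds


if __name__ == "__main__":
    # Made-up sample: a raw count of 1000 observed over a 2 second window.
    print(average_bandwidth(1000, 2.0))
```

If the scaling is applied at the point of read as proposed, the value reported
to userspace would already be in bytes and only the division by the time
window would remain for the user.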
diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst
new file mode 100644
index 000000000000..eac1b6f36450
--- /dev/null
+++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst
@@ -0,0 +1,94 @@
+======================================================================
+Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU)
+======================================================================
+
+DesignWare Cores (DWC) PCIe PMU
+===============================
+
+The PMU is a PCIe configuration space register block provided by each PCIe Root
+Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
+injection, and Statistics).
+
+As the name indicates, the RAS DES capability supports system level
+debugging, AER error injection, and collection of statistics. To facilitate
+collection of statistics, Synopsys DesignWare Cores PCIe controller
+provides the following two features:
+
+- one 64-bit counter for Time Based Analysis (RX/TX data throughput and
+  time spent in each low-power LTSSM state) and
+- one 32-bit counter for Event Counting (error and non-error events for
+  a specified lane)
+
+Note: There is no interrupt for counter overflow.
+
+Time Based Analysis
+-------------------
+
+Using this feature you can obtain information regarding RX/TX data
+throughput and time spent in each low-power LTSSM state by the controller.
+The PMU measures data in two categories:
+
+- Group#0: Percentage of time the controller stays in LTSSM states.
+- Group#1: Amount of data processed (Units of 16 bytes).
+
+Lane Event counters
+-------------------
+
+Using this feature you can obtain Error and Non-Error information in
+specific lane by the controller. The PMU event is select by:
+
+- Group i
+- Event j within the Group i
+- and Lane k
+
+Some of the event only exist for specific configurations.
+
+DesignWare Cores (DWC) PCIe PMU Driver
+=======================================
+
+This driver adds PMU devices for each PCIe Root Port named based on the BDF of
+the Root Port. For example,
+
+    30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
+
+the PMU device name for this Root Port is dwc_rootport_3018.
+
+The DWC PCIe PMU driver registers a perf PMU driver, which provides
+description of available events and configuration options in sysfs, see
+/sys/bus/event_source/devices/dwc_rootport_{bdf}.
+
+The "format" directory describes format of the config fields of the
+perf_event_attr structure. The "events" directory provides configuration
+templates for all documented events. For example,
+"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1".
+
+The "perf list" command shall list the available events from sysfs, e.g.::
+
+    $# perf list | grep dwc_rootport
+    <...>
+    dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
+    <...>
+    dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event]
+
+Time Based Analysis Event Usage
+-------------------------------
+
+Example usage of counting PCIe RX TLP data payload (Units of 16 bytes)::
+
+    $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
+
+The average RX/TX bandwidth can be calculated using the following formula:
+
+    PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window
+    PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window
+
+Lane Event Usage
+-------------------------------
+
+Each lane has the same event set and to avoid generating a list of hundreds
+of events, the user need to specify the lane ID explicitly, e.g.::
+
+    $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/
+
+The driver does not support sampling, therefore "perf record" will not
+work. Per-task (without "-a") perf sessions are not supported.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index f60be04e4e33..6bc7739fddb5 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -19,6 +19,7 @@ Performance monitor support
    arm_dsu_pmu
    thunderx2-pmu
    alibaba_pmu
+   dwc_pcie_pmu
    nvidia-pmu
    meson-ddr-pmu
    cxl
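
The documentation names each PMU device after the BDF of its Root Port and
gives one example, 30:03.0 mapping to dwc_rootport_3018. A minimal Python
sketch of that mapping follows, assuming the suffix is the conventional 16-bit
PCI encoding (bus << 8 | device << 3 | function) printed in hex without
padding. That assumption matches the single documented example but is not
confirmed by the patch text shown here, and the helper name is hypothetical.

```python
import re

# Hypothetical helper; the encoding below is an assumption that happens to
# reproduce the documented example (30:03.0 -> dwc_rootport_3018).
_BDF_RE = re.compile(r"([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])")


def pmu_name_from_bdf(bdf: str) -> str:
    """Build the assumed PMU device name from a 'bb:dd.f' string."""
    match = _BDF_RE.fullmatch(bdf)
    if match is None:
        raise ValueError(f"not a bus:device.function string: {bdf!r}")
    bus = int(match.group(1), 16)
    device = int(match.group(2), 16)
    function = int(match.group(3), 16)
    if device > 0x1F:
        raise ValueError("PCI device number must fit in 5 bits")
    # Conventional packing: 8 bits of bus, 5 bits of device, 3 bits of function.
    value = (bus << 8) | (device << 3) | function
    return f"dwc_rootport_{value:x}"


if __name__ == "__main__":
    print(pmu_name_from_bdf("30:03.0"))
```

With a name in hand, the events for that Root Port would be looked up under
/sys/bus/event_source/devices/ as described in the documentation.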