Message ID | 20230614090710.680330-1-sandipan.das@amd.com |
---|---|
State | New |
Series | perf test: Retry without grouping for all metrics test |
Commit Message
Sandipan Das
June 14, 2023, 9:07 a.m. UTC
There are cases where a metric uses more events than the number of
counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four data fabric
counters but the "nps1_die_to_dram" metric has eight events. By default,
the constituent events are placed in a group. Since the events cannot be
scheduled at the same time, the metric is not computed. The all metrics
test also fails because of this.
Before announcing failure, the test can try multiple options for each
available metric. After system-wide mode fails, retry once again with
the "--metric-no-group" option.
E.g.
$ sudo perf test -v 100
Before:
100: perf all metrics test :
--- start ---
test child forked, pid 672731
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Metric 'nps1_die_to_dram' not printed in:
Error:
Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with -1
---- end ----
perf all metrics test: FAILED!
After:
100: perf all metrics test :
--- start ---
test child forked, pid 672887
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with 0
---- end ----
perf all metrics test: Ok
Reported-by: Ayush Jain <ayush.jain3@amd.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
tools/perf/tests/shell/stat_all_metrics.sh | 7 +++++++
1 file changed, 7 insertions(+)
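For reference, the retry this patch adds is equivalent to rerunning the failing metric by hand with event grouping disabled. A minimal sketch using the metric named in the commit message (counts and scheduling behaviour will vary by machine):

  $ perf stat -M nps1_die_to_dram -a sleep 0.01                     # default: events grouped, cannot be scheduled together
  $ perf stat -M nps1_die_to_dram --metric-no-group -a sleep 0.01   # the retry the test now attempts before failing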
Comments
Hello Sandipan,

Thank you for this patch.

On 6/14/2023 2:37 PM, Sandipan Das wrote:
> There are cases where a metric uses more events than the number of
> counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four data fabric
> counters but the "nps1_die_to_dram" metric has eight events. By default,
> the constituent events are placed in a group. Since the events cannot be
> scheduled at the same time, the metric is not computed. The all metrics
> test also fails because of this.
>
<SNIP>

The issue gets resolved after applying this patch:

$ ./perf test 102 -vvv
102: perf all metrics test :
--- start ---
test child forked, pid 244991
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with 0
---- end ----
perf all metrics test: Ok

> Reported-by: Ayush Jain <ayush.jain3@amd.com>
> Signed-off-by: Sandipan Das <sandipan.das@amd.com>

Tested-by: Ayush Jain <ayush.jain3@amd.com>

Thanks & Regards,
Ayush Jain
On Wed, Jun 14, 2023 at 2:07 AM Sandipan Das <sandipan.das@amd.com> wrote:
>
> There are cases where a metric uses more events than the number of
> counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four data fabric
> counters but the "nps1_die_to_dram" metric has eight events. By default,
> the constituent events are placed in a group. Since the events cannot be
> scheduled at the same time, the metric is not computed. The all metrics
> test also fails because of this.

Thanks Sandipan. So this is exposing a bug in the AMD data fabric PMU
driver. When the events are added, the driver should create a fake PMU,
check that adding the group is valid and, if not, fail. The failure is
picked up by the tool and it will remove the group.

I appreciate the need for a time machine to make such a fix work. To
work around the issue with the metrics, add:

  "MetricConstraint": "NO_GROUP_EVENTS",

to each metric in the json.

> Before announcing failure, the test can try multiple options for each
> available metric. After system-wide mode fails, retry once again with
> the "--metric-no-group" option.
>
> E.g.
>
> $ sudo perf test -v 100
>
> Before:
>
> 100: perf all metrics test :
> --- start ---
> test child forked, pid 672731
> Testing branch_misprediction_ratio
> Testing all_remote_links_outbound
> Testing nps1_die_to_dram
> Metric 'nps1_die_to_dram' not printed in:
> Error:
> Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.

This error doesn't relate to grouping, so I'm confused about having it
in the commit message, aside from the test failure.

Thanks,
Ian

<SNIP>
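For reference, "MetricConstraint" is a per-metric key in perf's pmu-events json files. A minimal sketch of an entry carrying the constraint Ian suggests (the expression matches the one shown later in this thread; the other fields are illustrative rather than copied from the real amdzen json):

  {
    "MetricName": "nps1_die_to_dram",
    "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
    "MetricGroup": "data_fabric",
    "MetricConstraint": "NO_GROUP_EVENTS"
  }

With the constraint present, perf opens the metric's constituent events ungrouped, effectively as if "--metric-no-group" had been passed.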
Hi Ian,

On 6/14/2023 10:10 PM, Ian Rogers wrote:
> Thanks Sandipan. So this is exposing a bug in the AMD data fabric PMU
> driver. When the events are added, the driver should create a fake PMU,
> check that adding the group is valid and, if not, fail. The failure is
> picked up by the tool and it will remove the group.
>
> I appreciate the need for a time machine to make such a fix work. To
> work around the issue with the metrics, add:
>
>   "MetricConstraint": "NO_GROUP_EVENTS",
>
> to each metric in the json.

Thanks for the suggestions. The amd_uncore driver is indeed missing group
validation checks during event init. Will send out a fix with the
"NO_GROUP_EVENTS" workaround.

> This error doesn't relate to grouping, so I'm confused about having it
> in the commit message, aside from the test failure.

Agreed. That's the error message from the last attempt, where the test
tries to use a longer running workload (perf bench).

- Sandipan
On Wed, Jun 14, 2023 at 05:08:21PM +0530, Ayush Jain wrote:
> On 6/14/2023 2:37 PM, Sandipan Das wrote:
> > There are cases where a metric uses more events than the number of
> > counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four data fabric
> > counters but the "nps1_die_to_dram" metric has eight events. By default,
> > the constituent events are placed in a group. Since the events cannot be
> > scheduled at the same time, the metric is not computed. The all metrics
> > test also fails because of this.

Hmm, I'm not able to reproduce the problem here, before applying this
patch:

[root@five ~]# grep -m1 "model name" /proc/cpuinfo
model name      : AMD Ryzen 9 5950X 16-Core Processor
[root@five ~]# perf test -vvv "perf all metrics test"
104: perf all metrics test :
--- start ---
test child forked, pid 1379713
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with 0
---- end ----
perf all metrics test: Ok
[root@five ~]#

[root@five ~]# perf stat -M nps1_die_to_dram -a sleep 2

 Performance counter stats for 'system wide':

                 0      dram_channel_data_controller_4   # 10885.3 MiB  nps1_die_to_dram  (49.96%)
        31,334,338      dram_channel_data_controller_1                                    (50.01%)
                 0      dram_channel_data_controller_6                                    (50.04%)
        54,679,601      dram_channel_data_controller_3                                    (50.04%)
        38,420,402      dram_channel_data_controller_0                                    (50.04%)
                 0      dram_channel_data_controller_5                                    (49.99%)
        54,012,661      dram_channel_data_controller_2                                    (49.96%)
                 0      dram_channel_data_controller_7                                    (49.96%)

       2.001465439 seconds time elapsed

[root@five ~]#

[root@five ~]# perf stat -v -M nps1_die_to_dram -a sleep 2
Using CPUID AuthenticAMD-25-21-0
metric expr dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7 for nps1_die_to_dram
found event dram_channel_data_controller_4
found event dram_channel_data_controller_1
found event dram_channel_data_controller_6
found event dram_channel_data_controller_3
found event dram_channel_data_controller_0
found event dram_channel_data_controller_5
found event dram_channel_data_controller_2
found event dram_channel_data_controller_7
Parsing metric events 'dram_channel_data_controller_4/metric-id=dram_channel_data_controller_4/,dram_channel_data_controller_1/metric-id=dram_channel_data_controller_1/,dram_channel_data_controller_6/metric-id=dram_channel_data_controller_6/,dram_channel_data_controller_3/metric-id=dram_channel_data_controller_3/,dram_channel_data_controller_0/metric-id=dram_channel_data_controller_0/,dram_channel_data_controller_5/metric-id=dram_channel_data_controller_5/,dram_channel_data_controller_2/metric-id=dram_channel_data_controller_2/,dram_channel_data_controller_7/metric-id=dram_channel_data_controller_7/'
dram_channel_data_controller_4 -> amd_df/metric-id=dram_channel_data_controller_4,dram_channel_data_controller_4/
dram_channel_data_controller_1 -> amd_df/metric-id=dram_channel_data_controller_1,dram_channel_data_controller_1/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_1'. Missing kernel support? (<no help>)
dram_channel_data_controller_6 -> amd_df/metric-id=dram_channel_data_controller_6,dram_channel_data_controller_6/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_6'. Missing kernel support? (<no help>)
dram_channel_data_controller_3 -> amd_df/metric-id=dram_channel_data_controller_3,dram_channel_data_controller_3/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_3'. Missing kernel support? (<no help>)
dram_channel_data_controller_0 -> amd_df/metric-id=dram_channel_data_controller_0,dram_channel_data_controller_0/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_0'. Missing kernel support? (<no help>)
dram_channel_data_controller_5 -> amd_df/metric-id=dram_channel_data_controller_5,dram_channel_data_controller_5/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_5'. Missing kernel support? (<no help>)
dram_channel_data_controller_2 -> amd_df/metric-id=dram_channel_data_controller_2,dram_channel_data_controller_2/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_2'. Missing kernel support? (<no help>)
dram_channel_data_controller_7 -> amd_df/metric-id=dram_channel_data_controller_7,dram_channel_data_controller_7/
Matched metric-id dram_channel_data_controller_4 to dram_channel_data_controller_4
Matched metric-id dram_channel_data_controller_1 to dram_channel_data_controller_1
Matched metric-id dram_channel_data_controller_6 to dram_channel_data_controller_6
Matched metric-id dram_channel_data_controller_3 to dram_channel_data_controller_3
Matched metric-id dram_channel_data_controller_0 to dram_channel_data_controller_0
Matched metric-id dram_channel_data_controller_5 to dram_channel_data_controller_5
Matched metric-id dram_channel_data_controller_2 to dram_channel_data_controller_2
Matched metric-id dram_channel_data_controller_7 to dram_channel_data_controller_7
Control descriptor is not initialized
dram_channel_data_controller_4: 0 2001175127 999996394
dram_channel_data_controller_1: 32346663 2001169897 1000709803
dram_channel_data_controller_6: 0 2001168377 1001193443
dram_channel_data_controller_3: 47551247 2001166947 1001198122
dram_channel_data_controller_0: 38975242 2001165217 1001182923
dram_channel_data_controller_5: 0 2001163067 1000464054
dram_channel_data_controller_2: 49934162 2001160907 999974934
dram_channel_data_controller_7: 0 2001150317 999968825

 Performance counter stats for 'system wide':

                 0      dram_channel_data_controller_4   # 10297.2 MiB  nps1_die_to_dram  (49.97%)
        32,346,663      dram_channel_data_controller_1                                    (50.01%)
                 0      dram_channel_data_controller_6                                    (50.03%)
        47,551,247      dram_channel_data_controller_3                                    (50.03%)
        38,975,242      dram_channel_data_controller_0                                    (50.03%)
                 0      dram_channel_data_controller_5                                    (49.99%)
        49,934,162      dram_channel_data_controller_2                                    (49.97%)
                 0      dram_channel_data_controller_7                                    (49.97%)

       2.001196512 seconds time elapsed

[root@five ~]#

What am I missing?

Ian, I also stumbled on this:

[root@five ~]# perf stat -M dram_channel_data_controller_4
Cannot find metric or group `dram_channel_data_controller_4'
^C
 Performance counter stats for 'system wide':

        284,908.91 msec cpu-clock                 #   32.002 CPUs utilized
         6,485,456      context-switches          #   22.763 K/sec
               719      cpu-migrations            #    2.524 /sec
            32,800      page-faults               #  115.125 /sec
   189,779,273,552      cycles                    #    0.666 GHz                      (83.33%)
     2,893,165,259      stalled-cycles-frontend   #    1.52% frontend cycles idle     (83.33%)
    24,807,157,349      stalled-cycles-backend    #   13.07% backend cycles idle      (83.33%)
    99,286,488,807      instructions              #    0.52  insn per cycle
                                                  #    0.25  stalled cycles per insn  (83.33%)
    24,120,737,678      branches                  #   84.661 M/sec                    (83.33%)
     1,907,540,278      branch-misses             #    7.91% of all branches          (83.34%)

       8.902784776 seconds time elapsed

[root@five ~]#
[root@five ~]# perf stat -e dram_channel_data_controller_4
^C
 Performance counter stats for 'system wide':

                 0      dram_channel_data_controller_4

       1.189638741 seconds time elapsed

[root@five ~]#

I.e. -M should bail out at that point (Cannot find metric or group
`dram_channel_data_controller_4'), no?

- Arnaldo
On Wed, Dec 6, 2023 at 5:08 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Wed, Jun 14, 2023 at 05:08:21PM +0530, Ayush Jain wrote:
> > On 6/14/2023 2:37 PM, Sandipan Das wrote:
> > > There are cases where a metric uses more events than the number of
> > > counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four data fabric
> > > counters but the "nps1_die_to_dram" metric has eight events. By default,
> > > the constituent events are placed in a group. Since the events cannot be
> > > scheduled at the same time, the metric is not computed. The all metrics
> > > test also fails because of this.
>
> Hmm, I'm not able to reproduce the problem here, before applying this
> patch:
>
> [root@five ~]# grep -m1 "model name" /proc/cpuinfo
> model name      : AMD Ryzen 9 5950X 16-Core Processor
> [root@five ~]# perf test -vvv "perf all metrics test"
> 104: perf all metrics test :
<SNIP>
> test child finished with 0
> ---- end ----
> perf all metrics test: Ok
> [root@five ~]#

Please don't apply the patch. The patch masks a bug in metrics/PMUs and
the proper fix was:

  8d40f74ebf21 perf vendor events amd: Fix large metrics
  https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com

<SNIP>

> What am I missing?
>
> Ian, I also stumbled on this:
>
> [root@five ~]# perf stat -M dram_channel_data_controller_4
> Cannot find metric or group `dram_channel_data_controller_4'
<SNIP>
> I.e. -M should bail out at that point (Cannot find metric or group
> `dram_channel_data_controller_4'), no?

We could. I suspect the code has always just not bailed out. I'll put
together a patch adding the bail out.

Thanks,
Ian
On Wed, Dec 06, 2023 at 08:35:23AM -0800, Ian Rogers wrote:
> On Wed, Dec 6, 2023 at 5:08 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > Hmm, I'm not able to reproduce the problem here, before applying this
> > patch:

> Please don't apply the patch. The patch masks a bug in metrics/PMUs

I didn't.

> and the proper fix was:
> 8d40f74ebf21 perf vendor events amd: Fix large metrics
> https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com

That is upstream:

⬢[acme@toolbox perf-tools-next]$ git log tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
commit 8d40f74ebf217d3b9e9b7481721e6236b857cc55
Author: Sandipan Das <sandipan.das@amd.com>
Date:   Thu Jul 6 12:04:40 2023 +0530

    perf vendor events amd: Fix large metrics

    There are cases where a metric requires more events than the number of
    available counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four
    data fabric counters but the "nps1_die_to_dram" metric has eight events.

    By default, the constituent events are placed in a group and since the
    events cannot be scheduled at the same time, the metric is not computed.
    The "all metrics" test also fails because of this.

    Use the NO_GROUP_EVENTS constraint for such metrics which anyway expect
    the user to run perf with "--metric-no-group".

    E.g.

    $ sudo perf test -v 101

    Before:

    101: perf all metrics test :
    --- start ---
    test child forked, pid 37131
    Testing branch_misprediction_ratio
    Testing all_remote_links_outbound
    Testing nps1_die_to_dram
    Metric 'nps1_die_to_dram' not printed in:
    Error:
    Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
    Testing macro_ops_dispatched
    Testing all_l2_cache_accesses
    Testing all_l2_cache_hits
    Testing all_l2_cache_misses
    Testing ic_fetch_miss_ratio
    Testing l2_cache_accesses_from_l2_hwpf
    Testing l2_cache_misses_from_l2_hwpf
    Testing op_cache_fetch_miss_ratio
    Testing l3_read_miss_latency
    Testing l1_itlb_misses
    test child finished with -1
    ---- end ----
    perf all metrics test: FAILED!

    After:

    101: perf all metrics test :
    --- start ---
    test child forked, pid 43766
    Testing branch_misprediction_ratio
    Testing all_remote_links_outbound
    Testing nps1_die_to_dram
    Testing macro_ops_dispatched
    Testing all_l2_cache_accesses
    Testing all_l2_cache_hits
    Testing all_l2_cache_misses
    Testing ic_fetch_miss_ratio
    Testing l2_cache_accesses_from_l2_hwpf
    Testing l2_cache_misses_from_l2_hwpf
    Testing op_cache_fetch_miss_ratio
    Testing l3_read_miss_latency
    Testing l1_itlb_misses
    test child finished with 0
    ---- end ----
    perf all metrics test: Ok

    Reported-by: Ayush Jain <ayush.jain3@amd.com>
    Suggested-by: Ian Rogers <irogers@google.com>
    Signed-off-by: Sandipan Das <sandipan.das@amd.com>
    Acked-by: Ian Rogers <irogers@google.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Ananth Narayan <ananth.narayan@amd.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ravi Bangoria <ravi.bangoria@amd.com>
    Cc: Santosh Shukla <santosh.shukla@amd.com>
    Link: https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

> > Ian, I also stumbled on this:
> > [root@five ~]# perf stat -M dram_channel_data_controller_4
> > Cannot find metric or group `dram_channel_data_controller_4'
<SNIP>
> > I.e. -M should bail out at that point (Cannot find metric or group
> > `dram_channel_data_controller_4'), no?

> We could. I suspect the code has always just not bailed out. I'll put
> together a patch adding the bail out.

Great, thanks,

- Arnaldo
On Wed, Dec 6, 2023 at 9:54 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Wed, Dec 06, 2023 at 08:35:23AM -0800, Ian Rogers wrote:
> > Please don't apply the patch. The patch masks a bug in metrics/PMUs
>
> I didn't.
>
> > and the proper fix was:
> > 8d40f74ebf21 perf vendor events amd: Fix large metrics
> > https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com
>
> That is upstream:
>
<SNIP>
>
> > > I.e. -M should bail out at that point (Cannot find metric or group
> > > `dram_channel_data_controller_4'), no?
>
> > We could. I suspect the code has always just not bailed out. I'll put
> > together a patch adding the bail out.
>
> Great, thanks,

Sent: https://lore.kernel.org/lkml/20231206183533.972028-1-irogers@google.com/

Thanks,
Ian
diff --git a/tools/perf/tests/shell/stat_all_metrics.sh b/tools/perf/tests/shell/stat_all_metrics.sh
index 54774525e18a..1e88ea8c5677 100755
--- a/tools/perf/tests/shell/stat_all_metrics.sh
+++ b/tools/perf/tests/shell/stat_all_metrics.sh
@@ -16,6 +16,13 @@ for m in $(perf list --raw-dump metrics); do
   then
     continue
   fi
+  # Failed again, possibly there are not enough counters so retry system wide
+  # mode but without event grouping.
+  result=$(perf stat -M "$m" --metric-no-group -a sleep 0.01 2>&1)
+  if [[ "$result" =~ ${m:0:50} ]]
+  then
+    continue
+  fi
   # Failed again, possibly the workload was too small so retry with something
   # longer.
   result=$(perf stat -M "$m" perf bench internals synthesize 2>&1)
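A side note on how the check above decides success: "${m:0:50}" is bash substring expansion, taking the first 50 characters of the metric name, and the "=~" test searches for it as a regex anywhere in perf's captured output, so a metric counts as working if its (possibly truncated) name shows up in what perf printed. A standalone sketch of the same idiom, with made-up output strings:

  m="nps1_die_to_dram"
  result=" 123,456  dram_channel_data_controller_0  #  10.2 MiB  nps1_die_to_dram"
  if [[ "$result" =~ ${m:0:50} ]]
  then
    echo "metric was printed"    # perf computed and printed the metric, so the test moves on
  fi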