Message ID | 20231212193833.415110-1-kan.liang@linux.intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp7958871vqy; Tue, 12 Dec 2023 11:39:27 -0800 (PST) X-Google-Smtp-Source: AGHT+IEx/wAV/fc7fntZp3rxtTqqGaGJwCUsQ4hyNDMz8oJfi+joddBKqV7l7fgP9VCVabKoOxAP X-Received: by 2002:a05:6a00:1487:b0:6ce:3584:b634 with SMTP id v7-20020a056a00148700b006ce3584b634mr4115140pfu.48.1702409967208; Tue, 12 Dec 2023 11:39:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702409967; cv=none; d=google.com; s=arc-20160816; b=mx/YcOfydZoY+WKlucWBKE/i+1bDY8WNrk9W1biLZwhf/uUwMCNN5AUT7zrSZSfjnC 8O7yR9QESaQnM2InQOS/DjJDqiZGq9S5LlQBbVrnDMdezExwVUAiEk8JagV9aVs+CXVr wW0lMPLrgdXxsmjKEjmQq5GpvBIn++nNqKUaAOv1QBZFyL9VRF7LKfxtUd8d7WUfqdLF BAnnrCDFuxAAmYvKS5NeJkC/QNCcgtUp+QF1XyNKpCeZyVvSJHJ0mPrSU7fu4QlXu3fa 2akubTe4kNQ+QcKMlPIzlYj6QVCvgrCYJVC1N4/2EMwvt7nuefpyQ0xRMYoH5OZWM1yk zmHw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=5PGwPSXL+WeIgYnR42jxQbu0I+7MaodcThgJqjqYX5U=; fh=V97FFFi5RmpdUikarEoLTJWoOplAC1MpM9MGOX1xMoI=; b=Wx4JPa9GUTNVYT7L1MByELn12ZvX151RkpuPIWX2PWSaaLfkVwZipQIfGW59H2s+OH Ma7HlKg6L2CJEtzFPtG0l01lcju1wURQhM0cG2jYMApF+46vNnTI4vVd/7VA9HT3EZNS rhnGy2tGXXVShXs/zZMFovBhGZkdjjruSquZWwCGByZyI+Piwn3T+NcEPaVkEriYbumr x/RNPT3FYpx+zlh/4hSIBoaKowkmWDwZOSdapXcsugi7qOTeI87oY2/qAMYGQ99ub7xo tL9+kT9oQ/3118PcuOLijkfzSfWsOuEO2bFeYO7iSbAYZEBtmKr3t21v5reSQ/tgTE2k L4rw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=BGncVD4H; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from agentk.vger.email (agentk.vger.email. [2620:137:e000::3:2]) by mx.google.com with ESMTPS id g3-20020a056a0023c300b006cedd840d8csi7859648pfc.62.2023.12.12.11.39.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Dec 2023 11:39:27 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) client-ip=2620:137:e000::3:2; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=BGncVD4H; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id A3D84804C215; Tue, 12 Dec 2023 11:39:20 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1377092AbjLLTjM (ORCPT <rfc822;dexuan.linux@gmail.com> + 99 others); Tue, 12 Dec 2023 14:39:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56926 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229975AbjLLTjL (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 12 Dec 2023 14:39:11 -0500 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61606A1; Tue, 12 Dec 2023 11:39:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1702409958; x=1733945958; h=from:to:cc:subject:date:message-id:mime-version: content-transfer-encoding; bh=OP8Kh8R/PGKsbqJgH2vyY1FjTdW43FE57dbTsxnKgHc=; b=BGncVD4Hf77lLw2iBYcWwJTEvYUouDAzBn2LmZjSKcMkVmj9mjRkacfr 98OX/xO+Fdk8FOSthevX5Pm5E3dhScEGbN+feg9U4fRQshqVAl1CMTS5u hLNx/CWVZ85Y8jjCr7aJyziWC6O0zmM+/7JlYwNJzBSL9juR2RIY2m+fx U/ryWSEFlv+6WLOMxjMYFZNwYeQC5XAwq31wPp7d4PNyjegdXI8w3TeLi uNU0C+dGnmjuqxQ3BHLnv/gp++qH8WAJKGCPYGo0CktUdj908d5LAC7Co 8ak3v+iNNqObLKRPV3LBMrLSvs2O9ynyEe6mPA9zYa6qmUSUbZUv9oegu A==; X-IronPort-AV: E=McAfee;i="6600,9927,10922"; a="8223342" X-IronPort-AV: E=Sophos;i="6.04,271,1695711600"; d="scan'208";a="8223342" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Dec 2023 11:39:18 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10922"; a="807889787" X-IronPort-AV: E=Sophos;i="6.04,271,1695711600"; d="scan'208";a="807889787" Received: from kanliang-dev.jf.intel.com ([10.165.154.102]) by orsmga001.jf.intel.com with ESMTP; 12 Dec 2023 11:39:17 -0800 From: kan.liang@linux.intel.com To: acme@kernel.org, irogers@google.com Cc: namhyung@kernel.org, mark.rutland@arm.com, maz@kernel.org, marcan@marcan.st, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Kan Liang <kan.liang@linux.intel.com> Subject: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus Date: Tue, 12 Dec 2023 11:38:33 -0800 Message-Id: <20231212193833.415110-1-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.35.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Tue, 12 Dec 2023 11:39:20 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1785106234015225929 X-GMAIL-MSGID: 1785106234015225929 |
Series |
[V2] perf top: Use evsel's cpus to replace user_requested_cpus
|
|
Commit Message
Liang, Kan
Dec. 12, 2023, 7:38 p.m. UTC
From: Kan Liang <kan.liang@linux.intel.com> perf top errors out on a hybrid machine $perf top Error: The cycles:P event is not supported. The perf top expects that the "cycles" is collected on all CPUs in the system. But for hybrid there is no single "cycles" event which can cover all CPUs. Perf has to split it into two cycles events, e.g., cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. If a event is opened on the unsupported CPU. The open fails. That's the reason of the above error out. Perf should only open the cycles event on the corresponding CPU. The commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting core PMU maps") intersect the requested CPU map with the CPU map of the PMU. Use the evsel's cpus to replace user_requested_cpus. The evlist's threads are also propagated to the evsel's threads in __perf_evlist__propagate_maps(). For a system-wide event, perf appends a dummy event and assign it to the evsel's threads. For a per-thread event, the evlist's thread_map is assigned to the evsel's threads. The same as the other tools, e.g., perf record, using the evsel's threads when opening an event. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> --- Changes since V1: - Update the description - Add Reviewed-by from Ian tools/perf/builtin-top.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Comments
On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: > > From: Kan Liang <kan.liang@linux.intel.com> > > perf top errors out on a hybrid machine > $perf top > > Error: > The cycles:P event is not supported. > > The perf top expects that the "cycles" is collected on all CPUs in the > system. But for hybrid there is no single "cycles" event which can cover > all CPUs. Perf has to split it into two cycles events, e.g., > cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. > If a event is opened on the unsupported CPU. The open fails. That's the > reason of the above error out. > > Perf should only open the cycles event on the corresponding CPU. The > commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting > core PMU maps") intersect the requested CPU map with the CPU map of the > PMU. Use the evsel's cpus to replace user_requested_cpus. > > The evlist's threads are also propagated to the evsel's threads in > __perf_evlist__propagate_maps(). For a system-wide event, perf appends > a dummy event and assign it to the evsel's threads. For a per-thread > event, the evlist's thread_map is assigned to the evsel's threads. The > same as the other tools, e.g., perf record, using the evsel's threads > when opening an event. > > Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> > Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ > Reviewed-by: Ian Rogers <irogers@google.com> > Signed-off-by: Kan Liang <kan.liang@linux.intel.com> > --- > > Changes since V1: > - Update the description > - Add Reviewed-by from Ian Thanks Kan, quick question. Does "perf top" on hybrid ask the user to select between the cycles event on cpu_atom and cpu_core? I'm wondering if there is some kind of missing "hybrid-merge" functionality like we have for perf stat. Thanks, Ian > tools/perf/builtin-top.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c > index ea8c7eca5eee..cce9350177e2 100644 > --- a/tools/perf/builtin-top.c > +++ b/tools/perf/builtin-top.c > @@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top) > > evlist__for_each_entry(evlist, counter) { > try_again: > - if (evsel__open(counter, top->evlist->core.user_requested_cpus, > - top->evlist->core.threads) < 0) { > + if (evsel__open(counter, counter->core.cpus, > + counter->core.threads) < 0) { > > /* > * Specially handle overwrite fall back. > -- > 2.35.1 >
On 2023-12-12 3:37 p.m., Ian Rogers wrote: > On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: >> >> From: Kan Liang <kan.liang@linux.intel.com> >> >> perf top errors out on a hybrid machine >> $perf top >> >> Error: >> The cycles:P event is not supported. >> >> The perf top expects that the "cycles" is collected on all CPUs in the >> system. But for hybrid there is no single "cycles" event which can cover >> all CPUs. Perf has to split it into two cycles events, e.g., >> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. >> If a event is opened on the unsupported CPU. The open fails. That's the >> reason of the above error out. >> >> Perf should only open the cycles event on the corresponding CPU. The >> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting >> core PMU maps") intersect the requested CPU map with the CPU map of the >> PMU. Use the evsel's cpus to replace user_requested_cpus. >> >> The evlist's threads are also propagated to the evsel's threads in >> __perf_evlist__propagate_maps(). For a system-wide event, perf appends >> a dummy event and assign it to the evsel's threads. For a per-thread >> event, the evlist's thread_map is assigned to the evsel's threads. The >> same as the other tools, e.g., perf record, using the evsel's threads >> when opening an event. >> >> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> >> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ >> Reviewed-by: Ian Rogers <irogers@google.com> >> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> >> --- >> >> Changes since V1: >> - Update the description >> - Add Reviewed-by from Ian > > Thanks Kan, quick question. Does "perf top" on hybrid ask the user to > select between the cycles event on cpu_atom and cpu_core? Yes, but the event doesn't include the PMU information. We probably need a follow up patch to append the PMU name. Available samples 385 cycles:P 903 cycles:P Thanks, Kan > I'm > wondering if there is some kind of missing "hybrid-merge" > functionality like we have for perf stat. > > Thanks, > Ian > >> tools/perf/builtin-top.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c >> index ea8c7eca5eee..cce9350177e2 100644 >> --- a/tools/perf/builtin-top.c >> +++ b/tools/perf/builtin-top.c >> @@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top) >> >> evlist__for_each_entry(evlist, counter) { >> try_again: >> - if (evsel__open(counter, top->evlist->core.user_requested_cpus, >> - top->evlist->core.threads) < 0) { >> + if (evsel__open(counter, counter->core.cpus, >> + counter->core.threads) < 0) { >> >> /* >> * Specially handle overwrite fall back. >> -- >> 2.35.1 >>
On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: > > > > On 2023-12-12 3:37 p.m., Ian Rogers wrote: > > On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: > >> > >> From: Kan Liang <kan.liang@linux.intel.com> > >> > >> perf top errors out on a hybrid machine > >> $perf top > >> > >> Error: > >> The cycles:P event is not supported. > >> > >> The perf top expects that the "cycles" is collected on all CPUs in the > >> system. But for hybrid there is no single "cycles" event which can cover > >> all CPUs. Perf has to split it into two cycles events, e.g., > >> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. > >> If a event is opened on the unsupported CPU. The open fails. That's the > >> reason of the above error out. > >> > >> Perf should only open the cycles event on the corresponding CPU. The > >> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting > >> core PMU maps") intersect the requested CPU map with the CPU map of the > >> PMU. Use the evsel's cpus to replace user_requested_cpus. > >> > >> The evlist's threads are also propagated to the evsel's threads in > >> __perf_evlist__propagate_maps(). For a system-wide event, perf appends > >> a dummy event and assign it to the evsel's threads. For a per-thread > >> event, the evlist's thread_map is assigned to the evsel's threads. The > >> same as the other tools, e.g., perf record, using the evsel's threads > >> when opening an event. > >> > >> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> > >> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ > >> Reviewed-by: Ian Rogers <irogers@google.com> > >> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> > >> --- > >> > >> Changes since V1: > >> - Update the description > >> - Add Reviewed-by from Ian > > > > Thanks Kan, quick question. Does "perf top" on hybrid ask the user to > > select between the cycles event on cpu_atom and cpu_core? > > Yes, but the event doesn't include the PMU information. > We probably need a follow up patch to append the PMU name. > > Available samples > 385 cycles:P > > 903 cycles:P Thanks and agreed, it isn't possible to tell which is which PMU/CPU type at the moment. I tried the patch with perf top --stdio, there wasn't a choice of event and I can't tell what counter is being displayed. When I quit I also see: ``` exiting. corrupted double-linked list Aborted (core dumped) ``` but I wasn't able to repro this on a debuggable binary/system. If my memory serves there was a patch where perf top was showing >1 event. It would be nice here to do some kind of hybrid merging rather than having to view each PMU's top separately. Thanks, Ian > Thanks, > Kan > > > I'm > > wondering if there is some kind of missing "hybrid-merge" > > functionality like we have for perf stat. > > > > Thanks, > > Ian > > > >> tools/perf/builtin-top.c | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c > >> index ea8c7eca5eee..cce9350177e2 100644 > >> --- a/tools/perf/builtin-top.c > >> +++ b/tools/perf/builtin-top.c > >> @@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top) > >> > >> evlist__for_each_entry(evlist, counter) { > >> try_again: > >> - if (evsel__open(counter, top->evlist->core.user_requested_cpus, > >> - top->evlist->core.threads) < 0) { > >> + if (evsel__open(counter, counter->core.cpus, > >> + counter->core.threads) < 0) { > >> > >> /* > >> * Specially handle overwrite fall back. > >> -- > >> 2.35.1 > >>
On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: > > On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: > > > > > > > > On 2023-12-12 3:37 p.m., Ian Rogers wrote: > > > On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: > > >> > > >> From: Kan Liang <kan.liang@linux.intel.com> > > >> > > >> perf top errors out on a hybrid machine > > >> $perf top > > >> > > >> Error: > > >> The cycles:P event is not supported. > > >> > > >> The perf top expects that the "cycles" is collected on all CPUs in the > > >> system. But for hybrid there is no single "cycles" event which can cover > > >> all CPUs. Perf has to split it into two cycles events, e.g., > > >> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. > > >> If a event is opened on the unsupported CPU. The open fails. That's the > > >> reason of the above error out. > > >> > > >> Perf should only open the cycles event on the corresponding CPU. The > > >> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting > > >> core PMU maps") intersect the requested CPU map with the CPU map of the > > >> PMU. Use the evsel's cpus to replace user_requested_cpus. > > >> > > >> The evlist's threads are also propagated to the evsel's threads in > > >> __perf_evlist__propagate_maps(). For a system-wide event, perf appends > > >> a dummy event and assign it to the evsel's threads. For a per-thread > > >> event, the evlist's thread_map is assigned to the evsel's threads. The > > >> same as the other tools, e.g., perf record, using the evsel's threads > > >> when opening an event. > > >> > > >> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> > > >> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ > > >> Reviewed-by: Ian Rogers <irogers@google.com> > > >> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> > > >> --- > > >> > > >> Changes since V1: > > >> - Update the description > > >> - Add Reviewed-by from Ian > > > > > > Thanks Kan, quick question. Does "perf top" on hybrid ask the user to > > > select between the cycles event on cpu_atom and cpu_core? > > > > Yes, but the event doesn't include the PMU information. > > We probably need a follow up patch to append the PMU name. > > > > Available samples > > 385 cycles:P > > > > 903 cycles:P > > Thanks and agreed, it isn't possible to tell which is which PMU/CPU > type at the moment. I tried the patch with perf top --stdio, there > wasn't a choice of event and I can't tell what counter is being > displayed. When I quit I also see: > ``` > exiting. > corrupted double-linked list > Aborted (core dumped) > ``` > but I wasn't able to repro this on a debuggable binary/system. > > If my memory serves there was a patch where perf top was showing >1 > event. It would be nice here to do some kind of hybrid merging rather > than having to view each PMU's top separately. Using event groups, but I noticed you removed the --group option. Maybe perf top can just use `{ ... }` notation for explicit grouping, but it might be implicit like in the hybrid case. Thanks, Namhyung
On 2023-12-12 8:06 p.m., Namhyung Kim wrote: > On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: >> >> On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: >>> >>> >>> >>> On 2023-12-12 3:37 p.m., Ian Rogers wrote: >>>> On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: >>>>> >>>>> From: Kan Liang <kan.liang@linux.intel.com> >>>>> >>>>> perf top errors out on a hybrid machine >>>>> $perf top >>>>> >>>>> Error: >>>>> The cycles:P event is not supported. >>>>> >>>>> The perf top expects that the "cycles" is collected on all CPUs in the >>>>> system. But for hybrid there is no single "cycles" event which can cover >>>>> all CPUs. Perf has to split it into two cycles events, e.g., >>>>> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. >>>>> If a event is opened on the unsupported CPU. The open fails. That's the >>>>> reason of the above error out. >>>>> >>>>> Perf should only open the cycles event on the corresponding CPU. The >>>>> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting >>>>> core PMU maps") intersect the requested CPU map with the CPU map of the >>>>> PMU. Use the evsel's cpus to replace user_requested_cpus. >>>>> >>>>> The evlist's threads are also propagated to the evsel's threads in >>>>> __perf_evlist__propagate_maps(). For a system-wide event, perf appends >>>>> a dummy event and assign it to the evsel's threads. For a per-thread >>>>> event, the evlist's thread_map is assigned to the evsel's threads. The >>>>> same as the other tools, e.g., perf record, using the evsel's threads >>>>> when opening an event. >>>>> >>>>> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> >>>>> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ >>>>> Reviewed-by: Ian Rogers <irogers@google.com> >>>>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> >>>>> --- >>>>> >>>>> Changes since V1: >>>>> - Update the description >>>>> - Add Reviewed-by from Ian >>>> >>>> Thanks Kan, quick question. Does "perf top" on hybrid ask the user to >>>> select between the cycles event on cpu_atom and cpu_core? >>> >>> Yes, but the event doesn't include the PMU information. >>> We probably need a follow up patch to append the PMU name. >>> >>> Available samples >>> 385 cycles:P >>> >>> 903 cycles:P >> >> Thanks and agreed, it isn't possible to tell which is which PMU/CPU >> type at the moment. I tried the patch with perf top --stdio, there >> wasn't a choice of event The perf top --stdio uses a dedicated display function, see perf_top__header_snprintf() in util/top.c It only shows one event at a time. "E" is used to switch the event. >> and I can't tell what counter is being >> displayed. For the hybrid case, I think we may display both PMU name and event name. For example, Available samples 656 cycles:P cpu_atom 701 cycles:P cpu_core diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c index f4812b226818..afc7a1d54fe4 100644 --- a/tools/perf/ui/browsers/hists.c +++ b/tools/perf/ui/browsers/hists.c @@ -3433,8 +3433,10 @@ static void perf_evsel_menu__write(struct ui_browser *browser, } nr_events = convert_unit(nr_events, &unit); - printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s", nr_events, - unit, unit == ' ' ? "" : " ", ev_name); + printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s %s", nr_events, + unit, unit == ' ' ? "" : " ", ev_name, + evsel->pmu ? evsel->pmu_name : ""); + ui_browser__printf(browser, "%s", bf); nr_events = evsel->evlist->stats.nr_events[PERF_RECORD_LOST]; >> When I quit I also see: >> ``` >> exiting. >> corrupted double-linked list >> Aborted (core dumped) >> ``` >> but I wasn't able to repro this on a debuggable binary/system. I haven't see the issue yet. >> >> If my memory serves there was a patch where perf top was showing >1 >> event. It would be nice here to do some kind of hybrid merging rather >> than having to view each PMU's top separately. The current perf top doesn't merge when there are >1 event. sudo ./perf top -e "cycles,instructions" Available samples 2K cycles 2K instructions For now, I think we may just append a PMU name to distinguish the hybrid case. We may implement the merging feature which impacts both hybrid and non-hybrid cases later separately. > > Using event groups, but I noticed you removed the --group option. > Maybe perf top can just use `{ ... }` notation for explicit grouping, > but it might be implicit like in the hybrid case. > Yes, if the events are from different PMUs, the perf tool will implicitly de-group the hybrid events. "{ ... }" may not help here. Thanks, Kan
On Wed, Dec 13, 2023 at 7:45 AM Liang, Kan <kan.liang@linux.intel.com> wrote: > > > > On 2023-12-12 8:06 p.m., Namhyung Kim wrote: > > On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: > >> > >> On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: > >>> > >>> > >>> > >>> On 2023-12-12 3:37 p.m., Ian Rogers wrote: > >>>> On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: > >>>>> > >>>>> From: Kan Liang <kan.liang@linux.intel.com> > >>>>> > >>>>> perf top errors out on a hybrid machine > >>>>> $perf top > >>>>> > >>>>> Error: > >>>>> The cycles:P event is not supported. > >>>>> > >>>>> The perf top expects that the "cycles" is collected on all CPUs in the > >>>>> system. But for hybrid there is no single "cycles" event which can cover > >>>>> all CPUs. Perf has to split it into two cycles events, e.g., > >>>>> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. > >>>>> If a event is opened on the unsupported CPU. The open fails. That's the > >>>>> reason of the above error out. > >>>>> > >>>>> Perf should only open the cycles event on the corresponding CPU. The > >>>>> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting > >>>>> core PMU maps") intersect the requested CPU map with the CPU map of the > >>>>> PMU. Use the evsel's cpus to replace user_requested_cpus. > >>>>> > >>>>> The evlist's threads are also propagated to the evsel's threads in > >>>>> __perf_evlist__propagate_maps(). For a system-wide event, perf appends > >>>>> a dummy event and assign it to the evsel's threads. For a per-thread > >>>>> event, the evlist's thread_map is assigned to the evsel's threads. The > >>>>> same as the other tools, e.g., perf record, using the evsel's threads > >>>>> when opening an event. > >>>>> > >>>>> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> > >>>>> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ > >>>>> Reviewed-by: Ian Rogers <irogers@google.com> > >>>>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> > >>>>> --- > >>>>> > >>>>> Changes since V1: > >>>>> - Update the description > >>>>> - Add Reviewed-by from Ian > >>>> > >>>> Thanks Kan, quick question. Does "perf top" on hybrid ask the user to > >>>> select between the cycles event on cpu_atom and cpu_core? > >>> > >>> Yes, but the event doesn't include the PMU information. > >>> We probably need a follow up patch to append the PMU name. > >>> > >>> Available samples > >>> 385 cycles:P > >>> > >>> 903 cycles:P > >> > >> Thanks and agreed, it isn't possible to tell which is which PMU/CPU > >> type at the moment. I tried the patch with perf top --stdio, there > >> wasn't a choice of event > > The perf top --stdio uses a dedicated display function, see > perf_top__header_snprintf() in util/top.c > > It only shows one event at a time. "E" is used to switch the event. > > >> and I can't tell what counter is being > >> displayed. > > For the hybrid case, I think we may display both PMU name and event > name. For example, > > Available samples > 656 cycles:P cpu_atom > > 701 cycles:P cpu_core I think there would be more uniformity if we could do: cpu/cycles/P I'm just reminded of the stat output where sometimes you can get things like: cpu/cycles/ or sometimes: cycles [cpu] It seems the slash style approach is the most uniform given it looks like what is written on the command line. Perhaps we need some kind of helper function to do this as reformatting the modifier is a pain. > diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c > index f4812b226818..afc7a1d54fe4 100644 > --- a/tools/perf/ui/browsers/hists.c > +++ b/tools/perf/ui/browsers/hists.c > @@ -3433,8 +3433,10 @@ static void perf_evsel_menu__write(struct > ui_browser *browser, > } > > nr_events = convert_unit(nr_events, &unit); > - printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s", nr_events, > - unit, unit == ' ' ? "" : " ", ev_name); > + printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s %s", nr_events, > + unit, unit == ' ' ? "" : " ", ev_name, > + evsel->pmu ? evsel->pmu_name : ""); > + > ui_browser__printf(browser, "%s", bf); > > nr_events = evsel->evlist->stats.nr_events[PERF_RECORD_LOST]; > > > >> When I quit I also see: > >> ``` > >> exiting. > >> corrupted double-linked list > >> Aborted (core dumped) > >> ``` > >> but I wasn't able to repro this on a debuggable binary/system. > > I haven't see the issue yet. > > >> > >> If my memory serves there was a patch where perf top was showing >1 > >> event. It would be nice here to do some kind of hybrid merging rather > >> than having to view each PMU's top separately. > > The current perf top doesn't merge when there are >1 event. > sudo ./perf top -e "cycles,instructions" > > Available samples > 2K cycles > > 2K instructions > > For now, I think we may just append a PMU name to distinguish the hybrid > case. > > We may implement the merging feature which impacts both hybrid and > non-hybrid cases later separately. > > > > > Using event groups, but I noticed you removed the --group option. > > Maybe perf top can just use `{ ... }` notation for explicit grouping, > > but it might be implicit like in the hybrid case. > > > > Yes, if the events are from different PMUs, the perf tool will > implicitly de-group the hybrid events. "{ ... }" may not help here. I think grouping may have been the situation where I saw >1 event displayed but I agree with Kan, we implicitly de-group events on different PMUs so it won't help here. Thanks, Ian > Thanks, > Kan
On 2023-12-13 12:42 p.m., Ian Rogers wrote: > On Wed, Dec 13, 2023 at 7:45 AM Liang, Kan <kan.liang@linux.intel.com> wrote: >> >> >> On 2023-12-12 8:06 p.m., Namhyung Kim wrote: >>> On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: >>>> On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: >>>>> >>>>> >>>>> On 2023-12-12 3:37 p.m., Ian Rogers wrote: >>>>>> On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: >>>>>>> From: Kan Liang <kan.liang@linux.intel.com> >>>>>>> >>>>>>> perf top errors out on a hybrid machine >>>>>>> $perf top >>>>>>> >>>>>>> Error: >>>>>>> The cycles:P event is not supported. >>>>>>> >>>>>>> The perf top expects that the "cycles" is collected on all CPUs in the >>>>>>> system. But for hybrid there is no single "cycles" event which can cover >>>>>>> all CPUs. Perf has to split it into two cycles events, e.g., >>>>>>> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. >>>>>>> If a event is opened on the unsupported CPU. The open fails. That's the >>>>>>> reason of the above error out. >>>>>>> >>>>>>> Perf should only open the cycles event on the corresponding CPU. The >>>>>>> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting >>>>>>> core PMU maps") intersect the requested CPU map with the CPU map of the >>>>>>> PMU. Use the evsel's cpus to replace user_requested_cpus. >>>>>>> >>>>>>> The evlist's threads are also propagated to the evsel's threads in >>>>>>> __perf_evlist__propagate_maps(). For a system-wide event, perf appends >>>>>>> a dummy event and assign it to the evsel's threads. For a per-thread >>>>>>> event, the evlist's thread_map is assigned to the evsel's threads. The >>>>>>> same as the other tools, e.g., perf record, using the evsel's threads >>>>>>> when opening an event. >>>>>>> >>>>>>> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> >>>>>>> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ >>>>>>> Reviewed-by: Ian Rogers <irogers@google.com> >>>>>>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> >>>>>>> --- >>>>>>> >>>>>>> Changes since V1: >>>>>>> - Update the description >>>>>>> - Add Reviewed-by from Ian >>>>>> Thanks Kan, quick question. Does "perf top" on hybrid ask the user to >>>>>> select between the cycles event on cpu_atom and cpu_core? >>>>> Yes, but the event doesn't include the PMU information. >>>>> We probably need a follow up patch to append the PMU name. >>>>> >>>>> Available samples >>>>> 385 cycles:P >>>>> >>>>> 903 cycles:P >>>> Thanks and agreed, it isn't possible to tell which is which PMU/CPU >>>> type at the moment. I tried the patch with perf top --stdio, there >>>> wasn't a choice of event >> The perf top --stdio uses a dedicated display function, see >> perf_top__header_snprintf() in util/top.c >> >> It only shows one event at a time. "E" is used to switch the event. >> >>>> and I can't tell what counter is being >>>> displayed. >> For the hybrid case, I think we may display both PMU name and event >> name. For example, >> >> Available samples >> 656 cycles:P cpu_atom >> >> 701 cycles:P cpu_core > I think there would be more uniformity if we could do: > cpu/cycles/P > I'm just reminded of the stat output where sometimes you can get things like: > cpu/cycles/ > or sometimes: > cycles [cpu] > It seems the slash style approach is the most uniform given it looks > like what is written on the command line. Perhaps we need some kind of > helper function to do this as reformatting the modifier is a pain. Actually, we already have a helper in the perf record, record__uniquify_name(). I can move it to the util/record.c and let's perf top invoke it before the open as well. Then we can get this in the perf top. Available samples 148 cpu_atom/cycles:P/ 1K cpu_core/cycles:P/ It should be good enough to distinguish the events. I will send a patch to support it. Thanks, Kan
On Wed, Dec 13, 2023 at 9:42 AM Ian Rogers <irogers@google.com> wrote: > > On Wed, Dec 13, 2023 at 7:45 AM Liang, Kan <kan.liang@linux.intel.com> wrote: > > > > > > > > On 2023-12-12 8:06 p.m., Namhyung Kim wrote: > > > On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: > > >> If my memory serves there was a patch where perf top was showing >1 > > >> event. It would be nice here to do some kind of hybrid merging rather > > >> than having to view each PMU's top separately. > > > > The current perf top doesn't merge when there are >1 event. > > sudo ./perf top -e "cycles,instructions" > > > > Available samples > > 2K cycles > > > > 2K instructions > > > > For now, I think we may just append a PMU name to distinguish the hybrid > > case. > > > > We may implement the merging feature which impacts both hybrid and > > non-hybrid cases later separately. > > > > > > > > Using event groups, but I noticed you removed the --group option. > > > Maybe perf top can just use `{ ... }` notation for explicit grouping, > > > but it might be implicit like in the hybrid case. > > > > > > > Yes, if the events are from different PMUs, the perf tool will > > implicitly de-group the hybrid events. "{ ... }" may not help here. > > I think grouping may have been the situation where I saw >1 event > displayed but I agree with Kan, we implicitly de-group events on > different PMUs so it won't help here. The --group option was implemented in perf report which means it doesn't matter if events are in different PMUs. It's just to display the results together. Thanks, Namhyung
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index ea8c7eca5eee..cce9350177e2 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top) evlist__for_each_entry(evlist, counter) { try_again: - if (evsel__open(counter, top->evlist->core.user_requested_cpus, - top->evlist->core.threads) < 0) { + if (evsel__open(counter, counter->core.cpus, + counter->core.threads) < 0) { /* * Specially handle overwrite fall back.