Message ID | 20240119163924.2801678-1-ben.gainey@arm.com |
---|---|
Headers |
From: Ben Gainey <ben.gainey@arm.com>
To: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, namhyung@kernel.org, irogers@google.com, adrian.hunter@intel.com, james.clark@arm.com, Ben Gainey <ben.gainey@arm.com>
Subject: [PATCH 0/1] Support PERF_SAMPLE_READ with inherit_stat
Date: Fri, 19 Jan 2024 16:39:23 +0000
Message-ID: <20240119163924.2801678-1-ben.gainey@arm.com>
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit |
Series |
Support PERF_SAMPLE_READ with inherit_stat
|
|
Message
Ben Gainey
Jan. 19, 2024, 4:39 p.m. UTC
This change allows events to use PERF_SAMPLE_READ with inherit so long as both inherit_stat and PERF_SAMPLE_TID are set.

Currently it is not possible to use PERF_SAMPLE_READ with inherit. This restriction assumes the user is interested in collecting aggregate statistics as per `perf stat`. It prevents a user from collecting per-thread samples using counter groups from a multi-threaded or multi-process application, as with `perf record -e '{....}:S'`. Instead users must use system-wide mode, or forgo the ability to sample counter groups. System-wide mode is often problematic as it requires specific permissions (CAP_PERFMON or root access), and it may capture significant amounts of extra data from other processes running on the system.

Perf already supports the ability to collect per-thread counts with `inherit` via the `inherit_stat` flag. This patch changes `perf_event_alloc`, relaxing the restriction on combining `inherit` with `PERF_SAMPLE_READ` so that the combination is allowed so long as `inherit_stat` and `PERF_SAMPLE_TID` are enabled.

In this configuration stream ids (such as may appear in the read_format field of a PERF_RECORD_SAMPLE) are no longer globally unique; rather, the pair of (stream id, tid) uniquely identifies each event. Tools that rely on stream ids being globally unique, for example to calculate a delta between samples, would need updating to take this into account. Previously valid event configurations (system-wide, no-inherit and so on), where each stream id is the identifier, are unaffected.

This patch has been tested on aarch64 both by manual inspection of the output of `perf script -D` and through a modified version of Arm's commercial profiling tools, and the numbers appear to line up as one would expect, but some further validation across other architectures and/or edge cases would be welcome.

This patch was developed and tested on top of v6.7.
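For tool authors, the (stream id, tid) keying described above might look roughly like the following Python sketch. `DeltaTracker` and `update` are hypothetical names used only for illustration and are not part of perf. The sketch assumes the tool has already parsed each sample into a tid plus a list of (stream id, value) pairs, and that a counter not yet seen for a given thread starts from zero.

```python
from typing import Dict, Iterable, Tuple


class DeltaTracker:
    """Hypothetical helper: computes per-counter deltas between samples.

    The last seen value is keyed on (stream_id, tid) rather than on
    stream_id alone, since stream ids are shared across threads in the
    configuration described in the cover letter.
    """

    def __init__(self) -> None:
        self._last: Dict[Tuple[int, int], int] = {}

    def update(self, tid: int, values: Iterable[Tuple[int, int]]) -> Dict[int, int]:
        """Record one sample for `tid`; return {stream_id: delta}."""
        deltas: Dict[int, int] = {}
        for stream_id, value in values:
            key = (stream_id, tid)
            # Assumption: an unseen (stream_id, tid) pair starts at zero.
            previous = self._last.get(key, 0)
            deltas[stream_id] = value - previous
            self._last[key] = value
        return deltas
```

Keying on stream id alone would mix the running totals of different threads, so a delta could come out negative or inflated whenever samples from two threads interleave.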
Ben Gainey (1):
  perf: Support PERF_SAMPLE_READ with inherit_stat

 kernel/events/core.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
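The diff itself is not part of this cover letter. As a rough model of the rule the letter describes (not the actual kernel code in `perf_event_alloc`), the relaxed check could be expressed as the predicate below. The function name `read_with_inherit_allowed` is made up for illustration. The two flag values are taken from `include/uapi/linux/perf_event.h`.

```python
# Bit values as defined in include/uapi/linux/perf_event.h.
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_READ = 1 << 4


def read_with_inherit_allowed(sample_type: int, inherit: bool, inherit_stat: bool) -> bool:
    """Model of the relaxed rule described in the cover letter.

    Hypothetical helper, not kernel code: inherit combined with
    PERF_SAMPLE_READ is accepted only when inherit_stat is set and
    PERF_SAMPLE_TID is also requested.
    """
    wants_read = bool(sample_type & PERF_SAMPLE_READ)
    if not (inherit and wants_read):
        # The restricted combination is not being requested at all.
        return True
    return inherit_stat and bool(sample_type & PERF_SAMPLE_TID)
```

Before the change, the second branch would simply reject the configuration, which is why per-thread group sampling needed system-wide mode.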
Comments
Hello,

On Fri, Jan 19, 2024 at 8:39 AM Ben Gainey <ben.gainey@arm.com> wrote:
>
> This change allows events to use PERF_SAMPLE_READ with inherit so long
> as both inherit_stat and PERF_SAMPLE_TID are set.
>
> Currently it is not possible to use PERF_SAMPLE_READ with inherit. This
> restriction assumes the user is interested in collecting aggregate
> statistics as per `perf stat`. It prevents a user from collecting
> per-thread samples using counter groups from a multi-threaded or
> multi-process application, as with `perf record -e '{....}:S'`. Instead
> users must use system-wide mode, or forgo the ability to sample counter
> groups. System-wide mode is often problematic as it requires specific
> permissions (no CAP_PERFMON / root access), or may lead to capture of
> significant amounts of extra data from other processes running on the
> system.
>
> Perf already supports the ability to collect per-thread counts with
> `inherit` via the `inherit_stat` flag. This patch changes
> `perf_event_alloc` relaxing the restriction to combine `inherit` with
> `PERF_SAMPLE_READ` so that the combination will be allowed so long as
> `inherit_stat` and `PERF_SAMPLE_TID` are enabled.

I'm not sure if it's correct. Maybe I misunderstand inherit_stat, but AFAIK it's just to use prev_task's events when next_task has the compatible event context. So the event values it sees in samples would depend on the timing or scheduler behavior.

Also, the event counts and time values PERF_SAMPLE_READ sees include the child events' values, so the values of the parent event can be updated even if it's inactive. And the values will vary for the next_task depending on whether prev_task is the parent or not.

I think it would return consistent values only if it iterates all child events and sums up the values like it does for read(2). But it cannot do that in the NMI handler.

Frankly I don't understand how inherit_stat supports per-thread counts properly. Also it doesn't seem to be used by default in the perf tools.
IIUC per-thread count is supported when you don't set the inherit bit and open separate events for each thread, but I guess that's not what you want.

Anyway, I'm ok with the idea of using PERF_SAMPLE_READ to improve per-thread profiling, especially with event groups. But I think it should not use inherit_stat and it needs a way to not include child stats in the samples.

What do you think?

Thanks,
Namhyung

>
> In this configuration stream ids (such as may appear in the read_format
> field of a PERF_RECORD_SAMPLE) are no longer globally unique, rather
> the pair of (stream id, tid) uniquely identify each event. Tools that
> rely on this, for example to calculate a delta between samples, would
> need updating to take this into account. Previously valid event
> configurations (system-wide, no-inherit and so on) where each stream id
> is the identifier are unaffected.
>
> This patch has been tested on aarch64 both by manual inspection of the
> output of `perf script -D` and through a modified version of Arm's
> commercial profiling tools and the numbers appear to line up as one
> would expect, but some further validation across other architectures
> and/or edge cases would be welcome.
>
> This patch was developed and tested on top of v6.7.
>
>
> Ben Gainey (1):
>   perf: Support PERF_SAMPLE_READ with inherit_stat
>
>  kernel/events/core.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> --
> 2.43.0
>
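The read(2)-style aggregation mentioned in the reply can be pictured with a toy model. This is only an illustration of the idea of summing a parent event with its inherited children, not how the kernel represents events. The `Event` class and `read_total` function are hypothetical, and the model assumes all inherited events hang directly off one parent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """Toy stand-in for a perf event with inherited child events."""
    count: int = 0
    children: List["Event"] = field(default_factory=list)


def read_total(event: Event) -> int:
    """Aggregate value: the event's own count plus every child's count.

    Assumption: children form a flat list under a single parent.
    """
    return event.count + sum(child.count for child in event.children)


parent = Event(count=100, children=[Event(count=30), Event(count=12)])
total = read_total(parent)         # own count plus both children
own_only = parent.children[0].count  # what one thread alone contributed
```

Here `total` is the aggregate that a summing read would report, while `own_only` is the per-thread figure a sample would ideally carry. The reply's concern is that a sample taken in the NMI handler cannot walk the child list, so which of these a sample reflects depends on how the counts are stored and updated.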