Message ID | 20221107062514.2851047-5-anshuman.khandual@arm.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1869331wru; Sun, 6 Nov 2022 22:29:01 -0800 (PST) X-Google-Smtp-Source: AA0mqf4p+CQPN3Ppp0Gzu8r0R1Bi+Cth6ttcesfW5/pDOLeweJoDbjTh23p6iKLnvC7zKdz5bWVT X-Received: by 2002:a17:907:c485:b0:7ae:566e:3eb5 with SMTP id tp5-20020a170907c48500b007ae566e3eb5mr8909494ejc.22.1667802541412; Sun, 06 Nov 2022 22:29:01 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1667802541; cv=none; d=google.com; s=arc-20160816; b=TSY/VItEiFQbAAAd8ZVG00Nse3ZyOKt7D/g9Ko+UXuicGNDIZ7eS2UlSKVPESZxlpi MjkqfS/FFCgp304kNfTvORqH/bhinf4akQryDOtw4Rwv083pZgbwMYSdPcuOn28Hl7Rr k6xEQlD3tI8NBGDk9iX75olr+4TUNOn76AUW7J0IRIciSPy8zhsHdy+Ar4PptKvc8Cym wEFXH7iBHQJGYoP9kknnff6oZU08br8uFh7NAlJrRzIlQjrmF8kCnCg4qhzKc65Ctgl/ wvMIPyuEWZr2e1kT4Az6VrnoDTIwyFEmsu2pcHmfONAg3b/4O2ryVtsv7CxXBcLP5YNF Y2QQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=jmKfVqBDx6Q1JsVfcf2bav3tkpdhoe5RzNBcN7OyJnU=; b=ZV5W+W7RhqfCEk24fWBQO94RSDbH6yNtkqaUsfmT8Swe1FYI2z0wcflv9CzE5r6ure gC1W37mB5XmSrZdWR8F+1UER+a9EYHQgYQC057lJOL7IinIf7Ne1UwX2XEFFXxx0Qe1y Z+db4/3eKGh9O/s4gjVmqjoZBPinx/xT14zS+u/7JXGNJLItTTUaqkFPEC77GzLumYzn zLloFBiw1S6VVAEZgX+pGGRhty6Z+ULhwVPVzBM/NHu89RPeSKSfJb46GqMMdjC9qohw wFDuzaCqB+I43eBiLWxJGx80q+zklQn2VZ3qCSto/EI5eaSw2T1a232DekfLBz2Zo2KO gHdQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gs10-20020a1709072d0a00b0078d930212c0si7840236ejc.347.2022.11.06.22.28.37; Sun, 06 Nov 2022 22:29:01 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229840AbiKGG0G (ORCPT <rfc822;hjfbswb@gmail.com> + 99 others); Mon, 7 Nov 2022 01:26:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40716 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231180AbiKGG0B (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 7 Nov 2022 01:26:01 -0500 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id EF32512A93; Sun, 6 Nov 2022 22:25:57 -0800 (PST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DF2191FB; Sun, 6 Nov 2022 22:26:03 -0800 (PST) Received: from a077893.blr.arm.com (unknown [10.162.42.7]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 0DB8B3F534; Sun, 6 Nov 2022 22:25:52 -0800 (PST) From: Anshuman Khandual <anshuman.khandual@arm.com> To: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, linux-arm-kernel@lists.infradead.org, peterz@infradead.org, acme@kernel.org, mark.rutland@arm.com, will@kernel.org, catalin.marinas@arm.com Cc: Anshuman Khandual <anshuman.khandual@arm.com>, Mark Brown <broonie@kernel.org>, James Clark <james.clark@arm.com>, Rob Herring <robh@kernel.org>, Marc Zyngier <maz@kernel.org>, Suzuki Poulose <suzuki.poulose@arm.com>, Ingo Molnar <mingo@redhat.com> Subject: [PATCH V5 4/7] driver/perf/arm_pmu_platform: Add support for BRBE attributes detection Date: Mon, 7 Nov 2022 11:55:11 +0530 Message-Id: <20221107062514.2851047-5-anshuman.khandual@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221107062514.2851047-1-anshuman.khandual@arm.com> References: <20221107062514.2851047-1-anshuman.khandual@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748817717544795561?= X-GMAIL-MSGID: =?utf-8?q?1748817717544795561?= |
Series |
arm64/perf: Enable branch stack sampling
|
|
Commit Message
Anshuman Khandual
Nov. 7, 2022, 6:25 a.m. UTC
This adds arm pmu infrastrure to probe BRBE implementation's attributes via
driver exported callbacks later. The actual BRBE feature detection will be
added by the driver itself.
CPU specific BRBE entries, cycle count, format support gets detected during
PMU init. This information gets saved in per-cpu struct pmu_hw_events which
later helps in operating BRBE during a perf event context.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
drivers/perf/arm_pmu_platform.c | 34 +++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)
Comments
On Mon, Nov 07, 2022 at 11:55:11AM +0530, Anshuman Khandual wrote: > This adds arm pmu infrastrure to probe BRBE implementation's attributes via > driver exported callbacks later. The actual BRBE feature detection will be > added by the driver itself. > > CPU specific BRBE entries, cycle count, format support gets detected during > PMU init. This information gets saved in per-cpu struct pmu_hw_events which > later helps in operating BRBE during a perf event context. Do we expect this to vary between CPUs handled by the same struct arm_pmu ? > Cc: Will Deacon <will@kernel.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> > --- > drivers/perf/arm_pmu_platform.c | 34 +++++++++++++++++++++++++++++++++ > 1 file changed, 34 insertions(+) > > diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c > index 933b96e243b8..acdc445081aa 100644 > --- a/drivers/perf/arm_pmu_platform.c > +++ b/drivers/perf/arm_pmu_platform.c > @@ -172,6 +172,36 @@ static int armpmu_request_irqs(struct arm_pmu *armpmu) > return err; > } > > +static void arm_brbe_probe_cpu(void *info) > +{ > + struct pmu_hw_events *hw_events; > + struct arm_pmu *armpmu = info; > + > + /* > + * Return from here, if BRBE driver has not been > + * implemented for this PMU. This helps prevent > + * kernel crash later when brbe_probe() will be > + * called on the PMU. > + */ > + if (!armpmu->brbe_probe) > + return; Since this is a field on struct arm_pmu, why doesn't armpmu_request_brbe() check this before calling smp_call_function_single(), to avoid the redundant IPI? > + > + hw_events = per_cpu_ptr(armpmu->hw_events, smp_processor_id()); > + armpmu->brbe_probe(hw_events); > +} > + > +static int armpmu_request_brbe(struct arm_pmu *armpmu) > +{ > + int cpu, err = 0; > + > + for_each_cpu(cpu, &armpmu->supported_cpus) { > + err = smp_call_function_single(cpu, arm_brbe_probe_cpu, armpmu, 1); Why does this need to be called on each CPU in the supported_cpus mask? I don't see anything here to handle late hotplug, so this looks suspicious. Either we're missing something, or it's redundant at boot time. Thanks, Mark. > + if (err) > + return err; > + } > + return err; > +} > + > static void armpmu_free_irqs(struct arm_pmu *armpmu) > { > int cpu; > @@ -229,6 +259,10 @@ int arm_pmu_device_probe(struct platform_device *pdev, > if (ret) > goto out_free_irqs; > > + ret = armpmu_request_brbe(pmu); > + if (ret) > + goto out_free_irqs; > + > ret = armpmu_register(pmu); > if (ret) { > dev_err(dev, "failed to register PMU devices!\n"); > -- > 2.25.1 >
On 11/18/22 23:31, Mark Rutland wrote: > On Mon, Nov 07, 2022 at 11:55:11AM +0530, Anshuman Khandual wrote: >> This adds arm pmu infrastrure to probe BRBE implementation's attributes via >> driver exported callbacks later. The actual BRBE feature detection will be >> added by the driver itself. >> >> CPU specific BRBE entries, cycle count, format support gets detected during >> PMU init. This information gets saved in per-cpu struct pmu_hw_events which >> later helps in operating BRBE during a perf event context. > > Do we expect this to vary between CPUs handled by the same struct arm_pmu ? BRBE registers are per CPU, and the spec does not assert about BRBE properties being the same across the system, served via same the struct arm_pmu. Hence it would be inaccurate to make that assumption, which might have just avoided all these IPI based probes during boot. > >> Cc: Will Deacon <will@kernel.org> >> Cc: Mark Rutland <mark.rutland@arm.com> >> Cc: linux-arm-kernel@lists.infradead.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >> --- >> drivers/perf/arm_pmu_platform.c | 34 +++++++++++++++++++++++++++++++++ >> 1 file changed, 34 insertions(+) >> >> diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c >> index 933b96e243b8..acdc445081aa 100644 >> --- a/drivers/perf/arm_pmu_platform.c >> +++ b/drivers/perf/arm_pmu_platform.c >> @@ -172,6 +172,36 @@ static int armpmu_request_irqs(struct arm_pmu *armpmu) >> return err; >> } >> >> +static void arm_brbe_probe_cpu(void *info) >> +{ >> + struct pmu_hw_events *hw_events; >> + struct arm_pmu *armpmu = info; >> + >> + /* >> + * Return from here, if BRBE driver has not been >> + * implemented for this PMU. This helps prevent >> + * kernel crash later when brbe_probe() will be >> + * called on the PMU. >> + */ >> + if (!armpmu->brbe_probe) >> + return; > > Since this is a field on struct arm_pmu, why doesn't armpmu_request_brbe() > check this before calling smp_call_function_single(), to avoid the redundant > IPI? Makes sense, I will move the check inside armpmu_request_brbe() with return code -ENODEV when not available. > >> + >> + hw_events = per_cpu_ptr(armpmu->hw_events, smp_processor_id()); >> + armpmu->brbe_probe(hw_events); >> +} >> + >> +static int armpmu_request_brbe(struct arm_pmu *armpmu) >> +{ >> + int cpu, err = 0; >> + >> + for_each_cpu(cpu, &armpmu->supported_cpus) { >> + err = smp_call_function_single(cpu, arm_brbe_probe_cpu, armpmu, 1); > > Why does this need to be called on each CPU in the supported_cpus mask? Is not supported_cpus derived after partitioning the IRQ in pmu_parse_percpu_irq(). The idea is to fill up BRBE buffer attributes, on all such supported cpus which could trigger PMU interrupt. Is the concern, that not all cpus in supported_cpus mask might not be online during boot, hence IPIs could not be served, hence BRBE attributed for them could not be fetched ? > > I don't see anything here to handle late hotplug, so this looks suspicious. Right, I should add cpu hotplug handling, otherwise risk loosing BRBE support on cpus which might have been offline during boot i.e when above IPI based probe happened ? > Either we're missing something, or it's redundant at boot time. Should we add cpu hotplug online-offline handlers like some other PMU drivers ? Let me know if there are some other concerns. cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME, arm_brbe_cpu_startup, arm_brbe_cpu_teardown)
On Mon, Nov 21, 2022 at 12:06:31PM +0530, Anshuman Khandual wrote: > > > On 11/18/22 23:31, Mark Rutland wrote: > > On Mon, Nov 07, 2022 at 11:55:11AM +0530, Anshuman Khandual wrote: > >> This adds arm pmu infrastrure to probe BRBE implementation's attributes via > >> driver exported callbacks later. The actual BRBE feature detection will be > >> added by the driver itself. > >> > >> CPU specific BRBE entries, cycle count, format support gets detected during > >> PMU init. This information gets saved in per-cpu struct pmu_hw_events which > >> later helps in operating BRBE during a perf event context. > > > > Do we expect this to vary between CPUs handled by the same struct arm_pmu ? > > BRBE registers are per CPU, and the spec does not assert about BRBE properties > being the same across the system, served via same the struct arm_pmu. The same is true of the PMU, and struct arm_pmu does not cover the whole system, it covers each *micro-architecture* within the system. I think BRBE should be treated the same, i.e. uniform *within* a struct arm_pmu. > Hence it would be inaccurate to make that assumption, which might have just > avoided all these IPI based probes during boot. FWIW, I would be happy to IPI all CPUs during boot to verify uniformity of CPUs within an arm_pmu; I just don't think that BRBE should be treated differently from the rest of the PMU features. [...] > >> + hw_events = per_cpu_ptr(armpmu->hw_events, smp_processor_id()); > >> + armpmu->brbe_probe(hw_events); > >> +} > >> + > >> +static int armpmu_request_brbe(struct arm_pmu *armpmu) > >> +{ > >> + int cpu, err = 0; > >> + > >> + for_each_cpu(cpu, &armpmu->supported_cpus) { > >> + err = smp_call_function_single(cpu, arm_brbe_probe_cpu, armpmu, 1); > > > > Why does this need to be called on each CPU in the supported_cpus mask? > > Is not supported_cpus derived after partitioning the IRQ in pmu_parse_percpu_irq(). > The idea is to fill up BRBE buffer attributes, on all such supported cpus which could > trigger PMU interrupt. Is the concern, that not all cpus in supported_cpus mask might > not be online during boot, hence IPIs could not be served, hence BRBE attributed for > them could not be fetched ? As above, I think this is solvable if we mandate that BRBE must be uniform *within* an arm_pmu's supported CPUs; then we only need one CPU in the supported_cpus mask to be present at boot time, as with the rest of the PMU code. We could *verify* that when onlining a CPU. > > I don't see anything here to handle late hotplug, so this looks suspicious. > > Right, I should add cpu hotplug handling, otherwise risk loosing BRBE support on cpus > which might have been offline during boot i.e when above IPI based probe happened ? > > > Either we're missing something, or it's redundant at boot time. > > Should we add cpu hotplug online-offline handlers like some other PMU drivers ? Let > me know if there are some other concerns. > > cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME, > arm_brbe_cpu_startup, > arm_brbe_cpu_teardown) We *could* add that, but that's going to require ordering against the existing hooks for probing arm_pmu. Why can't this hang off the exising hooks for arm_pmu? We're treating this as part of the PMU anyway, so I don't understand why we should probe it separately. Thanks, Mark.
On 11/21/22 17:09, Mark Rutland wrote: > On Mon, Nov 21, 2022 at 12:06:31PM +0530, Anshuman Khandual wrote: >> >> >> On 11/18/22 23:31, Mark Rutland wrote: >>> On Mon, Nov 07, 2022 at 11:55:11AM +0530, Anshuman Khandual wrote: >>>> This adds arm pmu infrastrure to probe BRBE implementation's attributes via >>>> driver exported callbacks later. The actual BRBE feature detection will be >>>> added by the driver itself. >>>> >>>> CPU specific BRBE entries, cycle count, format support gets detected during >>>> PMU init. This information gets saved in per-cpu struct pmu_hw_events which >>>> later helps in operating BRBE during a perf event context. >>> >>> Do we expect this to vary between CPUs handled by the same struct arm_pmu ? >> >> BRBE registers are per CPU, and the spec does not assert about BRBE properties >> being the same across the system, served via same the struct arm_pmu. > > The same is true of the PMU, and struct arm_pmu does not cover the whole > system, it covers each *micro-architecture* within the system. > > I think BRBE should be treated the same, i.e. uniform *within* a struct > arm_pmu. Understood, detected on one and verified on all ? > >> Hence it would be inaccurate to make that assumption, which might have just >> avoided all these IPI based probes during boot. > > FWIW, I would be happy to IPI all CPUs during boot to verify uniformity of CPUs > within an arm_pmu; I just don't think that BRBE should be treated differently > from the rest of the PMU features. Hence BRBE probing should be done inside an updated __armv8pmu_probe_pmu(). static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu) { struct armv8pmu_probe_info probe = { .pmu = cpu_pmu, .present = false, }; int ret; ret = smp_call_function_any(&cpu_pmu->supported_cpus, __armv8pmu_probe_pmu, &probe, 1); if (ret) return ret; return probe.present ? 0 : -ENODEV; } But if BRBE is assumed (and verified) to be same across the micro-architecture, then following BRBE attributes when captured should be part of 'struct arm_pmu' instead of 'struct pmu_hw_events' as is the case currently. /* Detected BRBE attributes */ bool brbe_v1p1; int brbe_cc; int brbe_nr; int brbe_format; > > [...] > >>>> + hw_events = per_cpu_ptr(armpmu->hw_events, smp_processor_id()); >>>> + armpmu->brbe_probe(hw_events); >>>> +} >>>> + >>>> +static int armpmu_request_brbe(struct arm_pmu *armpmu) >>>> +{ >>>> + int cpu, err = 0; >>>> + >>>> + for_each_cpu(cpu, &armpmu->supported_cpus) { >>>> + err = smp_call_function_single(cpu, arm_brbe_probe_cpu, armpmu, 1); >>> >>> Why does this need to be called on each CPU in the supported_cpus mask? >> >> Is not supported_cpus derived after partitioning the IRQ in pmu_parse_percpu_irq(). >> The idea is to fill up BRBE buffer attributes, on all such supported cpus which could >> trigger PMU interrupt. Is the concern, that not all cpus in supported_cpus mask might >> not be online during boot, hence IPIs could not be served, hence BRBE attributed for >> them could not be fetched ? > > As above, I think this is solvable if we mandate that BRBE must be uniform > *within* an arm_pmu's supported CPUs; then we only need one CPU in the > supported_cpus mask to be present at boot time, as with the rest of the PMU > code. > > We could *verify* that when onlining a CPU. Understood. > >>> I don't see anything here to handle late hotplug, so this looks suspicious. >> >> Right, I should add cpu hotplug handling, otherwise risk loosing BRBE support on cpus >> which might have been offline during boot i.e when above IPI based probe happened ? >> >>> Either we're missing something, or it's redundant at boot time. >> >> Should we add cpu hotplug online-offline handlers like some other PMU drivers ? Let >> me know if there are some other concerns. >> >> cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME, >> arm_brbe_cpu_startup, >> arm_brbe_cpu_teardown) > > We *could* add that, but that's going to require ordering against the existing > hooks for probing arm_pmu. Right. > > Why can't this hang off the exising hooks for arm_pmu? We're treating this as > part of the PMU anyway, so I don't understand why we should probe it > separately. Okay, will try and see what all changes are required to move the probing into generic arm_pmu probe, and capture the BRBE attributes inside struct arm_pmu.
diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c index 933b96e243b8..acdc445081aa 100644 --- a/drivers/perf/arm_pmu_platform.c +++ b/drivers/perf/arm_pmu_platform.c @@ -172,6 +172,36 @@ static int armpmu_request_irqs(struct arm_pmu *armpmu) return err; } +static void arm_brbe_probe_cpu(void *info) +{ + struct pmu_hw_events *hw_events; + struct arm_pmu *armpmu = info; + + /* + * Return from here, if BRBE driver has not been + * implemented for this PMU. This helps prevent + * kernel crash later when brbe_probe() will be + * called on the PMU. + */ + if (!armpmu->brbe_probe) + return; + + hw_events = per_cpu_ptr(armpmu->hw_events, smp_processor_id()); + armpmu->brbe_probe(hw_events); +} + +static int armpmu_request_brbe(struct arm_pmu *armpmu) +{ + int cpu, err = 0; + + for_each_cpu(cpu, &armpmu->supported_cpus) { + err = smp_call_function_single(cpu, arm_brbe_probe_cpu, armpmu, 1); + if (err) + return err; + } + return err; +} + static void armpmu_free_irqs(struct arm_pmu *armpmu) { int cpu; @@ -229,6 +259,10 @@ int arm_pmu_device_probe(struct platform_device *pdev, if (ret) goto out_free_irqs; + ret = armpmu_request_brbe(pmu); + if (ret) + goto out_free_irqs; + ret = armpmu_register(pmu); if (ret) { dev_err(dev, "failed to register PMU devices!\n");