Message ID | 20230216141240.3833272-2-mark.rutland@arm.com |
---|---|
State | New |
Headers | From: Mark Rutland <mark.rutland@arm.com>; To: linux-arm-kernel@lists.infradead.org; Cc: asahi@lists.linux.dev, ecurtin@redhat.com, j@jannau.net, lina@asahilina.net, linux-kernel@vger.kernel.org, mark.rutland@arm.com, peterz@infradead.org, ravi.bangoria@amd.com, will@kernel.org; Subject: [PATCH 1/2] arm_pmu: fix event CPU filtering; Date: Thu, 16 Feb 2023 14:12:38 +0000; Message-Id: <20230216141240.3833272-2-mark.rutland@arm.com>; In-Reply-To: <20230216141240.3833272-1-mark.rutland@arm.com>; References: <20230216141240.3833272-1-mark.rutland@arm.com>; List-ID: <linux-kernel.vger.kernel.org> |
Series | arm_pmu: fix fallout from context handling rewrite |
Commit Message
Mark Rutland
Feb. 16, 2023, 2:12 p.m. UTC
Janne reports that perf has been broken on Apple M1 as of commit:

  bd27568117664b8b ("perf: Rewrite core context handling")

That commit replaced the pmu::filter_match() callback with
pmu::filter(), whose return value has the opposite polarity, with true
implying events should be ignored rather than scheduled. While an
attempt was made to update the logic in armv8pmu_filter() and
armpmu_filter() accordingly, the return value remains inverted in a
couple of cases:

* If the arm_pmu does not have an arm_pmu::filter() callback,
  armpmu_filter() will always return whether the CPU is supported rather
  than whether the CPU is not supported.

  As a result, the perf core will not schedule events on supported CPUs,
  resulting in a loss of events. Additionally, the perf core will
  attempt to schedule events on unsupported CPUs, but this will be
  rejected by armpmu_add(), which may result in a loss of events from
  other PMUs on those unsupported CPUs.

* If the arm_pmu does have an arm_pmu::filter() callback, and
  armpmu_filter() is called on a CPU which is not supported by the
  arm_pmu, armpmu_filter() will return false rather than true.

  As a result, the perf core will attempt to schedule events on
  unsupported CPUs, but this will be rejected by armpmu_add(), which may
  result in a loss of events from other PMUs on those unsupported CPUs.
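
The two cases above can be reproduced in a small userspace model. The
sketch below is not kernel code: struct model_pmu, the 64-bit mask
standing in for supported_cpus, and every model_* name are assumptions
made for illustration. The model of the removed armv8pmu_filter() also
takes the cpu argument rather than calling smp_processor_id().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct arm_pmu; not the kernel type. */
struct model_pmu {
	uint64_t supported_cpus;	/* bit n set => CPU n is supported (n < 64) */
	bool (*filter)(const struct model_pmu *pmu, int cpu);	/* may be NULL */
};

static bool model_cpu_supported(const struct model_pmu *pmu, int cpu)
{
	return (pmu->supported_cpus >> cpu) & 1;
}

/*
 * Mirrors the pre-fix armpmu_filter() logic from the diff. The perf core
 * reads a true return value as "ignore events on this CPU".
 */
static bool model_filter_buggy(const struct model_pmu *pmu, int cpu)
{
	bool ret = model_cpu_supported(pmu, cpu);

	if (ret && pmu->filter)
		return pmu->filter(pmu, cpu);

	return ret;
}

/* Mirrors the removed armv8pmu_filter(): true when the CPU is NOT supported. */
static bool model_v8_filter(const struct model_pmu *pmu, int cpu)
{
	return !model_cpu_supported(pmu, cpu);
}
```

With a mask of 0x3 (CPUs 0 and 1 supported) and no callback,
model_filter_buggy() returns true for CPU 0 and false for CPU 2, which
is the inverted result described in the first case. With
model_v8_filter installed it returns false for both CPUs, so only the
unsupported CPU is handled wrongly, matching the second case.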

This means a loss of events can be seen with any arm_pmu driver, but
with the ARMv8 PMUv3 driver (which is the only arm_pmu driver with an
arm_pmu::filter() callback) the event loss will be more limited and may
go unnoticed, which is how this issue evaded testing so far.

Fix the CPU filtering by performing this consistently in
armpmu_filter(), and remove the redundant arm_pmu::filter() callback and
armv8pmu_filter() implementation.
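
As a rough illustration of the intended behaviour after the fix, the
sketch below models the corrected filter together with a caller that
skips a CPU whenever the filter returns true. This is a userspace
approximation under stated assumptions: the 64-bit mask replaces the
kernel cpumask, and model_filter_fixed() and model_schedulable_cpus()
are hypothetical names, not perf core functions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the fixed armpmu_filter(): true means "ignore
 * events on this CPU", so the membership test is negated.
 */
static bool model_filter_fixed(uint64_t supported_cpus, int cpu)
{
	return !((supported_cpus >> cpu) & 1);
}

/*
 * Hypothetical caller: returns a bitmask of the CPUs (0..nr_cpus-1,
 * nr_cpus <= 64) on which events would be scheduled.
 */
static uint64_t model_schedulable_cpus(uint64_t supported_cpus, int nr_cpus)
{
	uint64_t scheduled = 0;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (model_filter_fixed(supported_cpus, cpu))
			continue;	/* filtered out: CPU is not supported */
		scheduled |= UINT64_C(1) << cpu;
	}

	return scheduled;
}
```

In this model the set of CPUs that receive events is exactly the
supported mask, which is the property the single negated
cpumask_test_cpu() call in the patch is meant to provide.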

Commit bd2756811766 also silently removed the CHAIN event filtering from
armv8pmu_filter(), which will be addressed by a separate patch without
using the filter callback.

Fixes: bd27568117664b8b ("perf: Rewrite core context handling")
Reported-by: Janne Grunau <j@jannau.net>
Link: https://lore.kernel.org/asahi/20230215-arm_pmu_m1_regression-v1-1-f5a266577c8d@jannau.net/
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Eric Curtin <ecurtin@redhat.com>
---
arch/arm64/kernel/perf_event.c | 7 -------
drivers/perf/arm_pmu.c | 8 +-------
include/linux/perf/arm_pmu.h | 1 -
3 files changed, 1 insertion(+), 15 deletions(-)
Comments
On 2023-02-16 14:12:38 +0000, Mark Rutland wrote:
> Janne reports that perf has been broken on Apple M1 as of commit:
>
>   bd27568117664b8b ("perf: Rewrite core context handling")
>
> [...]

This works as well. I limited the patch to the minimal fix this
late in the cycle.

Tested-by: Janne Grunau <j@jannau.net>

thanks,
Janne

On Thu, Feb 16, 2023 at 03:35:19PM +0100, Janne Grunau wrote:
> On 2023-02-16 14:12:38 +0000, Mark Rutland wrote:
> > Fix the CPU filtering by performing this consistently in
> > armpmu_filter(), and remove the redundant arm_pmu::filter() callback and
> > armv8pmu_filter() implementation.
> >
> > Commit bd2756811766 also silently removed the CHAIN event filtering from
> > armv8pmu_filter(), which will be addressed by a separate patch without
> > using the filter callback.

[...]

> This works as well. I limited the patch to the minimal fix this
> late in the cycle.

I did appreciate that you'd made the effort for the minimal fix; had the issue
with CHAIN events not existed I would have acked that as-is and done the
simplification later. Given the CHAIN issue and given the simplification makes
the code "obviously correct" I think it's preferable to do both bits now.

> Tested-by: Janne Grunau <j@jannau.net>

Thanks!

Hopefully Will or Peter can pick this up shortly; I'm assuming that Will can
take this via the arm64 tree.

Mark.

On Thu, Feb 16, 2023 at 03:13:11PM +0000, Mark Rutland wrote:
> On Thu, Feb 16, 2023 at 03:35:19PM +0100, Janne Grunau wrote:
> > On 2023-02-16 14:12:38 +0000, Mark Rutland wrote:
> > > Fix the CPU filtering by performing this consistently in
> > > armpmu_filter(), and remove the redundant arm_pmu::filter() callback and
> > > armv8pmu_filter() implementation.
> > >
> > > Commit bd2756811766 also silently removed the CHAIN event filtering from
> > > armv8pmu_filter(), which will be addressed by a separate patch without
> > > using the filter callback.
>
> [...]
>
> > This works as well. I limited the patch to the minimal fix this
> > late in the cycle.
>
> I did appreciate that you'd made the effort for the minimal fix; had the issue
> with CHAIN events not existed I would have acked that as-is and done the
> simplification later. Given the CHAIN issue and given the simplification makes
> the code "obviously correct" I think it's preferable to do both bits now.
>
> > Tested-by: Janne Grunau <j@jannau.net>
>
> Thanks!
>
> Hopefully Will or Peter can pick this up shortly; I'm assuming that Will can
> take this via the arm64 tree.

I'll grab 'em.

Will

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index a5193f2146a6..3e43538f6b72 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1023,12 +1023,6 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 	return 0;
 }
 
-static bool armv8pmu_filter(struct pmu *pmu, int cpu)
-{
-	struct arm_pmu *armpmu = to_arm_pmu(pmu);
-	return !cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus);
-}
-
 static void armv8pmu_reset(void *info)
 {
 	struct arm_pmu *cpu_pmu = (struct arm_pmu *)info;
@@ -1258,7 +1252,6 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 	cpu_pmu->stop = armv8pmu_stop;
 	cpu_pmu->reset = armv8pmu_reset;
 	cpu_pmu->set_event_filter = armv8pmu_set_event_filter;
-	cpu_pmu->filter = armv8pmu_filter;
 
 	cpu_pmu->pmu.event_idx = armv8pmu_user_event_idx;
 
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 9b593f985805..40f70f83daba 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -550,13 +550,7 @@ static void armpmu_disable(struct pmu *pmu)
 static bool armpmu_filter(struct pmu *pmu, int cpu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
-	bool ret;
-
-	ret = cpumask_test_cpu(cpu, &armpmu->supported_cpus);
-	if (ret && armpmu->filter)
-		return armpmu->filter(pmu, cpu);
-
-	return ret;
+	return !cpumask_test_cpu(cpu, &armpmu->supported_cpus);
 }
 
 static ssize_t cpus_show(struct device *dev,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index ef914a600087..525b5d64e394 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -100,7 +100,6 @@ struct arm_pmu {
 	void (*stop)(struct arm_pmu *);
 	void (*reset)(void *);
 	int (*map_event)(struct perf_event *event);
-	bool (*filter)(struct pmu *pmu, int cpu);
 	int num_events;
 	bool secure_access; /* 32-bit ARM only */
 #define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40