Message ID | 20230315051444.1683170-1-anshuman.khandual@arm.com |
---|---|
Headers |
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, catalin.marinas@arm.com, mark.rutland@arm.com
Cc: Anshuman Khandual <anshuman.khandual@arm.com>, Mark Brown <broonie@kernel.org>, James Clark <james.clark@arm.com>, Rob Herring <robh@kernel.org>, Marc Zyngier <maz@kernel.org>, Suzuki Poulose <suzuki.poulose@arm.com>, Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, linux-perf-users@vger.kernel.org
Subject: [PATCH V9 00/10] arm64/perf: Enable branch stack sampling
Date: Wed, 15 Mar 2023 10:44:34 +0530
Message-Id: <20230315051444.1683170-1-anshuman.khandual@arm.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org |
Series | arm64/perf: Enable branch stack sampling |
Message
Anshuman Khandual
March 15, 2023, 5:14 a.m. UTC
This series enables perf branch stack sampling support on arm64 platform
via a new arch feature called Branch Record Buffer Extension (BRBE). All
relevant register definitions could be accessed here.

https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers

This series applies on 6.3-rc1 after applying the following patch from Mark
which allows enums in SysregFields blocks in sysreg tools.

https://lore.kernel.org/all/20230306114836.2575432-1-mark.rutland@arm.com/

Changes in V9:

- Fixed build problem with has_branch_stack() in arm64 header
- BRBINF_EL1 definition has been changed from 'Sysreg' to 'SysregFields'
- Renamed all BRBINF_EL1 call sites as BRBINFx_EL1
- Dropped static const char branch_filter_error_msg[]
- Implemented a positive list check for BRBE supported perf branch filters
- Added a comment in armv8pmu_handle_irq()
- Implemented per-cpu allocation for struct branch_record records
- Skipped looping through bank 1 if an invalid record is detected in bank 0
- Added comment in armv8pmu_branch_read() explaining prohibited region etc
- Added comment warning about erroneously marking transactions as aborted
- Replaced the first argument (perf_branch_entry) in capture_brbe_flags()
- Dropped the last argument (idx) in capture_brbe_flags()
- Dropped the brbcr argument from capture_brbe_flags()
- Used perf_sample_save_brstack() to capture branch records for perf_sample_data
- Added comment explaining rationale for setting BRBCR_EL1_FZP for user only traces
- Dropped BRBE prohibited state mechanism while in armv8pmu_branch_read()
- Implemented event task context based branch records save mechanism

Changes in V8:

https://lore.kernel.org/all/20230123125956.1350336-1-anshuman.khandual@arm.com/

- Replaced arm_pmu->features as arm_pmu->has_branch_stack, updated its helper
- Added a comment and line break before arm_pmu->private element
- Added WARN_ON_ONCE() in helpers i.e armv8pmu_branch_[read|valid|enable|disable]()
- Dropped comments in armv8pmu_enable_event() and armv8pmu_disable_event()
- Replaced open bank encoding in BRBFCR_EL1 with SYS_FIELD_PREP()
- Changed brbe_hw_attr->brbe_version from 'bool' to 'int'
- Updated pr_warn() as pr_warn_once() with values in brbe_get_perf_[type|priv]()
- Replaced all pr_warn_once() as pr_debug_once() in armv8pmu_branch_valid()
- Added a comment in branch_type_to_brbcr() for the BRBCR_EL1 privilege settings
- Modified the comment related to BRBINFx_EL1.LASTFAILED in capture_brbe_flags()
- Modified brbe_get_perf_entry_type() as brbe_set_perf_entry_type()
- Renamed brbe_valid() as brbe_record_is_complete()
- Renamed brbe_source() as brbe_record_is_source_only()
- Renamed brbe_target() as brbe_record_is_target_only()
- Inverted checks for !brbe_record_is_[target|source]_only() for info capture
- Replaced 'fetch' with 'get' in all helpers that extract field value
- Dropped 'static int brbe_current_bank' optimization in select_brbe_bank()
- Dropped select_brbe_bank_index() completely, added capture_branch_entry()
- Process captured branch entries in two separate loops one for each BRBE bank
- Moved branch_records_alloc() inside armv8pmu_probe_pmu()
- Added a forward declaration for the helper has_branch_stack()
- Added new callbacks armv8pmu_private_alloc() and armv8pmu_private_free()
- Updated armv8pmu_probe_pmu() to allocate the private structure before SMP call

Changes in V7:

https://lore.kernel.org/all/20230105031039.207972-1-anshuman.khandual@arm.com/

- Folded [PATCH 7/7] into [PATCH 3/7] which enables branch stack sampling event
- Defined BRBFCR_EL1_BRANCH_FILTERS, BRBCR_EL1_DEFAULT_CONFIG in the header
- Defined BRBFCR_EL1_DEFAULT_CONFIG in the header
- Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_FZP
- Defined BRBCR_EL1_DEFAULT_TS in the header
- Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_DEFAULT_TS
- Moved BRBCR_EL1_DEFAULT_CONFIG check inside branch_type_to_brbcr()
- Moved down BRBCR_EL1_CC, BRBCR_EL1_MPRED later in branch_type_to_brbcr()
- Also set BRBE in paused state in armv8pmu_branch_disable()
- Dropped brbe_paused(), set_brbe_paused() helpers
- Extracted error string via branch_filter_error_msg[] for armv8pmu_branch_valid()
- Replaced brbe_v1p1 with brbe_version in struct brbe_hw_attr
- Added valid_brbe_[cc, format, version]() helpers
- Split a separate brbe_attributes_probe() from armv8pmu_branch_probe()
- Capture event->attr.branch_sample_type earlier in armv8pmu_branch_valid()
- Defined enum brbe_bank_idx with possible values for BRBE bank indices
- Changed armpmu->hw_attr into armpmu->private
- Added missing space in stub definition for armv8pmu_branch_valid()
- Replaced both kmalloc() with kzalloc()
- Added BRBE_BANK_MAX_ENTRIES
- Updated comment for capture_brbe_flags()
- Updated comment for struct brbe_hw_attr
- Dropped space after type cast in couple of places
- Replaced inverse with negation for testing BRBCR_EL1_FZP in armv8pmu_branch_read()
- Captured cpuc->branches->branch_entries[idx] in a local variable
- Dropped saved_priv from armv8pmu_branch_read()
- Reorganize PERF_SAMPLE_BRANCH_NO_[CYCLES|NO_FLAGS] related configuration
- Replaced with FIELD_GET() and FIELD_PREP() wherever applicable
- Replaced BRBCR_EL1_TS_PHYSICAL with BRBCR_EL1_TS_VIRTUAL
- Moved valid_brbe_nr(), valid_brbe_cc(), valid_brbe_format(), valid_brbe_version()
  select_brbe_bank(), select_brbe_bank_index() helpers inside the C implementation
- Reorganized brbe_valid_nr() and dropped the pr_warn() message
- Changed probe sequence in brbe_attributes_probe()
- Added 'brbcr' argument into capture_brbe_flags() to ascertain correct state
- Disable BRBE before disabling the PMU event counter
- Enable PERF_SAMPLE_BRANCH_HV filters when is_kernel_in_hyp_mode()
- Guard armv8pmu_reset() & armv8pmu_sched_task() with arm_pmu_branch_stack_supported()

Changes in V6:

https://lore.kernel.org/linux-arm-kernel/20221208084402.863310-1-anshuman.khandual@arm.com/

- Restore the exception level privilege after reading the branch records
- Unpause the buffer after reading the branch records
- Decouple BRBCR_EL1_EXCEPTION/ERTN from perf event privilege level
- Reworked BRBE implementation and branch stack sampling support on arm pmu
- BRBE implementation is now part of overall ARMV8 PMU implementation
- BRBE implementation moved from drivers/perf/ to inside arch/arm64/kernel/
- CONFIG_ARM_BRBE_PMU renamed as CONFIG_ARM64_BRBE in arch/arm64/Kconfig
- File moved - drivers/perf/arm_pmu_brbe.c -> arch/arm64/kernel/brbe.c
- File moved - drivers/perf/arm_pmu_brbe.h -> arch/arm64/kernel/brbe.h
- BRBE name has been dropped from struct arm_pmu and struct hw_pmu_events
- BRBE name has been abstracted out as 'branches' in arm_pmu and hw_pmu_events
- BRBE name has been abstracted out as 'branches' in ARMV8 PMU implementation
- Added sched_task() callback into struct arm_pmu
- Added 'hw_attr' into struct arm_pmu encapsulating possible PMU HW attributes
- Dropped explicit attributes brbe_(v1p1, nr, cc, format) from struct arm_pmu
- Dropped brbfcr, brbcr, registers scratch area from struct hw_pmu_events
- Dropped brbe_users, brbe_context tracking in struct hw_pmu_events
- Added 'features' tracking into struct arm_pmu with ARM_PMU_BRANCH_STACK flag
- armpmu->hw_attr maps into 'struct brbe_hw_attr' inside BRBE implementation
- Set ARM_PMU_BRANCH_STACK in 'arm_pmu->features' after successful BRBE probe
- Added armv8pmu_branch_reset() inside armv8pmu_branch_enable()
- Dropped brbe_supported() as events will be rejected via ARM_PMU_BRANCH_STACK
- Dropped set_brbe_disabled() as well
- Reformatted armv8pmu_branch_valid() warnings while rejecting unsupported events

Changes in V5:

https://lore.kernel.org/linux-arm-kernel/20221107062514.2851047-1-anshuman.khandual@arm.com/

- Changed BRBCR_EL1.VIRTUAL from 0b1 to 0b01
- Changed BRBFCR_EL1.EnL into BRBFCR_EL1.EnI
- Changed config ARM_BRBE_PMU from 'tristate' to 'bool'

Changes in V4:

https://lore.kernel.org/all/20221017055713.451092-1-anshuman.khandual@arm.com/

- Changed ../tools/sysreg declarations as suggested
- Set PERF_SAMPLE_BRANCH_STACK in data.sample_flags
- Dropped perfmon_capable() check in armpmu_event_init()
- s/pr_warn_once/pr_info in armpmu_event_init()
- Added brbe_format element into struct pmu_hw_events
- Changed v1p1 as brbe_v1p1 in struct pmu_hw_events
- Dropped pr_info() from arm64_pmu_brbe_probe(), solved LOCKDEP warning

Changes in V3:

https://lore.kernel.org/all/20220929075857.158358-1-anshuman.khandual@arm.com/

- Moved brbe_stack from the stack and now dynamically allocated
- Return PERF_BR_PRIV_UNKNOWN instead of -1 in brbe_fetch_perf_priv()
- Moved BRBIDR0, BRBCR, BRBFCR registers and fields into tools/sysreg
- Created dummy BRBINF_EL1 field definitions in tools/sysreg
- Dropped ARMPMU_EVT_PRIV framework which cached perfmon_capable()
- Both exception and exception return branch records are now captured only
  if the event has PERF_SAMPLE_BRANCH_KERNEL which would have already been
  checked in generic perf via perf_allow_kernel()

Changes in V2:

https://lore.kernel.org/all/20220908051046.465307-1-anshuman.khandual@arm.com/

- Dropped branch sample filter helpers consolidation patch from this series
- Added new hw_perf_event.flags element ARMPMU_EVT_PRIV to cache perfmon_capable()
- Use cached perfmon_capable() while configuring BRBE branch record filters

Changes in V1:

https://lore.kernel.org/linux-arm-kernel/20220613100119.684673-1-anshuman.khandual@arm.com/

- Added CONFIG_PERF_EVENTS wrapper for all branch sample filter helpers
- Process new perf branch types via PERF_BR_EXTEND_ABI

Changes in RFC V2:

https://lore.kernel.org/linux-arm-kernel/20220412115455.293119-1-anshuman.khandual@arm.com/

- Added branch_sample_priv() while consolidating other branch sample filter helpers
- Changed all SYS_BRBXXXN_EL1 register definition encodings per Marc
- Changed the BRBE driver as per proposed BRBE related perf ABI changes (V5)
- Added documentation for struct arm_pmu changes, updated commit message
- Updated commit message for BRBE detection infrastructure patch
- PERF_SAMPLE_BRANCH_KERNEL gets checked during arm event init (outside the driver)
- Branch privilege state capture mechanism has now moved inside the driver

Changes in RFC V1:

https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (10):
  drivers: perf: arm_pmu: Add new sched_task() callback
  arm64/perf: Add BRBE registers and fields
  arm64/perf: Add branch stack support in struct arm_pmu
  arm64/perf: Add branch stack support in struct pmu_hw_events
  arm64/perf: Add branch stack support in ARMV8 PMU
  arm64/perf: Enable branch stack events via FEAT_BRBE
  arm64/perf: Add PERF_ATTACH_TASK_DATA to events with has_branch_stack()
  arm64/perf: Add struct brbe_regset helper functions
  arm64/perf: Implement branch records save on task sched out
  arm64/perf: Implement branch records save on PMU IRQ

 arch/arm64/Kconfig                  |  11 +
 arch/arm64/include/asm/perf_event.h |  46 ++
 arch/arm64/include/asm/sysreg.h     | 103 ++++
 arch/arm64/kernel/Makefile          |   1 +
 arch/arm64/kernel/brbe.c            | 758 ++++++++++++++++++++++++++++
 arch/arm64/kernel/brbe.h            | 270 ++++++++++
 arch/arm64/kernel/perf_event.c      | 106 +++-
 arch/arm64/tools/sysreg             | 159 ++++++
 drivers/perf/arm_pmu.c              |  12 +-
 include/linux/perf/arm_pmu.h        |  22 +-
 10 files changed, 1462 insertions(+), 26 deletions(-)
 create mode 100644 arch/arm64/kernel/brbe.c
 create mode 100644 arch/arm64/kernel/brbe.h
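For context on what the series exposes to user space, here is a minimal sketch of how a tool might fill a struct perf_event_attr to request branch stack sampling before calling perf_event_open(2). The helper name init_branch_stack_attr(), the choice of a CPU-cycles event, the sample period, and the user-only filter are all assumptions made for illustration; none of them come from the series itself.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: fill a perf_event_attr that asks for branch
 * stack sampling on a CPU-cycles event. Event, period and filter are
 * illustrative choices, not values taken from the series.
 */
void init_branch_stack_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100000;	/* arbitrary example period */

	/* Each sample carries the captured branch records. */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;

	/* Any branch type, but only branches taken in user space. */
	attr->branch_sample_type = PERF_SAMPLE_BRANCH_USER |
				   PERF_SAMPLE_BRANCH_ANY;

	attr->exclude_kernel = 1;
	attr->exclude_hv = 1;
	attr->disabled = 1;		/* to be enabled later by the caller */
}
```

With the perf tool, a roughly equivalent request would be `perf record -j any,u`. Whether a given filter combination is accepted on arm64 depends on the checks this series adds in armv8pmu_branch_valid().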
Comments
On Wed, Mar 15, 2023 at 10:44:34AM +0530, Anshuman Khandual wrote:
> This series enables perf branch stack sampling support on arm64 platform
> via a new arch feature called Branch Record Buffer Extension (BRBE). All
> relevant register definitions could be accessed here.
>
> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers

While looking at another feature I noticed that HFGITR_EL2 has two traps
for BRBE instructions, nBRBINJ and nBRBIALL which trap BRB INJ and BRB
IALL. Even if we don't use those right now does it make sense to
document a requirement for those traps to be disabled now in case we
need them later, and do so during EL2 setup for KVM guests? That could
always be done incrementally.

I've got a patch adding the definition of that register to sysreg which
I should be sending shortly, no need to duplicate that effort.
Hello Mark,

On 3/22/23 00:32, Mark Brown wrote:
> On Wed, Mar 15, 2023 at 10:44:34AM +0530, Anshuman Khandual wrote:
>> This series enables perf branch stack sampling support on arm64 platform
>> via a new arch feature called Branch Record Buffer Extension (BRBE). All
>> relevant register definitions could be accessed here.
>>
>> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
>
> While looking at another feature I noticed that HFGITR_EL2 has two traps
> for BRBE instructions, nBRBINJ and nBRBIALL which trap BRB INJ and BRB
> IALL. Even if we don't use those right now does it make sense to

Right, current branch stack sampling experiments have been on EL2 host itself.

> document a requirement for those traps to be disabled now in case we
> need them later, and do so during EL2 setup for KVM guests? That could
> always be done incrementally.

Unlike all other instruction trap enable fields in SYS_HFGITR_EL2, these BRBE
instructions ones are actually inverted in semantics i.e the particular fields
need to be set for these traps to be disabled in EL2.

SYS_HFGITR_EL2.nBRBIALL
SYS_HFGITR_EL2.nBRBINJ

By default entire SYS_HFGITR_EL2 is set as cleared during init and that would
prevent a guest from using BRBE.

init_kernel_el()
	init_el2()
		init_el2_state()
			__init_el2_fgt()
				........
				msr_s	SYS_HFGITR_EL2, xzr
				........

I guess something like the following (untested) needs to be done, to enable
BRBE in guests.

diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index 037724b19c5c..309708127a2a 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -161,6 +161,15 @@
 	msr_s	SYS_HFGWTR_EL2, x0
 	msr_s	SYS_HFGITR_EL2, xzr
 
+	mrs	x1, id_aa64dfr0_el1
+	ubfx	x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
+	cbz	x1, .Lskip_brbe_\@
+	mov	x0, xzr
+	orr	x0, x0, #HFGITR_EL2_nBRBIALL
+	orr	x0, x0, #HFGITR_EL2_nBRBINJ
+	msr_s	SYS_HFGITR_EL2, x0
+
+.Lskip_brbe_\@:
 	mrs	x1, id_aa64pfr0_el1		// AMU traps UNDEF without AMU
 	ubfx	x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4
 	cbz	x1, .Lskip_fgt_\@
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index b3bc03ee22bd..3b939c42f3b8 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -527,6 +527,9 @@
 #define SYS_HFGITR_EL2			sys_reg(3, 4, 1, 1, 6)
 #define SYS_HACR_EL2			sys_reg(3, 4, 1, 1, 7)
 
+#define HFGITR_EL2_nBRBIALL		(BIT(56))
+#define HFGITR_EL2_nBRBINJ		(BIT(55))
+
 #define SYS_TTBR0_EL2			sys_reg(3, 4, 2, 0, 0)
 #define SYS_TTBR1_EL2			sys_reg(3, 4, 2, 0, 1)
 #define SYS_TCR_EL2			sys_reg(3, 4, 2, 0, 2)

>
> I've got a patch adding the definition of that register to sysreg which
> I should be sending shortly, no need to duplicate that effort.

Sure, I assume you are moving the existing definition for SYS_HFGITR_EL2 along
with all its fields from ../include/asm/sysreg.h to ../tools/sysreg. Right, it
makes sense.

- Anshuman
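The inverted 'n' field semantics and the BRBE feature check in the untested patch above can also be expressed as a small host-side C sketch. This is only an illustration of the intended register value, not kernel code: the helper name hfgitr_el2_init_value() is hypothetical, the bit positions 56 and 55 come from the patch, and the BRBE field position (bits [55:52] of ID_AA64DFR0_EL1) is taken from the Arm register documentation rather than from this thread.

```c
#include <stdint.h>

/* BRBE field position per the Arm register docs (not stated in the thread). */
#define ID_AA64DFR0_EL1_BRBE_SHIFT	52
#define ID_AA64DFR0_EL1_BRBE_MASK	UINT64_C(0xf)

/* Bit positions as defined in the patch above. */
#define HFGITR_EL2_nBRBINJ	(UINT64_C(1) << 55)
#define HFGITR_EL2_nBRBIALL	(UINT64_C(1) << 56)

/*
 * Hypothetical helper: given a raw ID_AA64DFR0_EL1 value, return the
 * HFGITR_EL2 value the proposed init sequence would end up programming.
 */
uint64_t hfgitr_el2_init_value(uint64_t id_aa64dfr0)
{
	uint64_t brbe = (id_aa64dfr0 >> ID_AA64DFR0_EL1_BRBE_SHIFT) &
			ID_AA64DFR0_EL1_BRBE_MASK;

	/* No BRBE: leave the register cleared, as the existing code does. */
	if (!brbe)
		return 0;

	/* The 'n' fields are inverted: 1 means "do not trap to EL2". */
	return HFGITR_EL2_nBRBIALL | HFGITR_EL2_nBRBINJ;
}
```

For example, an ID register value with BRBE == 0b0001 yields a result with only bits 56 and 55 set, while a value with the BRBE field clear yields 0, which matches the existing `msr_s SYS_HFGITR_EL2, xzr` behaviour.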
On Thu, Mar 23, 2023 at 09:55:47AM +0530, Anshuman Khandual wrote:
> On 3/22/23 00:32, Mark Brown wrote:

> > document a requirement for those traps to be disabled now in case we
> > need them later, and do so during EL2 setup for KVM guests? That could
> > always be done incrementally.

> Unlike all other instruction trap enable fields in SYS_HFGITR_EL2, these BRBE
> instructions ones are actually inverted in semantics i.e the particular fields
> need to be set for these traps to be disabled in EL2.

Right, for backwards compatibility all newly added fields are trap by
default.

> SYS_HFGITR_EL2.nBRBIALL
> SYS_HFGITR_EL2.nBRBINJ

> By default entire SYS_HFGITR_EL2 is set as cleared during init and that would
> prevent a guest from using BRBE.

It should prevent the host as well shouldn't it?

> I guess something like the following (untested) needs to be done, to enable
> BRBE in guests.

> +	mrs	x1, id_aa64dfr0_el1
> +	ubfx	x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> +	cbz	x1, .Lskip_brbe_\@
> +	mov	x0, xzr
> +	orr	x0, x0, #HFGITR_EL2_nBRBIALL
> +	orr	x0, x0, #HFGITR_EL2_nBRBINJ
> +	msr_s	SYS_HFGITR_EL2, x0
> +
> +.Lskip_brbe_\@:

Yes, looks roughly what I'd expect.

> > I've got a patch adding the definition of that register to sysreg which
> > I should be sending shortly, no need to duplicate that effort.

> Sure, I assume you are moving the existing definition for SYS_HFGITR_EL2 along
> with all its fields from ../include/asm/sysreg.h to ../tools/sysreg. Right, it
> makes sense.

No fields at the minute but yes, like the other conversions.
On 3/23/23 18:24, Mark Brown wrote:
> On Thu, Mar 23, 2023 at 09:55:47AM +0530, Anshuman Khandual wrote:
>> On 3/22/23 00:32, Mark Brown wrote:
>
>>> document a requirement for those traps to be disabled now in case we
>>> need them later, and do so during EL2 setup for KVM guests? That could
>>> always be done incrementally.
>
>> Unlike all other instruction trap enable fields in SYS_HFGITR_EL2, these BRBE
>> instructions ones are actually inverted in semantics i.e the particular fields
>> need to be set for these traps to be disabled in EL2.
>
> Right, for backwards compatibility all newly added fields are trap by
> default.

Okay

>
>> SYS_HFGITR_EL2.nBRBIALL
>> SYS_HFGITR_EL2.nBRBINJ
>
>> By default entire SYS_HFGITR_EL2 is set as cleared during init and that would
>> prevent a guest from using BRBE.
>
> It should prevent the host as well shouldn't it?

In an EL2 host environment, BRBE is being enabled either in EL2 (kernel/hv) or
in EL0 (user space), it never gets enabled on EL1. Moreover BRBIALL/BRBINJ
instructions are always executed while being inside EL2 (kernel/hv). Hence how
could these instructions cause trap in EL2 ?

>
>> I guess something like the following (untested) needs to be done, to enable
>> BRBE in guests.
>
>> +	mrs	x1, id_aa64dfr0_el1
>> +	ubfx	x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
>> +	cbz	x1, .Lskip_brbe_\@
>> +	mov	x0, xzr
>> +	orr	x0, x0, #HFGITR_EL2_nBRBIALL
>> +	orr	x0, x0, #HFGITR_EL2_nBRBINJ
>> +	msr_s	SYS_HFGITR_EL2, x0
>> +
>> +.Lskip_brbe_\@:
>
> Yes, looks roughly what I'd expect.

I could send a standalone patch after your latest series [1], which disables
BRBINJ/BRBIALL instruction trap in EL2 to enable BRBE usage in the guest.

[1] https://lore.kernel.org/all/20230306-arm64-fgt-reg-gen-v3-2-decba93cbaab@kernel.org/T/

>
>>> I've got a patch adding the definition of that register to sysreg which
>>> I should be sending shortly, no need to duplicate that effort.
>
>> Sure, I assume you are moving the existing definition for SYS_HFGITR_EL2 along
>> with all its fields from ../include/asm/sysreg.h to ../tools/sysreg. Right, it
>> makes sense.
>
> No fields at the minute but yes, like the other conversions.

Sure.
On Fri, Mar 24, 2023 at 08:50:32AM +0530, Anshuman Khandual wrote:
> On 3/23/23 18:24, Mark Brown wrote:
> > On Thu, Mar 23, 2023 at 09:55:47AM +0530, Anshuman Khandual wrote:

> >> By default entire SYS_HFGITR_EL2 is set as cleared during init and that would
> >> prevent a guest from using BRBE.

> > It should prevent the host as well shouldn't it?

> In an EL2 host environment, BRBE is being enabled either in EL2 (kernel/hv) or
> in EL0 (user space), it never gets enabled on EL1. Moreover BRBIALL/BRBINJ
> instructions are always executed while being inside EL2 (kernel/hv). Hence how
> could these instructions cause trap in EL2 ?

Ah, I see - I didn't realise this couldn't run at EL1.

> > Yes, looks roughly what I'd expect.

> I could send a standalone patch after your latest series [1], which disables
> BRBINJ/BRBIALL instruction trap in EL2 to enable BRBE usage in the guest.

Sounds reasonable enough to me.
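The point settled in this exchange, that a cleared HFGITR_EL2 affects a guest at EL1 but not a host kernel running at EL2, can be sketched as a simplified model. This is an assumption-laden illustration rather than the architectural pseudocode: it assumes EL2 is enabled and fine-grained traps are active, it ignores every other trap control, and the function name brb_iall_traps_to_el2() is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define HFGITR_EL2_nBRBINJ	(UINT64_C(1) << 55)
#define HFGITR_EL2_nBRBIALL	(UINT64_C(1) << 56)

/*
 * Simplified model (illustrative assumption): only the HFGITR_EL2
 * control is considered, with fine-grained traps taken as active.
 * 'el' is the exception level executing BRB IALL (expected 1 or 2),
 * 'hfgitr' is the current HFGITR_EL2 value.
 */
bool brb_iall_traps_to_el2(unsigned int el, uint64_t hfgitr)
{
	/*
	 * HFGITR_EL2 governs execution at EL1. Code already running at
	 * EL2 (a VHE host kernel) is not trapped by it; other levels
	 * are outside the scope of this sketch.
	 */
	if (el != 1)
		return false;

	/* Inverted field: a clear nBRBIALL bit means "trap to EL2". */
	return !(hfgitr & HFGITR_EL2_nBRBIALL);
}
```

Under this model a guest kernel at EL1 traps while the register is zero and stops trapping once nBRBIALL is set, whereas the EL2 host case returns false regardless of the register value.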
Hi Anshuman

On Wed, Mar 15, 2023 at 10:44:34AM +0530, Anshuman Khandual wrote:
> This series enables perf branch stack sampling support on arm64 platform
> via a new arch feature called Branch Record Buffer Extension (BRBE). All
> relevant register definitions could be accessed here.
>
> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
>
> This series applies on 6.3-rc1 after applying the following patch from Mark
> which allows enums in SysregFields blocks in sysreg tools.
>
> https://lore.kernel.org/all/20230306114836.2575432-1-mark.rutland@arm.com/

As mentioned by Mark at:

https://lore.kernel.org/r/ZB2sGrsbr58ttoWI@FVFF77S0Q05N

this conflicts with supporting PMUv3 on AArch32. Please can you rebase onto
for-next/perf, which will mean moving this driver back into drivers/perf/
now?

Thanks,

Will
On 4/11/23 18:33, Will Deacon wrote:
> Hi Anshuman
>
> On Wed, Mar 15, 2023 at 10:44:34AM +0530, Anshuman Khandual wrote:
>> This series enables perf branch stack sampling support on arm64 platform
>> via a new arch feature called Branch Record Buffer Extension (BRBE). All
>> relevant register definitions could be accessed here.
>>
>> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
>>
>> This series applies on 6.3-rc1 after applying the following patch from Mark
>> which allows enums in SysregFields blocks in sysreg tools.
>>
>> https://lore.kernel.org/all/20230306114836.2575432-1-mark.rutland@arm.com/
>
> As mentioned by Mark at:
>
> https://lore.kernel.org/r/ZB2sGrsbr58ttoWI@FVFF77S0Q05N
>
> this conflicts with supporting PMUv3 on AArch32. Please can you rebase onto
> for-next/perf, which will mean moving this driver back into drivers/perf/
> now?

Hi Will,

I am back from a long vacation, will go through the earlier discussions on
this and rework the series as required.

- Anshuman