[V11,03/10] arm64/perf: Add branch stack support in struct arm_pmu

Message ID 20230531040428.501523-4-anshuman.khandual@arm.com
State New
Series arm64/perf: Enable branch stack sampling

Commit Message

Anshuman Khandual May 31, 2023, 4:04 a.m. UTC
Update 'struct arm_pmu' in preparation for branch stack sampling support,
which is being added later. Add a 'has_branch_stack' bit-field, next to
'secure_access' which becomes a bit-field as well, to track whether a given
PMU supports the feature, and a 'private' pointer to encapsulate
implementation specific attributes of that 'struct arm_pmu'. Also add a
helper, arm_pmu_branch_stack_supported().

Remove the gate in armpmu_event_init() that blocked branch stack sampling
events unconditionally. Such events are now enabled on any 'struct arm_pmu'
that supports the feature, which armpmu_event_init() ascertains via
arm_pmu_branch_stack_supported(); all other PMUs still get -EOPNOTSUPP.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Tested-by: James Clark <james.clark@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 drivers/perf/arm_pmu.c       |  3 +--
 include/linux/perf/arm_pmu.h | 12 +++++++++++-
 2 files changed, 12 insertions(+), 3 deletions(-)
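
The gate described in the commit message can be modelled outside the kernel.
What follows is a minimal user-space sketch, not the driver code: the
mock_* names and the 'wants_branch_stack' flag are hypothetical stand-ins
(the kernel tests has_branch_stack(event) on a real 'struct perf_event'),
and only the fields touched by this patch are reproduced.

```c
/* User-space sketch only; every mock_* name here is hypothetical. */
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the 'struct arm_pmu' fields this patch adds or changes. */
struct mock_arm_pmu {
	unsigned int	secure_access	: 1, /* 32-bit ARM only */
			has_branch_stack: 1, /* 64-bit ARM only */
			reserved	: 30;
	void		*private;	/* implementation specific attributes */
};

/* Stand-in for a perf event that may request branch stack sampling. */
struct mock_event {
	bool		wants_branch_stack;
};

/* Same shape as the helper added by the patch. */
static inline bool mock_branch_stack_supported(const struct mock_arm_pmu *pmu)
{
	return pmu->has_branch_stack;
}

/*
 * Mirrors the new check in armpmu_event_init(): a branch stack event is
 * rejected only when the PMU does not advertise support for it.
 */
int mock_event_init(const struct mock_arm_pmu *pmu,
		    const struct mock_event *event)
{
	if (event->wants_branch_stack && !mock_branch_stack_supported(pmu))
		return -EOPNOTSUPP;

	return 0;
}
```

In this model an event that does not request a branch stack is always
accepted, so PMUs that never set 'has_branch_stack' behave as they did
before the patch.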
  

Comments

Mark Rutland June 5, 2023, 7:58 a.m. UTC | #1
On Wed, May 31, 2023 at 09:34:21AM +0530, Anshuman Khandual wrote:
> Update 'struct arm_pmu' in preparation for branch stack sampling support,
> which is being added later. Add a 'has_branch_stack' bit-field, next to
> 'secure_access' which becomes a bit-field as well, to track whether a given
> PMU supports the feature, and a 'private' pointer to encapsulate
> implementation specific attributes of that 'struct arm_pmu'. Also add a
> helper, arm_pmu_branch_stack_supported().
> 
> Remove the gate in armpmu_event_init() that blocked branch stack sampling
> events unconditionally. Such events are now enabled on any 'struct arm_pmu'
> that supports the feature, which armpmu_event_init() ascertains via
> arm_pmu_branch_stack_supported(); all other PMUs still get -EOPNOTSUPP.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Tested-by: James Clark <james.clark@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  drivers/perf/arm_pmu.c       |  3 +--
>  include/linux/perf/arm_pmu.h | 12 +++++++++++-
>  2 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index aada47e3b126..d4a4f2bd89a5 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -510,8 +510,7 @@ static int armpmu_event_init(struct perf_event *event)
>  		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
>  		return -ENOENT;
>  
> -	/* does not support taken branch sampling */
> -	if (has_branch_stack(event))
> +	if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
>  		return -EOPNOTSUPP;
>  
>  	return __hw_perf_event_init(event);
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index f7fbd162ca4c..0da745eaf426 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -102,7 +102,9 @@ struct arm_pmu {
>  	int		(*map_event)(struct perf_event *event);
>  	void		(*sched_task)(struct perf_event_pmu_context *pmu_ctx, bool sched_in);
>  	int		num_events;
> -	bool		secure_access; /* 32-bit ARM only */
> +	unsigned int	secure_access	: 1, /* 32-bit ARM only */
> +			has_branch_stack: 1, /* 64-bit ARM only */
> +			reserved	: 30;
>  #define ARMV8_PMUV3_MAX_COMMON_EVENTS		0x40
>  	DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
>  #define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE	0x4000
> @@ -118,8 +120,16 @@ struct arm_pmu {
>  
>  	/* Only to be used by ACPI probing code */
>  	unsigned long acpi_cpuid;
> +
> +	/* Implementation specific attributes */
> +	void		*private;
>  };
>  
> +static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
> +{
> +	return armpmu->has_branch_stack;
> +}

Since this is a trivial test, and we already access the 'secure_access' field
directly, I'd prefer we removed this helper and directly accessed
arm_pmu::has_branch_stack, e.g. with the logic in armpmu_event_init() being:

	if (has_branch_stack(event) && !armpmu->has_branch_stack)
		return -EOPNOTSUPP;

With that:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.
  
Anshuman Khandual June 6, 2023, 4:47 a.m. UTC | #2
On 6/5/23 13:28, Mark Rutland wrote:
> On Wed, May 31, 2023 at 09:34:21AM +0530, Anshuman Khandual wrote:
>> +static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
>> +{
>> +	return armpmu->has_branch_stack;
>> +}
> 
> Since this is a trivial test, and we already access the 'secure_access' field
> directly, I'd prefer we removed this helper and directly accessed
> arm_pmu::has_branch_stack, e.g. with the logic in armpmu_event_init() being:
> 
> 	if (has_branch_stack(event) && !armpmu->has_branch_stack)
> 		return -EOPNOTSUPP;

Sure, will drop the helper and change as suggested in all the call sites.

> 
> With that:
> 
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> 
> Mark.
  

Patch

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index aada47e3b126..d4a4f2bd89a5 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -510,8 +510,7 @@  static int armpmu_event_init(struct perf_event *event)
 		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
 		return -ENOENT;
 
-	/* does not support taken branch sampling */
-	if (has_branch_stack(event))
+	if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
 		return -EOPNOTSUPP;
 
 	return __hw_perf_event_init(event);
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index f7fbd162ca4c..0da745eaf426 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -102,7 +102,9 @@  struct arm_pmu {
 	int		(*map_event)(struct perf_event *event);
 	void		(*sched_task)(struct perf_event_pmu_context *pmu_ctx, bool sched_in);
 	int		num_events;
-	bool		secure_access; /* 32-bit ARM only */
+	unsigned int	secure_access	: 1, /* 32-bit ARM only */
+			has_branch_stack: 1, /* 64-bit ARM only */
+			reserved	: 30;
 #define ARMV8_PMUV3_MAX_COMMON_EVENTS		0x40
 	DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
 #define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE	0x4000
@@ -118,8 +120,16 @@  struct arm_pmu {
 
 	/* Only to be used by ACPI probing code */
 	unsigned long acpi_cpuid;
+
+	/* Implementation specific attributes */
+	void		*private;
 };
 
+static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
+{
+	return armpmu->has_branch_stack;
+}
+
 #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
 
 u64 armpmu_event_update(struct perf_event *event);
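
One detail of the 'bool' to bit-field conversion in the hunk above may be
worth keeping in mind: in C, storing into a 'bool' turns any non-zero value
into 1, while storing into a 1-bit unsigned bit-field keeps only the lowest
bit. The sketch below illustrates that language rule in user space. The
struct and function names are hypothetical, and nothing in this thread
suggests that any writer of 'secure_access' or 'has_branch_stack' stores a
value other than 0 or 1.

```c
/* User-space illustration of C assignment rules; all names are hypothetical. */
#include <stdbool.h>

struct flags_as_bool {
	bool		secure_access;
};

struct flags_as_bitfield {
	unsigned int	secure_access	: 1,
			has_branch_stack: 1,
			reserved	: 30;
};

/* Store into a bool member: any non-zero value reads back as 1. */
unsigned int store_in_bool(unsigned int value)
{
	struct flags_as_bool f = { 0 };

	f.secure_access = value;
	return f.secure_access;
}

/* Store into a 1-bit unsigned bit-field: only bit 0 of 'value' survives. */
unsigned int store_in_bitfield(unsigned int value)
{
	struct flags_as_bitfield f = { 0 };

	f.secure_access = value;
	return f.secure_access;
}

/* Normalising with !! first restores the bool-like behaviour. */
unsigned int store_normalised(unsigned int value)
{
	struct flags_as_bitfield f = { 0 };

	f.secure_access = !!value;
	return f.secure_access;
}
```

Writers that assign the result of a function returning 'bool' are not
affected, since the value is already 0 or 1 before the store.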