[V13,-,RESEND,02/10] arm64/perf: Add BRBE registers and fields

Message ID 20230711082455.215983-3-anshuman.khandual@arm.com
State New
Headers
Series arm64/perf: Enable branch stack sampling |

Commit Message

Anshuman Khandual July 11, 2023, 8:24 a.m. UTC
  This adds BRBE related register definitions and various other related field
macros there in. These will be used subsequently in a BRBE driver which is
being added later on.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Tested-by: James Clark <james.clark@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
 arch/arm64/tools/sysreg         | 158 ++++++++++++++++++++++++++++++++
 2 files changed, 261 insertions(+)
  

Comments

Will Deacon July 28, 2023, 4:20 p.m. UTC | #1
On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote:
> This adds BRBE related register definitions and various other related field
> macros there in. These will be used subsequently in a BRBE driver which is
> being added later on.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Tested-by: James Clark <james.clark@arm.com>
> Reviewed-by: Mark Brown <broonie@kernel.org>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
>  arch/arm64/tools/sysreg         | 158 ++++++++++++++++++++++++++++++++
>  2 files changed, 261 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index b481935e9314..f95e30c13c8b 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -163,6 +163,109 @@
>  #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
>  #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
>  
> +#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
> +#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
> +#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))

It's that time on a Friday but... aren't these macros busted? I think you
need brackets before adding the offset, otherwise wouldn't, for example,
target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would
then start accessing source register 0?

I'm surprised that the compiler doesn't warn about this, but even more
surprised that you managed to test this.

Please tell me I'm wrong!

Will
  
James Clark July 28, 2023, 4:52 p.m. UTC | #2
On 28/07/2023 17:20, Will Deacon wrote:
> On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote:
>> This adds BRBE related register definitions and various other related field
>> macros there in. These will be used subsequently in a BRBE driver which is
>> being added later on.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Marc Zyngier <maz@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Tested-by: James Clark <james.clark@arm.com>
>> Reviewed-by: Mark Brown <broonie@kernel.org>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
>>  arch/arm64/tools/sysreg         | 158 ++++++++++++++++++++++++++++++++
>>  2 files changed, 261 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>> index b481935e9314..f95e30c13c8b 100644
>> --- a/arch/arm64/include/asm/sysreg.h
>> +++ b/arch/arm64/include/asm/sysreg.h
>> @@ -163,6 +163,109 @@
>>  #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
>>  #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
>>  
>> +#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
>> +#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
>> +#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
> 
> It's that time on a Friday but... aren't these macros busted? I think you
> need brackets before adding the offset, otherwise wouldn't, for example,
> target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would
> then start accessing source register 0?
> 
> I'm surprised that the compiler doesn't warn about this, but even more
> surprised that you managed to test this.
> 
> Please tell me I'm wrong!
> 
> Will

No I think you are right, it is wrong. Luckily there is already an
extraneous bracket so you you can fix it by moving one a place down:

  sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))

It's interesting because the test [1] is doing quite a bit and looking
at the branch info, and that src and targets match up to function names.
I also manually looked at the branch buffers and didn't see anything
obviously wrong like things that looked like branch infos in the source
or target fields. Will have to take another look to see if it would be
possible for the test to catch this.

James

[1]:
https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32
  
Anshuman Khandual July 31, 2023, 2:33 a.m. UTC | #3
On 7/28/23 22:22, James Clark wrote:
> 
> 
> On 28/07/2023 17:20, Will Deacon wrote:
>> On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote:
>>> This adds BRBE related register definitions and various other related field
>>> macros there in. These will be used subsequently in a BRBE driver which is
>>> being added later on.
>>>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> Cc: Will Deacon <will@kernel.org>
>>> Cc: Marc Zyngier <maz@kernel.org>
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Cc: linux-arm-kernel@lists.infradead.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Tested-by: James Clark <james.clark@arm.com>
>>> Reviewed-by: Mark Brown <broonie@kernel.org>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>>  arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
>>>  arch/arm64/tools/sysreg         | 158 ++++++++++++++++++++++++++++++++
>>>  2 files changed, 261 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>>> index b481935e9314..f95e30c13c8b 100644
>>> --- a/arch/arm64/include/asm/sysreg.h
>>> +++ b/arch/arm64/include/asm/sysreg.h
>>> @@ -163,6 +163,109 @@
>>>  #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
>>>  #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
>>>  
>>> +#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
>>> +#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
>>> +#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
>>
>> It's that time on a Friday but... aren't these macros busted? I think you
>> need brackets before adding the offset, otherwise wouldn't, for example,
>> target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would
>> then start accessing source register 0?
>>
>> I'm surprised that the compiler doesn't warn about this, but even more
>> surprised that you managed to test this.
>>
>> Please tell me I'm wrong!
>>
>> Will
> 
> No I think you are right, it is wrong. Luckily there is already an
> extraneous bracket so you you can fix it by moving one a place down:
> 
>   sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))
> 
> It's interesting because the test [1] is doing quite a bit and looking
> at the branch info, and that src and targets match up to function names.
> I also manually looked at the branch buffers and didn't see anything
> obviously wrong like things that looked like branch infos in the source
> or target fields. Will have to take another look to see if it would be
> possible for the test to catch this.
> 
> James
> 
> [1]:
> https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32

((((n) & 0x10)) >> 2 + 2) ---> ((((n) & 0x10) >> 2) + 2)

The additional brackets are useful in explicitly telling the compiler but
what it the compiler is just doing the right thing implicitly i.e computing
the shifting operation before doing the offset addition. During testing, all
those captured branch records looked alright. But that is no excuse, for not
doing the right thing to begin with i.e adding explicit brackets. I will fix
these in next version.
  
James Clark July 31, 2023, 8:07 a.m. UTC | #4
On 31/07/2023 03:33, Anshuman Khandual wrote:
> 
> 
> On 7/28/23 22:22, James Clark wrote:
>>
>>
>> On 28/07/2023 17:20, Will Deacon wrote:
>>> On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote:
>>>> This adds BRBE related register definitions and various other related field
>>>> macros there in. These will be used subsequently in a BRBE driver which is
>>>> being added later on.
>>>>
>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>> Cc: Will Deacon <will@kernel.org>
>>>> Cc: Marc Zyngier <maz@kernel.org>
>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>> Cc: linux-arm-kernel@lists.infradead.org
>>>> Cc: linux-kernel@vger.kernel.org
>>>> Tested-by: James Clark <james.clark@arm.com>
>>>> Reviewed-by: Mark Brown <broonie@kernel.org>
>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>> ---
>>>>  arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
>>>>  arch/arm64/tools/sysreg         | 158 ++++++++++++++++++++++++++++++++
>>>>  2 files changed, 261 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>>>> index b481935e9314..f95e30c13c8b 100644
>>>> --- a/arch/arm64/include/asm/sysreg.h
>>>> +++ b/arch/arm64/include/asm/sysreg.h
>>>> @@ -163,6 +163,109 @@
>>>>  #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
>>>>  #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
>>>>  
>>>> +#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
>>>> +#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
>>>> +#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
>>>
>>> It's that time on a Friday but... aren't these macros busted? I think you
>>> need brackets before adding the offset, otherwise wouldn't, for example,
>>> target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would
>>> then start accessing source register 0?
>>>
>>> I'm surprised that the compiler doesn't warn about this, but even more
>>> surprised that you managed to test this.
>>>
>>> Please tell me I'm wrong!
>>>
>>> Will
>>
>> No I think you are right, it is wrong. Luckily there is already an
>> extraneous bracket so you you can fix it by moving one a place down:
>>
>>   sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))
>>
>> It's interesting because the test [1] is doing quite a bit and looking
>> at the branch info, and that src and targets match up to function names.
>> I also manually looked at the branch buffers and didn't see anything
>> obviously wrong like things that looked like branch infos in the source
>> or target fields. Will have to take another look to see if it would be
>> possible for the test to catch this.
>>
>> James
>>
>> [1]:
>> https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32
> 
> ((((n) & 0x10)) >> 2 + 2) ---> ((((n) & 0x10) >> 2) + 2)
> 
> The additional brackets are useful in explicitly telling the compiler but
> what it the compiler is just doing the right thing implicitly i.e computing
> the shifting operation before doing the offset addition. During testing, all
> those captured branch records looked alright. But that is no excuse, for not
> doing the right thing to begin with i.e adding explicit brackets. I will fix
> these in next version.

Are you sure? If you see the return value here, it's 0 until register
16, then it becomes 1:

  https://godbolt.org/z/c7zhbno3n

If you add the bracket it does actually change the return value.
  
Mark Rutland July 31, 2023, 9:06 a.m. UTC | #5
On Mon, Jul 31, 2023 at 08:03:21AM +0530, Anshuman Khandual wrote:
> 
> 
> On 7/28/23 22:22, James Clark wrote:
> > 
> > 
> > On 28/07/2023 17:20, Will Deacon wrote:
> >> On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote:
> >>> This adds BRBE related register definitions and various other related field
> >>> macros there in. These will be used subsequently in a BRBE driver which is
> >>> being added later on.
> >>>
> >>> Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>> Cc: Will Deacon <will@kernel.org>
> >>> Cc: Marc Zyngier <maz@kernel.org>
> >>> Cc: Mark Rutland <mark.rutland@arm.com>
> >>> Cc: linux-arm-kernel@lists.infradead.org
> >>> Cc: linux-kernel@vger.kernel.org
> >>> Tested-by: James Clark <james.clark@arm.com>
> >>> Reviewed-by: Mark Brown <broonie@kernel.org>
> >>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> >>> ---
> >>>  arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
> >>>  arch/arm64/tools/sysreg         | 158 ++++++++++++++++++++++++++++++++
> >>>  2 files changed, 261 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> >>> index b481935e9314..f95e30c13c8b 100644
> >>> --- a/arch/arm64/include/asm/sysreg.h
> >>> +++ b/arch/arm64/include/asm/sysreg.h
> >>> @@ -163,6 +163,109 @@
> >>>  #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
> >>>  #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
> >>>  
> >>> +#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
> >>> +#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
> >>> +#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
> >>
> >> It's that time on a Friday but... aren't these macros busted? I think you
> >> need brackets before adding the offset, otherwise wouldn't, for example,
> >> target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would
> >> then start accessing source register 0?
> >>
> >> I'm surprised that the compiler doesn't warn about this, but even more
> >> surprised that you managed to test this.
> >>
> >> Please tell me I'm wrong!
> >>
> >> Will
> > 
> > No I think you are right, it is wrong. Luckily there is already an
> > extraneous bracket so you you can fix it by moving one a place down:
> > 
> >   sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))
> > 
> > It's interesting because the test [1] is doing quite a bit and looking
> > at the branch info, and that src and targets match up to function names.
> > I also manually looked at the branch buffers and didn't see anything
> > obviously wrong like things that looked like branch infos in the source
> > or target fields. Will have to take another look to see if it would be
> > possible for the test to catch this.
> > 
> > James
> > 
> > [1]:
> > https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32
> 
> ((((n) & 0x10)) >> 2 + 2) ---> ((((n) & 0x10) >> 2) + 2)
> 
> The additional brackets are useful in explicitly telling the compiler but
> what it the compiler is just doing the right thing implicitly i.e computing
> the shifting operation before doing the offset addition.

No; that is not correct. In c, '+' has higher precedence than '>>'.

For 'a >> b + c' the compiler *must* treat that as 'a >> (b + c)', and not as
'(a >> b) + c'

That's trivial to test:

| [mark@gravadlaks:~]% cat shiftadd.c 
| #include <stdio.h>
| 
| unsigned long logshiftadd(unsigned long a,
|                           unsigned long b,
|                           unsigned long c)
| {
|         printf("%ld >> %ld + %ld is %ld\n",
|                a, b, c, a >> b + c);
| }
| 
| int main(int argc, char *argv)
| {
|         logshiftadd(0, 0, 0);
|         logshiftadd(0, 0, 1);
|         logshiftadd(0, 0, 2);
|         printf("\n");
|         logshiftadd(1024, 0, 0);
|         logshiftadd(1024, 0, 1);
|         logshiftadd(1024, 0, 2);
|         printf("\n");
|         logshiftadd(1024, 2, 0);
|         logshiftadd(1024, 2, 1);
|         logshiftadd(1024, 2, 2);
| 
|         return 0;
| }
| [mark@gravadlaks:~]% gcc shiftadd.c -o shiftadd
| [mark@gravadlaks:~]% ./shiftadd 
| 0 >> 0 + 0 is 0
| 0 >> 0 + 1 is 0
| 0 >> 0 + 2 is 0
| 
| 1024 >> 0 + 0 is 1024
| 1024 >> 0 + 1 is 512
| 1024 >> 0 + 2 is 256
| 
| 1024 >> 2 + 0 is 256
| 1024 >> 2 + 1 is 128
| 1024 >> 2 + 2 is 64

> During testing, all > those captured branch records looked alright.

I think we clearly need better testing here.

Thanks,
Mark.
  
Anshuman Khandual July 31, 2023, 12:19 p.m. UTC | #6
On 7/31/23 14:36, Mark Rutland wrote:
> On Mon, Jul 31, 2023 at 08:03:21AM +0530, Anshuman Khandual wrote:
>>
>>
>> On 7/28/23 22:22, James Clark wrote:
>>>
>>>
>>> On 28/07/2023 17:20, Will Deacon wrote:
>>>> On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote:
>>>>> This adds BRBE related register definitions and various other related field
>>>>> macros there in. These will be used subsequently in a BRBE driver which is
>>>>> being added later on.
>>>>>
>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>> Cc: Will Deacon <will@kernel.org>
>>>>> Cc: Marc Zyngier <maz@kernel.org>
>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>>> Cc: linux-arm-kernel@lists.infradead.org
>>>>> Cc: linux-kernel@vger.kernel.org
>>>>> Tested-by: James Clark <james.clark@arm.com>
>>>>> Reviewed-by: Mark Brown <broonie@kernel.org>
>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>>> ---
>>>>>  arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
>>>>>  arch/arm64/tools/sysreg         | 158 ++++++++++++++++++++++++++++++++
>>>>>  2 files changed, 261 insertions(+)
>>>>>
>>>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>>>>> index b481935e9314..f95e30c13c8b 100644
>>>>> --- a/arch/arm64/include/asm/sysreg.h
>>>>> +++ b/arch/arm64/include/asm/sysreg.h
>>>>> @@ -163,6 +163,109 @@
>>>>>  #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
>>>>>  #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
>>>>>  
>>>>> +#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
>>>>> +#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
>>>>> +#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
>>>>
>>>> It's that time on a Friday but... aren't these macros busted? I think you
>>>> need brackets before adding the offset, otherwise wouldn't, for example,
>>>> target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would
>>>> then start accessing source register 0?
>>>>
>>>> I'm surprised that the compiler doesn't warn about this, but even more
>>>> surprised that you managed to test this.
>>>>
>>>> Please tell me I'm wrong!
>>>>
>>>> Will
>>>
>>> No I think you are right, it is wrong. Luckily there is already an
>>> extraneous bracket so you you can fix it by moving one a place down:
>>>
>>>   sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))
>>>
>>> It's interesting because the test [1] is doing quite a bit and looking
>>> at the branch info, and that src and targets match up to function names.
>>> I also manually looked at the branch buffers and didn't see anything
>>> obviously wrong like things that looked like branch infos in the source
>>> or target fields. Will have to take another look to see if it would be
>>> possible for the test to catch this.
>>>
>>> James
>>>
>>> [1]:
>>> https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32
>>
>> ((((n) & 0x10)) >> 2 + 2) ---> ((((n) & 0x10) >> 2) + 2)
>>
>> The additional brackets are useful in explicitly telling the compiler but
>> what it the compiler is just doing the right thing implicitly i.e computing
>> the shifting operation before doing the offset addition.
> 
> No; that is not correct. In c, '+' has higher precedence than '>>'.
> 
> For 'a >> b + c' the compiler *must* treat that as 'a >> (b + c)', and not as
> '(a >> b) + c'
> 
> That's trivial to test:
> 
> | [mark@gravadlaks:~]% cat shiftadd.c 
> | #include <stdio.h>
> | 
> | unsigned long logshiftadd(unsigned long a,
> |                           unsigned long b,
> |                           unsigned long c)
> | {
> |         printf("%ld >> %ld + %ld is %ld\n",
> |                a, b, c, a >> b + c);
> | }
> | 
> | int main(int argc, char *argv)
> | {
> |         logshiftadd(0, 0, 0);
> |         logshiftadd(0, 0, 1);
> |         logshiftadd(0, 0, 2);
> |         printf("\n");
> |         logshiftadd(1024, 0, 0);
> |         logshiftadd(1024, 0, 1);
> |         logshiftadd(1024, 0, 2);
> |         printf("\n");
> |         logshiftadd(1024, 2, 0);
> |         logshiftadd(1024, 2, 1);
> |         logshiftadd(1024, 2, 2);
> | 
> |         return 0;
> | }
> | [mark@gravadlaks:~]% gcc shiftadd.c -o shiftadd
> | [mark@gravadlaks:~]% ./shiftadd 
> | 0 >> 0 + 0 is 0
> | 0 >> 0 + 1 is 0
> | 0 >> 0 + 2 is 0
> | 
> | 1024 >> 0 + 0 is 1024
> | 1024 >> 0 + 1 is 512
> | 1024 >> 0 + 2 is 256
> | 
> | 1024 >> 2 + 0 is 256
> | 1024 >> 2 + 1 is 128
> | 1024 >> 2 + 2 is 64

Understood.

> 
>> During testing, all > those captured branch records looked alright.
> 
> I think we clearly need better testing here
I am still thinking - how could this might have been missed. Could it be
possible that these wrongly computed higher indices were getting folded
back/rolled over into the same legal range indices for a given bank. If
they did, branch record processing would have still captured almost all
of them, may be in an incorrect order. Branch order does matter for the
stitched mechanism.
  
James Clark Aug. 15, 2023, 10:17 a.m. UTC | #7
On 31/07/2023 10:06, Mark Rutland wrote:
> On Mon, Jul 31, 2023 at 08:03:21AM +0530, Anshuman Khandual wrote:
>>
>>
>> On 7/28/23 22:22, James Clark wrote:
>>>
>>>
>>> On 28/07/2023 17:20, Will Deacon wrote:
>>>> On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote:
>>>>> This adds BRBE related register definitions and various other related field
>>>>> macros there in. These will be used subsequently in a BRBE driver which is
>>>>> being added later on.
>>>>>
>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>> Cc: Will Deacon <will@kernel.org>
>>>>> Cc: Marc Zyngier <maz@kernel.org>
>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>>> Cc: linux-arm-kernel@lists.infradead.org
>>>>> Cc: linux-kernel@vger.kernel.org
>>>>> Tested-by: James Clark <james.clark@arm.com>
>>>>> Reviewed-by: Mark Brown <broonie@kernel.org>
>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>>> ---
>>>>>  arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
>>>>>  arch/arm64/tools/sysreg         | 158 ++++++++++++++++++++++++++++++++
>>>>>  2 files changed, 261 insertions(+)
>>>>>
>>>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>>>>> index b481935e9314..f95e30c13c8b 100644
>>>>> --- a/arch/arm64/include/asm/sysreg.h
>>>>> +++ b/arch/arm64/include/asm/sysreg.h
>>>>> @@ -163,6 +163,109 @@
>>>>>  #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
>>>>>  #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
>>>>>  
>>>>> +#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
>>>>> +#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
>>>>> +#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
>>>>
>>>> It's that time on a Friday but... aren't these macros busted? I think you
>>>> need brackets before adding the offset, otherwise wouldn't, for example,
>>>> target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would
>>>> then start accessing source register 0?
>>>>
>>>> I'm surprised that the compiler doesn't warn about this, but even more
>>>> surprised that you managed to test this.
>>>>
>>>> Please tell me I'm wrong!
>>>>
>>>> Will
>>>
>>> No I think you are right, it is wrong. Luckily there is already an
>>> extraneous bracket so you you can fix it by moving one a place down:
>>>
>>>   sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))
>>>
>>> It's interesting because the test [1] is doing quite a bit and looking
>>> at the branch info, and that src and targets match up to function names.
>>> I also manually looked at the branch buffers and didn't see anything
>>> obviously wrong like things that looked like branch infos in the source
>>> or target fields. Will have to take another look to see if it would be
>>> possible for the test to catch this.
>>>
>>> James
>>>
>>> [1]:
>>> https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32
>>
>> ((((n) & 0x10)) >> 2 + 2) ---> ((((n) & 0x10) >> 2) + 2)
>>
>> The additional brackets are useful in explicitly telling the compiler but
>> what it the compiler is just doing the right thing implicitly i.e computing
>> the shifting operation before doing the offset addition.
> 
> No; that is not correct. In c, '+' has higher precedence than '>>'.
> 
> For 'a >> b + c' the compiler *must* treat that as 'a >> (b + c)', and not as
> '(a >> b) + c'
> 
> That's trivial to test:
> 
> | [mark@gravadlaks:~]% cat shiftadd.c 
> | #include <stdio.h>
> | 
> | unsigned long logshiftadd(unsigned long a,
> |                           unsigned long b,
> |                           unsigned long c)
> | {
> |         printf("%ld >> %ld + %ld is %ld\n",
> |                a, b, c, a >> b + c);
> | }
> | 
> | int main(int argc, char *argv)
> | {
> |         logshiftadd(0, 0, 0);
> |         logshiftadd(0, 0, 1);
> |         logshiftadd(0, 0, 2);
> |         printf("\n");
> |         logshiftadd(1024, 0, 0);
> |         logshiftadd(1024, 0, 1);
> |         logshiftadd(1024, 0, 2);
> |         printf("\n");
> |         logshiftadd(1024, 2, 0);
> |         logshiftadd(1024, 2, 1);
> |         logshiftadd(1024, 2, 2);
> | 
> |         return 0;
> | }
> | [mark@gravadlaks:~]% gcc shiftadd.c -o shiftadd
> | [mark@gravadlaks:~]% ./shiftadd 
> | 0 >> 0 + 0 is 0
> | 0 >> 0 + 1 is 0
> | 0 >> 0 + 2 is 0
> | 
> | 1024 >> 0 + 0 is 1024
> | 1024 >> 0 + 1 is 512
> | 1024 >> 0 + 2 is 256
> | 
> | 1024 >> 2 + 0 is 256
> | 1024 >> 2 + 1 is 128
> | 1024 >> 2 + 2 is 64
> 
>> During testing, all > those captured branch records looked alright.
> 
> I think we clearly need better testing here.
> 
> Thanks,
> Mark.

Hi Will and Mark,

So I started looking into the test both with and without the fix,
strangely I couldn't see any difference in the branch outputs, or
anywhere in the driver where it would be flipping or filtering anything
to make it only appear to be working. This was a bit confusing, but
added up with the original point that the test was passing and it was
actually doing something.

So I started going deeper and found what the issue (non-issue) is.

Firstly why is there no warning:

The expression is stringified and passed to the assembler, so it skips
the C compiler warning settings. I can send a patch to fix this, but all
we need to do is get the compiler to evaluate the argument and then
throw it away, luckily there are no other issues found even with an
allyesconfig, so BRBE was the only thing with this bug:

 #define read_sysreg_s(r) ({
 	u64 __val;
+	u32 __maybe_unused __check_r = (u32)(r);
 	asm volatile(__mrs_s("%0", r) : "=r" (__val));
 	__val;					
 })


Secondly, why does BRBE actually work:

Well the assembler (at least in my Clang toolchain) has a different
order of operations to C. I put a minimal repro here:
https://godbolt.org/z/YP9adh5xh

You can see the op2 should be a 0b100000 (0x20) for BRBSRC and it
appears to be, you can also see that moving the bracket makes no
difference in this case.

For some more evidence, the disassembler I have locally actually gives
the correct register name, even when the bracket is wrong, and diffing
the .o file gives no difference when moving the bracket:

  0000000000000008 <main>:
   8:   d503245f        bti     c
   c:   d503201f        nop
  10:   d503201f        nop
  14:   2a1f03e0        mov     w0, wzr
  18:   d5318028        mrs     x8, brbsrc0_el1
  1c:   d5318128        mrs     x8, brbsrc1_el1
  20:   d65f03c0        ret

Seems completely crazy to me that this is actually the case. So maybe I
am also wrong. Don't know if this counts as a compiler bug or it's just
supposed to be like that.

Thanks
James
  
Mark Rutland Aug. 15, 2023, 1:05 p.m. UTC | #8
On Tue, Aug 15, 2023 at 11:17:19AM +0100, James Clark wrote:
> 
> 
> On 31/07/2023 10:06, Mark Rutland wrote:
> > On Mon, Jul 31, 2023 at 08:03:21AM +0530, Anshuman Khandual wrote:
> >>
> >>
> >> On 7/28/23 22:22, James Clark wrote:
> >>>
> >>>
> >>> On 28/07/2023 17:20, Will Deacon wrote:
> >>>> On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote:
> >>>>> This adds BRBE related register definitions and various other related field
> >>>>> macros there in. These will be used subsequently in a BRBE driver which is
> >>>>> being added later on.
> >>>>>
> >>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>>>> Cc: Will Deacon <will@kernel.org>
> >>>>> Cc: Marc Zyngier <maz@kernel.org>
> >>>>> Cc: Mark Rutland <mark.rutland@arm.com>
> >>>>> Cc: linux-arm-kernel@lists.infradead.org
> >>>>> Cc: linux-kernel@vger.kernel.org
> >>>>> Tested-by: James Clark <james.clark@arm.com>
> >>>>> Reviewed-by: Mark Brown <broonie@kernel.org>
> >>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> >>>>> ---
> >>>>>  arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
> >>>>>  arch/arm64/tools/sysreg         | 158 ++++++++++++++++++++++++++++++++
> >>>>>  2 files changed, 261 insertions(+)
> >>>>>
> >>>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> >>>>> index b481935e9314..f95e30c13c8b 100644
> >>>>> --- a/arch/arm64/include/asm/sysreg.h
> >>>>> +++ b/arch/arm64/include/asm/sysreg.h
> >>>>> @@ -163,6 +163,109 @@
> >>>>>  #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
> >>>>>  #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
> >>>>>  
> >>>>> +#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
> >>>>> +#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
> >>>>> +#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
> >>>>
> >>>> It's that time on a Friday but... aren't these macros busted? I think you
> >>>> need brackets before adding the offset, otherwise wouldn't, for example,
> >>>> target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would
> >>>> then start accessing source register 0?
> >>>>
> >>>> I'm surprised that the compiler doesn't warn about this, but even more
> >>>> surprised that you managed to test this.
> >>>>
> >>>> Please tell me I'm wrong!
> >>>>
> >>>> Will
> >>>
> >>> No I think you are right, it is wrong. Luckily there is already an
> >>> extraneous bracket so you you can fix it by moving one a place down:
> >>>
> >>>   sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))
> >>>
> >>> It's interesting because the test [1] is doing quite a bit and looking
> >>> at the branch info, and that src and targets match up to function names.
> >>> I also manually looked at the branch buffers and didn't see anything
> >>> obviously wrong like things that looked like branch infos in the source
> >>> or target fields. Will have to take another look to see if it would be
> >>> possible for the test to catch this.
> >>>
> >>> James
> >>>
> >>> [1]:
> >>> https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32
> >>
> >> ((((n) & 0x10)) >> 2 + 2) ---> ((((n) & 0x10) >> 2) + 2)
> >>
> >> The additional brackets are useful in explicitly telling the compiler but
> >> what it the compiler is just doing the right thing implicitly i.e computing
> >> the shifting operation before doing the offset addition.
> > 
> > No; that is not correct. In c, '+' has higher precedence than '>>'.
> > 
> > For 'a >> b + c' the compiler *must* treat that as 'a >> (b + c)', and not as
> > '(a >> b) + c'
> > 
> > That's trivial to test:
> > 
> > | [mark@gravadlaks:~]% cat shiftadd.c 
> > | #include <stdio.h>
> > | 
> > | unsigned long logshiftadd(unsigned long a,
> > |                           unsigned long b,
> > |                           unsigned long c)
> > | {
> > |         printf("%ld >> %ld + %ld is %ld\n",
> > |                a, b, c, a >> b + c);
> > | }
> > | 
> > | int main(int argc, char *argv)
> > | {
> > |         logshiftadd(0, 0, 0);
> > |         logshiftadd(0, 0, 1);
> > |         logshiftadd(0, 0, 2);
> > |         printf("\n");
> > |         logshiftadd(1024, 0, 0);
> > |         logshiftadd(1024, 0, 1);
> > |         logshiftadd(1024, 0, 2);
> > |         printf("\n");
> > |         logshiftadd(1024, 2, 0);
> > |         logshiftadd(1024, 2, 1);
> > |         logshiftadd(1024, 2, 2);
> > | 
> > |         return 0;
> > | }
> > | [mark@gravadlaks:~]% gcc shiftadd.c -o shiftadd
> > | [mark@gravadlaks:~]% ./shiftadd 
> > | 0 >> 0 + 0 is 0
> > | 0 >> 0 + 1 is 0
> > | 0 >> 0 + 2 is 0
> > | 
> > | 1024 >> 0 + 0 is 1024
> > | 1024 >> 0 + 1 is 512
> > | 1024 >> 0 + 2 is 256
> > | 
> > | 1024 >> 2 + 0 is 256
> > | 1024 >> 2 + 1 is 128
> > | 1024 >> 2 + 2 is 64
> > 
> >> During testing, all > those captured branch records looked alright.
> > 
> > I think we clearly need better testing here.
> > 
> > Thanks,
> > Mark.
> 
> Hi Will and Mark,
> 
> So I started looking into the test both with and without the fix,
> strangely I couldn't see any difference in the branch outputs, or
> anywhere in the driver where it would be flipping or filtering anything
> to make it only appear to be working. This was a bit confusing, but
> added up with the original point that the test was passing and it was
> actually doing something.
> 
> So I started going deeper and found what the issue (non-issue) is.
> 
> Firstly why is there no warning:
> 
> The expression is stringified and passed to the assembler, so it skips
> the C compiler warning settings. I can send a patch to fix this, but all
> we need to do is get the compiler to evaluate the argument and then
> throw it away, luckily there are no other issues found even with an
> allyesconfig, so BRBE was the only thing with this bug:
> 
>  #define read_sysreg_s(r) ({
>  	u64 __val;
> +	u32 __maybe_unused __check_r = (u32)(r);
>  	asm volatile(__mrs_s("%0", r) : "=r" (__val));
>  	__val;					
>  })
> 
> 
> Secondly, why does BRBE actually work:
> 
> Well the assembler (at least in my Clang toolchain) has a different
> order of operations to C. I put a minimal repro here:
> https://godbolt.org/z/YP9adh5xh
> 
> You can see the op2 should be a 0b100000 (0x20) for BRBSRC and it
> appears to be, you can also see that moving the bracket makes no
> difference in this case.
> 
> For some more evidence, the disassembler I have locally actually gives
> the correct register name, even when the bracket is wrong, and diffing
> the .o file gives no difference when moving the bracket:
> 
>   0000000000000008 <main>:
>    8:   d503245f        bti     c
>    c:   d503201f        nop
>   10:   d503201f        nop
>   14:   2a1f03e0        mov     w0, wzr
>   18:   d5318028        mrs     x8, brbsrc0_el1
>   1c:   d5318128        mrs     x8, brbsrc1_el1
>   20:   d65f03c0        ret
> 
> Seems completely crazy to me that this is actually the case. So maybe I
> am also wrong. Don't know if this counts as a compiler bug or it's just
> supposed to be like that.

From a quick dig, it's supposed to be like that: the GNU assembler uses a
different operator precedence to C, and clang's assembler does the same for
compatibility. What a great.

Compare:

  https://ftp.gnu.org/old-gnu/Manuals/gas-2.9.1/html_chapter/as_6.html#SEC66

... with:

  https://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Operator_precedence

Adding the brackets will make this work in either case, so I think that's the
right thing to do for now.

Thanks,
Mark.
  

Patch

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index b481935e9314..f95e30c13c8b 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -163,6 +163,109 @@ 
 #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
 #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
 
+#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
+#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
+#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
+
+#define SYS_BRBINF0_EL1			__SYS_BRBINFO(0)
+#define SYS_BRBINF1_EL1			__SYS_BRBINFO(1)
+#define SYS_BRBINF2_EL1			__SYS_BRBINFO(2)
+#define SYS_BRBINF3_EL1			__SYS_BRBINFO(3)
+#define SYS_BRBINF4_EL1			__SYS_BRBINFO(4)
+#define SYS_BRBINF5_EL1			__SYS_BRBINFO(5)
+#define SYS_BRBINF6_EL1			__SYS_BRBINFO(6)
+#define SYS_BRBINF7_EL1			__SYS_BRBINFO(7)
+#define SYS_BRBINF8_EL1			__SYS_BRBINFO(8)
+#define SYS_BRBINF9_EL1			__SYS_BRBINFO(9)
+#define SYS_BRBINF10_EL1		__SYS_BRBINFO(10)
+#define SYS_BRBINF11_EL1		__SYS_BRBINFO(11)
+#define SYS_BRBINF12_EL1		__SYS_BRBINFO(12)
+#define SYS_BRBINF13_EL1		__SYS_BRBINFO(13)
+#define SYS_BRBINF14_EL1		__SYS_BRBINFO(14)
+#define SYS_BRBINF15_EL1		__SYS_BRBINFO(15)
+#define SYS_BRBINF16_EL1		__SYS_BRBINFO(16)
+#define SYS_BRBINF17_EL1		__SYS_BRBINFO(17)
+#define SYS_BRBINF18_EL1		__SYS_BRBINFO(18)
+#define SYS_BRBINF19_EL1		__SYS_BRBINFO(19)
+#define SYS_BRBINF20_EL1		__SYS_BRBINFO(20)
+#define SYS_BRBINF21_EL1		__SYS_BRBINFO(21)
+#define SYS_BRBINF22_EL1		__SYS_BRBINFO(22)
+#define SYS_BRBINF23_EL1		__SYS_BRBINFO(23)
+#define SYS_BRBINF24_EL1		__SYS_BRBINFO(24)
+#define SYS_BRBINF25_EL1		__SYS_BRBINFO(25)
+#define SYS_BRBINF26_EL1		__SYS_BRBINFO(26)
+#define SYS_BRBINF27_EL1		__SYS_BRBINFO(27)
+#define SYS_BRBINF28_EL1		__SYS_BRBINFO(28)
+#define SYS_BRBINF29_EL1		__SYS_BRBINFO(29)
+#define SYS_BRBINF30_EL1		__SYS_BRBINFO(30)
+#define SYS_BRBINF31_EL1		__SYS_BRBINFO(31)
+
+#define SYS_BRBSRC0_EL1			__SYS_BRBSRC(0)
+#define SYS_BRBSRC1_EL1			__SYS_BRBSRC(1)
+#define SYS_BRBSRC2_EL1			__SYS_BRBSRC(2)
+#define SYS_BRBSRC3_EL1			__SYS_BRBSRC(3)
+#define SYS_BRBSRC4_EL1			__SYS_BRBSRC(4)
+#define SYS_BRBSRC5_EL1			__SYS_BRBSRC(5)
+#define SYS_BRBSRC6_EL1			__SYS_BRBSRC(6)
+#define SYS_BRBSRC7_EL1			__SYS_BRBSRC(7)
+#define SYS_BRBSRC8_EL1			__SYS_BRBSRC(8)
+#define SYS_BRBSRC9_EL1			__SYS_BRBSRC(9)
+#define SYS_BRBSRC10_EL1		__SYS_BRBSRC(10)
+#define SYS_BRBSRC11_EL1		__SYS_BRBSRC(11)
+#define SYS_BRBSRC12_EL1		__SYS_BRBSRC(12)
+#define SYS_BRBSRC13_EL1		__SYS_BRBSRC(13)
+#define SYS_BRBSRC14_EL1		__SYS_BRBSRC(14)
+#define SYS_BRBSRC15_EL1		__SYS_BRBSRC(15)
+#define SYS_BRBSRC16_EL1		__SYS_BRBSRC(16)
+#define SYS_BRBSRC17_EL1		__SYS_BRBSRC(17)
+#define SYS_BRBSRC18_EL1		__SYS_BRBSRC(18)
+#define SYS_BRBSRC19_EL1		__SYS_BRBSRC(19)
+#define SYS_BRBSRC20_EL1		__SYS_BRBSRC(20)
+#define SYS_BRBSRC21_EL1		__SYS_BRBSRC(21)
+#define SYS_BRBSRC22_EL1		__SYS_BRBSRC(22)
+#define SYS_BRBSRC23_EL1		__SYS_BRBSRC(23)
+#define SYS_BRBSRC24_EL1		__SYS_BRBSRC(24)
+#define SYS_BRBSRC25_EL1		__SYS_BRBSRC(25)
+#define SYS_BRBSRC26_EL1		__SYS_BRBSRC(26)
+#define SYS_BRBSRC27_EL1		__SYS_BRBSRC(27)
+#define SYS_BRBSRC28_EL1		__SYS_BRBSRC(28)
+#define SYS_BRBSRC29_EL1		__SYS_BRBSRC(29)
+#define SYS_BRBSRC30_EL1		__SYS_BRBSRC(30)
+#define SYS_BRBSRC31_EL1		__SYS_BRBSRC(31)
+
+#define SYS_BRBTGT0_EL1			__SYS_BRBTGT(0)
+#define SYS_BRBTGT1_EL1			__SYS_BRBTGT(1)
+#define SYS_BRBTGT2_EL1			__SYS_BRBTGT(2)
+#define SYS_BRBTGT3_EL1			__SYS_BRBTGT(3)
+#define SYS_BRBTGT4_EL1			__SYS_BRBTGT(4)
+#define SYS_BRBTGT5_EL1			__SYS_BRBTGT(5)
+#define SYS_BRBTGT6_EL1			__SYS_BRBTGT(6)
+#define SYS_BRBTGT7_EL1			__SYS_BRBTGT(7)
+#define SYS_BRBTGT8_EL1			__SYS_BRBTGT(8)
+#define SYS_BRBTGT9_EL1			__SYS_BRBTGT(9)
+#define SYS_BRBTGT10_EL1		__SYS_BRBTGT(10)
+#define SYS_BRBTGT11_EL1		__SYS_BRBTGT(11)
+#define SYS_BRBTGT12_EL1		__SYS_BRBTGT(12)
+#define SYS_BRBTGT13_EL1		__SYS_BRBTGT(13)
+#define SYS_BRBTGT14_EL1		__SYS_BRBTGT(14)
+#define SYS_BRBTGT15_EL1		__SYS_BRBTGT(15)
+#define SYS_BRBTGT16_EL1		__SYS_BRBTGT(16)
+#define SYS_BRBTGT17_EL1		__SYS_BRBTGT(17)
+#define SYS_BRBTGT18_EL1		__SYS_BRBTGT(18)
+#define SYS_BRBTGT19_EL1		__SYS_BRBTGT(19)
+#define SYS_BRBTGT20_EL1		__SYS_BRBTGT(20)
+#define SYS_BRBTGT21_EL1		__SYS_BRBTGT(21)
+#define SYS_BRBTGT22_EL1		__SYS_BRBTGT(22)
+#define SYS_BRBTGT23_EL1		__SYS_BRBTGT(23)
+#define SYS_BRBTGT24_EL1		__SYS_BRBTGT(24)
+#define SYS_BRBTGT25_EL1		__SYS_BRBTGT(25)
+#define SYS_BRBTGT26_EL1		__SYS_BRBTGT(26)
+#define SYS_BRBTGT27_EL1		__SYS_BRBTGT(27)
+#define SYS_BRBTGT28_EL1		__SYS_BRBTGT(28)
+#define SYS_BRBTGT29_EL1		__SYS_BRBTGT(29)
+#define SYS_BRBTGT30_EL1		__SYS_BRBTGT(30)
+#define SYS_BRBTGT31_EL1		__SYS_BRBTGT(31)
+
 #define SYS_MIDR_EL1			sys_reg(3, 0, 0, 0, 0)
 #define SYS_MPIDR_EL1			sys_reg(3, 0, 0, 0, 5)
 #define SYS_REVIDR_EL1			sys_reg(3, 0, 0, 0, 6)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 1ea4a3dc68f8..9892af96262f 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1002,6 +1002,164 @@  UnsignedEnum	3:0	BT
 EndEnum
 EndSysreg
 
+
+SysregFields BRBINFx_EL1
+Res0	63:47
+Field	46	CCU
+Field	45:32	CC
+Res0	31:18
+Field	17	LASTFAILED
+Field	16	T
+Res0	15:14
+Enum	13:8		TYPE
+	0b000000	UNCOND_DIRECT
+	0b000001	INDIRECT
+	0b000010	DIRECT_LINK
+	0b000011	INDIRECT_LINK
+	0b000101	RET
+	0b000111	ERET
+	0b001000	COND_DIRECT
+	0b100001	DEBUG_HALT
+	0b100010	CALL
+	0b100011	TRAP
+	0b100100	SERROR
+	0b100110	INSN_DEBUG
+	0b100111	DATA_DEBUG
+	0b101010	ALIGN_FAULT
+	0b101011	INSN_FAULT
+	0b101100	DATA_FAULT
+	0b101110	IRQ
+	0b101111	FIQ
+	0b111001	DEBUG_EXIT
+EndEnum
+Enum	7:6	EL
+	0b00	EL0
+	0b01	EL1
+	0b10	EL2
+	0b11	EL3
+EndEnum
+Field	5	MPRED
+Res0	4:2
+Enum	1:0	VALID
+	0b00	NONE
+	0b01	TARGET
+	0b10	SOURCE
+	0b11	FULL
+EndEnum
+EndSysregFields
+
+Sysreg	BRBCR_EL1	2	1	9	0	0
+Res0	63:24
+Field	23 	EXCEPTION
+Field	22 	ERTN
+Res0	21:9
+Field	8 	FZP
+Res0	7
+Enum	6:5	TS
+	0b01	VIRTUAL
+	0b10	GUEST_PHYSICAL
+	0b11	PHYSICAL
+EndEnum
+Field	4	MPRED
+Field	3	CC
+Res0	2
+Field	1	E1BRE
+Field	0	E0BRE
+EndSysreg
+
+Sysreg	BRBFCR_EL1	2	1	9	0	1
+Res0	63:30
+Enum	29:28	BANK
+	0b0	FIRST
+	0b1	SECOND
+EndEnum
+Res0	27:23
+Field	22	CONDDIR
+Field	21	DIRCALL
+Field	20	INDCALL
+Field	19	RTN
+Field	18	INDIRECT
+Field	17	DIRECT
+Field	16	EnI
+Res0	15:8
+Field	7	PAUSED
+Field	6	LASTFAILED
+Res0	5:0
+EndSysreg
+
+Sysreg	BRBTS_EL1	2	1	9	0	2
+Field	63:0	TS
+EndSysreg
+
+Sysreg	BRBINFINJ_EL1	2	1	9	1	0
+Res0	63:47
+Field	46	CCU
+Field	45:32	CC
+Res0	31:18
+Field	17	LASTFAILED
+Field	16	T
+Res0	15:14
+Enum	13:8		TYPE
+	0b000000	UNCOND_DIRECT
+	0b000001	INDIRECT
+	0b000010	DIRECT_LINK
+	0b000011	INDIRECT_LINK
+	0b000101	RET
+	0b000111	ERET
+	0b001000	COND_DIRECT
+	0b100001	DEBUG_HALT
+	0b100010	CALL
+	0b100011	TRAP
+	0b100100	SERROR
+	0b100110	INSN_DEBUG
+	0b100111	DATA_DEBUG
+	0b101010	ALIGN_FAULT
+	0b101011	INSN_FAULT
+	0b101100	DATA_FAULT
+	0b101110	IRQ
+	0b101111	FIQ
+	0b111001	DEBUG_EXIT
+EndEnum
+Enum	7:6	EL
+	0b00	EL0
+	0b01	EL1
+	0b10	EL2
+	0b11	EL3
+EndEnum
+Field	5	MPRED
+Res0	4:2
+Enum	1:0	VALID
+	0b00	NONE
+	0b01	TARGET
+	0b10	SOURCE
+	0b11	FULL
+EndEnum
+EndSysreg
+
+Sysreg	BRBSRCINJ_EL1	2	1	9	1	1
+Field	63:0 ADDRESS
+EndSysreg
+
+Sysreg	BRBTGTINJ_EL1	2	1	9	1	2
+Field	63:0 ADDRESS
+EndSysreg
+
+Sysreg	BRBIDR0_EL1	2	1	9	2	0
+Res0	63:16
+Enum	15:12	CC
+	0b101	20_BIT
+EndEnum
+Enum	11:8	FORMAT
+	0b0	0
+EndEnum
+Enum	7:0		NUMREC
+	0b0001000	8
+	0b0010000	16
+	0b0100000	32
+	0b1000000	64
+EndEnum
+EndSysreg
+
 Sysreg	ID_AA64ZFR0_EL1	3	0	0	4	4
 Res0	63:60
 UnsignedEnum	59:56	F64MM