riscv/efistub: Ensure GP-relative addressing is not used

Message ID 42c63cb9-87d0-49db-9af8-95771b186684@siemens.com
State New
Headers
Series riscv/efistub: Ensure GP-relative addressing is not used |

Commit Message

Jan Kiszka Jan. 12, 2024, 6:37 p.m. UTC
  From: Jan Kiszka <jan.kiszka@siemens.com>

The cflags for the RISC-V efistub were missing -mno-relax, thus were
under the risk that the compiler could use GP-relative addressing. That
happened for _edata with binutils-2.41 and kernel 6.1, causing the
relocation to fail due to an invalid kernel_size in handle_kernel_image.
It was not yet observed with newer versions, but that may just be luck.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---

Something like this should go to stable as well, but we will need 
rebased patches.

 drivers/firmware/efi/libstub/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Ard Biesheuvel Jan. 12, 2024, 6:51 p.m. UTC | #1
Hi Jan,

On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>
> From: Jan Kiszka <jan.kiszka@siemens.com>
>
> The cflags for the RISC-V efistub were missing -mno-relax, thus were
> under the risk that the compiler could use GP-relative addressing. That
> happened for _edata with binutils-2.41 and kernel 6.1, causing the
> relocation to fail due to an invalid kernel_size in handle_kernel_image.
> It was not yet observed with newer versions, but that may just be luck.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>
> Something like this should go to stable as well, but we will need
> rebased patches.
>
>  drivers/firmware/efi/libstub/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> index 06964a3c130f..d561d7de46a9 100644
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN -DEFI_HAVE_STRNLEN \
>                                    -DEFI_HAVE_MEMCHR -DEFI_HAVE_STRRCHR \
>                                    -DEFI_HAVE_STRCMP -fno-builtin -fpic \
>                                    $(call cc-option,-mno-single-pic-base)
> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax

Can we detect the presence of these references (via the relocation
type)? We already do something similar for ordinary absolute
references too.

>  cflags-$(CONFIG_LOONGARCH)     += -fpie
>
>  cflags-$(CONFIG_EFI_PARAMS_FROM_FDT)   += -I$(srctree)/scripts/dtc/libfdt
> --
> 2.35.3
  
Palmer Dabbelt Jan. 12, 2024, 6:56 p.m. UTC | #2
On Fri, 12 Jan 2024 10:51:16 PST (-0800), Ard Biesheuvel wrote:
> Hi Jan,
>
> On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> The cflags for the RISC-V efistub were missing -mno-relax, thus were
>> under the risk that the compiler could use GP-relative addressing. That
>> happened for _edata with binutils-2.41 and kernel 6.1, causing the
>> relocation to fail due to an invalid kernel_size in handle_kernel_image.
>> It was not yet observed with newer versions, but that may just be luck.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> ---
>>
>> Something like this should go to stable as well, but we will need
>> rebased patches.
>>
>>  drivers/firmware/efi/libstub/Makefile | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
>> index 06964a3c130f..d561d7de46a9 100644
>> --- a/drivers/firmware/efi/libstub/Makefile
>> +++ b/drivers/firmware/efi/libstub/Makefile
>> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN -DEFI_HAVE_STRNLEN \
>>                                    -DEFI_HAVE_MEMCHR -DEFI_HAVE_STRRCHR \
>>                                    -DEFI_HAVE_STRCMP -fno-builtin -fpic \
>>                                    $(call cc-option,-mno-single-pic-base)
>> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
>> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax
>
> Can we detect the presence of these references (via the relocation
> type)? We already do something similar for ordinary absolute
> references too.

If there's no `__global_pointer$` symbol then the linker won't make 
GP-relative relaxations (because it doesn't know where GP is).  We 
usually define that symbol in the linker script, but I'm not entierly 
sure how libstub gets its linker script...

>
>>  cflags-$(CONFIG_LOONGARCH)     += -fpie
>>
>>  cflags-$(CONFIG_EFI_PARAMS_FROM_FDT)   += -I$(srctree)/scripts/dtc/libfdt
>> --
>> 2.35.3
  
Jan Kiszka Jan. 13, 2024, 10:35 a.m. UTC | #3
On 12.01.24 19:56, Palmer Dabbelt wrote:
> On Fri, 12 Jan 2024 10:51:16 PST (-0800), Ard Biesheuvel wrote:
>> Hi Jan,
>>
>> On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>
>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>
>>> The cflags for the RISC-V efistub were missing -mno-relax, thus were
>>> under the risk that the compiler could use GP-relative addressing. That
>>> happened for _edata with binutils-2.41 and kernel 6.1, causing the
>>> relocation to fail due to an invalid kernel_size in handle_kernel_image.
>>> It was not yet observed with newer versions, but that may just be luck.
>>>
>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> ---
>>>
>>> Something like this should go to stable as well, but we will need
>>> rebased patches.
>>>
>>>  drivers/firmware/efi/libstub/Makefile | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/firmware/efi/libstub/Makefile
>>> b/drivers/firmware/efi/libstub/Makefile
>>> index 06964a3c130f..d561d7de46a9 100644
>>> --- a/drivers/firmware/efi/libstub/Makefile
>>> +++ b/drivers/firmware/efi/libstub/Makefile
>>> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN
>>> -DEFI_HAVE_STRNLEN \
>>>                                    -DEFI_HAVE_MEMCHR
>>> -DEFI_HAVE_STRRCHR \
>>>                                    -DEFI_HAVE_STRCMP -fno-builtin
>>> -fpic \
>>>                                    $(call
>>> cc-option,-mno-single-pic-base)
>>> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
>>> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax
>>
>> Can we detect the presence of these references (via the relocation
>> type)? We already do something similar for ordinary absolute
>> references too.
> 
> If there's no `__global_pointer$` symbol then the linker won't make
> GP-relative relaxations (because it doesn't know where GP is).  We
> usually define that symbol in the linker script, but I'm not entierly
> sure how libstub gets its linker script...
> 

The stub seems to be linked together with the rest of the kernel, thus
the regular arch/riscv/kernel/vmlinux.lds.S is used.

Jan
  
Ard Biesheuvel Jan. 15, 2024, 5:34 p.m. UTC | #4
On Sat, 13 Jan 2024 at 11:35, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>
> On 12.01.24 19:56, Palmer Dabbelt wrote:
> > On Fri, 12 Jan 2024 10:51:16 PST (-0800), Ard Biesheuvel wrote:
> >> Hi Jan,
> >>
> >> On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>>
> >>> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>>
> >>> The cflags for the RISC-V efistub were missing -mno-relax, thus were
> >>> under the risk that the compiler could use GP-relative addressing. That
> >>> happened for _edata with binutils-2.41 and kernel 6.1, causing the
> >>> relocation to fail due to an invalid kernel_size in handle_kernel_image.
> >>> It was not yet observed with newer versions, but that may just be luck.
> >>>
> >>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>> ---
> >>>
> >>> Something like this should go to stable as well, but we will need
> >>> rebased patches.
> >>>
> >>>  drivers/firmware/efi/libstub/Makefile | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/firmware/efi/libstub/Makefile
> >>> b/drivers/firmware/efi/libstub/Makefile
> >>> index 06964a3c130f..d561d7de46a9 100644
> >>> --- a/drivers/firmware/efi/libstub/Makefile
> >>> +++ b/drivers/firmware/efi/libstub/Makefile
> >>> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN
> >>> -DEFI_HAVE_STRNLEN \
> >>>                                    -DEFI_HAVE_MEMCHR
> >>> -DEFI_HAVE_STRRCHR \
> >>>                                    -DEFI_HAVE_STRCMP -fno-builtin
> >>> -fpic \
> >>>                                    $(call
> >>> cc-option,-mno-single-pic-base)
> >>> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
> >>> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax
> >>
> >> Can we detect the presence of these references (via the relocation
> >> type)? We already do something similar for ordinary absolute
> >> references too.
> >
> > If there's no `__global_pointer$` symbol then the linker won't make
> > GP-relative relaxations (because it doesn't know where GP is).  We
> > usually define that symbol in the linker script, but I'm not entierly
> > sure how libstub gets its linker script...
> >
>
> The stub seems to be linked together with the rest of the kernel, thus
> the regular arch/riscv/kernel/vmlinux.lds.S is used.
>

Indeed - the EFI stub is part of the same executable as vmlinux, we
just mangle the symbol names to ensure that only code that can be
safely called from the EFI stub can be linked to it.

If the effect of -mno-relax is to stop emitting R_RISCV_RELAX
relocations, we should perhaps add those to the STUBCOPY_RELOC-y
Makefile variable? (in the same file). BTW R_RISCV_HI20 doesn't seem
like the right value there to begin with: the idea of that is to
disallow ELF relocations that evaluate to expressions that can only be
known at runtime (like absolute addresses for global pointer
variables)
  
Jan Kiszka Jan. 16, 2024, 5:21 a.m. UTC | #5
On 15.01.24 18:34, Ard Biesheuvel wrote:
> On Sat, 13 Jan 2024 at 11:35, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>
>> On 12.01.24 19:56, Palmer Dabbelt wrote:
>>> On Fri, 12 Jan 2024 10:51:16 PST (-0800), Ard Biesheuvel wrote:
>>>> Hi Jan,
>>>>
>>>> On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>
>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>
>>>>> The cflags for the RISC-V efistub were missing -mno-relax, thus were
>>>>> under the risk that the compiler could use GP-relative addressing. That
>>>>> happened for _edata with binutils-2.41 and kernel 6.1, causing the
>>>>> relocation to fail due to an invalid kernel_size in handle_kernel_image.
>>>>> It was not yet observed with newer versions, but that may just be luck.
>>>>>
>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>> ---
>>>>>
>>>>> Something like this should go to stable as well, but we will need
>>>>> rebased patches.
>>>>>
>>>>>  drivers/firmware/efi/libstub/Makefile | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/firmware/efi/libstub/Makefile
>>>>> b/drivers/firmware/efi/libstub/Makefile
>>>>> index 06964a3c130f..d561d7de46a9 100644
>>>>> --- a/drivers/firmware/efi/libstub/Makefile
>>>>> +++ b/drivers/firmware/efi/libstub/Makefile
>>>>> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN
>>>>> -DEFI_HAVE_STRNLEN \
>>>>>                                    -DEFI_HAVE_MEMCHR
>>>>> -DEFI_HAVE_STRRCHR \
>>>>>                                    -DEFI_HAVE_STRCMP -fno-builtin
>>>>> -fpic \
>>>>>                                    $(call
>>>>> cc-option,-mno-single-pic-base)
>>>>> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
>>>>> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax
>>>>
>>>> Can we detect the presence of these references (via the relocation
>>>> type)? We already do something similar for ordinary absolute
>>>> references too.
>>>
>>> If there's no `__global_pointer$` symbol then the linker won't make
>>> GP-relative relaxations (because it doesn't know where GP is).  We
>>> usually define that symbol in the linker script, but I'm not entierly
>>> sure how libstub gets its linker script...
>>>
>>
>> The stub seems to be linked together with the rest of the kernel, thus
>> the regular arch/riscv/kernel/vmlinux.lds.S is used.
>>
> 
> Indeed - the EFI stub is part of the same executable as vmlinux, we
> just mangle the symbol names to ensure that only code that can be
> safely called from the EFI stub can be linked to it.
> 
> If the effect of -mno-relax is to stop emitting R_RISCV_RELAX
> relocations, we should perhaps add those to the STUBCOPY_RELOC-y
> Makefile variable? (in the same file). BTW R_RISCV_HI20 doesn't seem
> like the right value there to begin with: the idea of that is to
> disallow ELF relocations that evaluate to expressions that can only be
> known at runtime (like absolute addresses for global pointer
> variables)

How to do that best? Simply replace R_RISCV_HI20 with R_RISCV_RELAX?

Jan
  
Ard Biesheuvel Jan. 16, 2024, 8:36 a.m. UTC | #6
On Tue, 16 Jan 2024 at 06:21, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>
> On 15.01.24 18:34, Ard Biesheuvel wrote:
> > On Sat, 13 Jan 2024 at 11:35, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>
> >> On 12.01.24 19:56, Palmer Dabbelt wrote:
> >>> On Fri, 12 Jan 2024 10:51:16 PST (-0800), Ard Biesheuvel wrote:
> >>>> Hi Jan,
> >>>>
> >>>> On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>>>>
> >>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>>
> >>>>> The cflags for the RISC-V efistub were missing -mno-relax, thus were
> >>>>> under the risk that the compiler could use GP-relative addressing. That
> >>>>> happened for _edata with binutils-2.41 and kernel 6.1, causing the
> >>>>> relocation to fail due to an invalid kernel_size in handle_kernel_image.
> >>>>> It was not yet observed with newer versions, but that may just be luck.
> >>>>>
> >>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>> ---
> >>>>>
> >>>>> Something like this should go to stable as well, but we will need
> >>>>> rebased patches.
> >>>>>
> >>>>>  drivers/firmware/efi/libstub/Makefile | 2 +-
> >>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/drivers/firmware/efi/libstub/Makefile
> >>>>> b/drivers/firmware/efi/libstub/Makefile
> >>>>> index 06964a3c130f..d561d7de46a9 100644
> >>>>> --- a/drivers/firmware/efi/libstub/Makefile
> >>>>> +++ b/drivers/firmware/efi/libstub/Makefile
> >>>>> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN
> >>>>> -DEFI_HAVE_STRNLEN \
> >>>>>                                    -DEFI_HAVE_MEMCHR
> >>>>> -DEFI_HAVE_STRRCHR \
> >>>>>                                    -DEFI_HAVE_STRCMP -fno-builtin
> >>>>> -fpic \
> >>>>>                                    $(call
> >>>>> cc-option,-mno-single-pic-base)
> >>>>> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
> >>>>> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax
> >>>>
> >>>> Can we detect the presence of these references (via the relocation
> >>>> type)? We already do something similar for ordinary absolute
> >>>> references too.
> >>>
> >>> If there's no `__global_pointer$` symbol then the linker won't make
> >>> GP-relative relaxations (because it doesn't know where GP is).  We
> >>> usually define that symbol in the linker script, but I'm not entierly
> >>> sure how libstub gets its linker script...
> >>>
> >>
> >> The stub seems to be linked together with the rest of the kernel, thus
> >> the regular arch/riscv/kernel/vmlinux.lds.S is used.
> >>
> >
> > Indeed - the EFI stub is part of the same executable as vmlinux, we
> > just mangle the symbol names to ensure that only code that can be
> > safely called from the EFI stub can be linked to it.
> >
> > If the effect of -mno-relax is to stop emitting R_RISCV_RELAX
> > relocations, we should perhaps add those to the STUBCOPY_RELOC-y
> > Makefile variable? (in the same file). BTW R_RISCV_HI20 doesn't seem
> > like the right value there to begin with: the idea of that is to
> > disallow ELF relocations that evaluate to expressions that can only be
> > known at runtime (like absolute addresses for global pointer
> > variables)
>
> How to do that best? Simply replace R_RISCV_HI20 with R_RISCV_RELAX?
>

We'll need to keep the HI20, in fact - I got confused between HI20 and
PCREL_HI20, and the former is actually used for 32-bit absolute
addresses in 32-bit code.

This seems to do the trick: it disallows relaxation relocations and
native word sizes absolute references. AFAICT, those are the only ones
we should care about.

STUBCOPY_RELOC-$(CONFIG_RISCV) := -E
R_RISCV_HI20\|R_RISCV_$(BITS)\|R_RISCV_RELAX
  
Jan Kiszka Jan. 16, 2024, 1:44 p.m. UTC | #7
On 16.01.24 09:36, Ard Biesheuvel wrote:
> On Tue, 16 Jan 2024 at 06:21, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>
>> On 15.01.24 18:34, Ard Biesheuvel wrote:
>>> On Sat, 13 Jan 2024 at 11:35, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>
>>>> On 12.01.24 19:56, Palmer Dabbelt wrote:
>>>>> On Fri, 12 Jan 2024 10:51:16 PST (-0800), Ard Biesheuvel wrote:
>>>>>> Hi Jan,
>>>>>>
>>>>>> On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>>>
>>>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>>
>>>>>>> The cflags for the RISC-V efistub were missing -mno-relax, thus were
>>>>>>> under the risk that the compiler could use GP-relative addressing. That
>>>>>>> happened for _edata with binutils-2.41 and kernel 6.1, causing the
>>>>>>> relocation to fail due to an invalid kernel_size in handle_kernel_image.
>>>>>>> It was not yet observed with newer versions, but that may just be luck.
>>>>>>>
>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>> ---
>>>>>>>
>>>>>>> Something like this should go to stable as well, but we will need
>>>>>>> rebased patches.
>>>>>>>
>>>>>>>  drivers/firmware/efi/libstub/Makefile | 2 +-
>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/firmware/efi/libstub/Makefile
>>>>>>> b/drivers/firmware/efi/libstub/Makefile
>>>>>>> index 06964a3c130f..d561d7de46a9 100644
>>>>>>> --- a/drivers/firmware/efi/libstub/Makefile
>>>>>>> +++ b/drivers/firmware/efi/libstub/Makefile
>>>>>>> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN
>>>>>>> -DEFI_HAVE_STRNLEN \
>>>>>>>                                    -DEFI_HAVE_MEMCHR
>>>>>>> -DEFI_HAVE_STRRCHR \
>>>>>>>                                    -DEFI_HAVE_STRCMP -fno-builtin
>>>>>>> -fpic \
>>>>>>>                                    $(call
>>>>>>> cc-option,-mno-single-pic-base)
>>>>>>> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
>>>>>>> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax
>>>>>>
>>>>>> Can we detect the presence of these references (via the relocation
>>>>>> type)? We already do something similar for ordinary absolute
>>>>>> references too.
>>>>>
>>>>> If there's no `__global_pointer$` symbol then the linker won't make
>>>>> GP-relative relaxations (because it doesn't know where GP is).  We
>>>>> usually define that symbol in the linker script, but I'm not entierly
>>>>> sure how libstub gets its linker script...
>>>>>
>>>>
>>>> The stub seems to be linked together with the rest of the kernel, thus
>>>> the regular arch/riscv/kernel/vmlinux.lds.S is used.
>>>>
>>>
>>> Indeed - the EFI stub is part of the same executable as vmlinux, we
>>> just mangle the symbol names to ensure that only code that can be
>>> safely called from the EFI stub can be linked to it.
>>>
>>> If the effect of -mno-relax is to stop emitting R_RISCV_RELAX
>>> relocations, we should perhaps add those to the STUBCOPY_RELOC-y
>>> Makefile variable? (in the same file). BTW R_RISCV_HI20 doesn't seem
>>> like the right value there to begin with: the idea of that is to
>>> disallow ELF relocations that evaluate to expressions that can only be
>>> known at runtime (like absolute addresses for global pointer
>>> variables)
>>
>> How to do that best? Simply replace R_RISCV_HI20 with R_RISCV_RELAX?
>>
> 
> We'll need to keep the HI20, in fact - I got confused between HI20 and
> PCREL_HI20, and the former is actually used for 32-bit absolute
> addresses in 32-bit code.
> 
> This seems to do the trick: it disallows relaxation relocations and
> native word sizes absolute references. AFAICT, those are the only ones
> we should care about.
> 
> STUBCOPY_RELOC-$(CONFIG_RISCV) := -E
> R_RISCV_HI20\|R_RISCV_$(BITS)\|R_RISCV_RELAX

I would suggest to do that on top of this patch. Want me to write such a
patch, or will you? You can probably more fluently explain why
R_RISCV_32/64 is important, I would first have to understand what that
is exactly. :)

Jan
  
Ard Biesheuvel Jan. 16, 2024, 1:47 p.m. UTC | #8
On Tue, 16 Jan 2024 at 14:44, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>
> On 16.01.24 09:36, Ard Biesheuvel wrote:
> > On Tue, 16 Jan 2024 at 06:21, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>
> >> On 15.01.24 18:34, Ard Biesheuvel wrote:
> >>> On Sat, 13 Jan 2024 at 11:35, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>>>
> >>>> On 12.01.24 19:56, Palmer Dabbelt wrote:
> >>>>> On Fri, 12 Jan 2024 10:51:16 PST (-0800), Ard Biesheuvel wrote:
> >>>>>> Hi Jan,
> >>>>>>
> >>>>>> On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>>>>>>
> >>>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>>>>
> >>>>>>> The cflags for the RISC-V efistub were missing -mno-relax, thus were
> >>>>>>> under the risk that the compiler could use GP-relative addressing. That
> >>>>>>> happened for _edata with binutils-2.41 and kernel 6.1, causing the
> >>>>>>> relocation to fail due to an invalid kernel_size in handle_kernel_image.
> >>>>>>> It was not yet observed with newer versions, but that may just be luck.
> >>>>>>>
> >>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>>>> ---
> >>>>>>>
> >>>>>>> Something like this should go to stable as well, but we will need
> >>>>>>> rebased patches.
> >>>>>>>
> >>>>>>>  drivers/firmware/efi/libstub/Makefile | 2 +-
> >>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/firmware/efi/libstub/Makefile
> >>>>>>> b/drivers/firmware/efi/libstub/Makefile
> >>>>>>> index 06964a3c130f..d561d7de46a9 100644
> >>>>>>> --- a/drivers/firmware/efi/libstub/Makefile
> >>>>>>> +++ b/drivers/firmware/efi/libstub/Makefile
> >>>>>>> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN
> >>>>>>> -DEFI_HAVE_STRNLEN \
> >>>>>>>                                    -DEFI_HAVE_MEMCHR
> >>>>>>> -DEFI_HAVE_STRRCHR \
> >>>>>>>                                    -DEFI_HAVE_STRCMP -fno-builtin
> >>>>>>> -fpic \
> >>>>>>>                                    $(call
> >>>>>>> cc-option,-mno-single-pic-base)
> >>>>>>> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
> >>>>>>> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax
> >>>>>>
> >>>>>> Can we detect the presence of these references (via the relocation
> >>>>>> type)? We already do something similar for ordinary absolute
> >>>>>> references too.
> >>>>>
> >>>>> If there's no `__global_pointer$` symbol then the linker won't make
> >>>>> GP-relative relaxations (because it doesn't know where GP is).  We
> >>>>> usually define that symbol in the linker script, but I'm not entierly
> >>>>> sure how libstub gets its linker script...
> >>>>>
> >>>>
> >>>> The stub seems to be linked together with the rest of the kernel, thus
> >>>> the regular arch/riscv/kernel/vmlinux.lds.S is used.
> >>>>
> >>>
> >>> Indeed - the EFI stub is part of the same executable as vmlinux, we
> >>> just mangle the symbol names to ensure that only code that can be
> >>> safely called from the EFI stub can be linked to it.
> >>>
> >>> If the effect of -mno-relax is to stop emitting R_RISCV_RELAX
> >>> relocations, we should perhaps add those to the STUBCOPY_RELOC-y
> >>> Makefile variable? (in the same file). BTW R_RISCV_HI20 doesn't seem
> >>> like the right value there to begin with: the idea of that is to
> >>> disallow ELF relocations that evaluate to expressions that can only be
> >>> known at runtime (like absolute addresses for global pointer
> >>> variables)
> >>
> >> How to do that best? Simply replace R_RISCV_HI20 with R_RISCV_RELAX?
> >>
> >
> > We'll need to keep the HI20, in fact - I got confused between HI20 and
> > PCREL_HI20, and the former is actually used for 32-bit absolute
> > addresses in 32-bit code.
> >
> > This seems to do the trick: it disallows relaxation relocations and
> > native word sizes absolute references. AFAICT, those are the only ones
> > we should care about.
> >
> > STUBCOPY_RELOC-$(CONFIG_RISCV) := -E
> > R_RISCV_HI20\|R_RISCV_$(BITS)\|R_RISCV_RELAX
>
> I would suggest to do that on top of this patch. Want me to write such a
> patch, or will you? You can probably more fluently explain why
> R_RISCV_32/64 is important, I would first have to understand what that
> is exactly. :)
>

Sure, I can take care of that.

For your patch,

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>

I'll queue this up as a EFI fix.
  
Jan Kiszka Jan. 16, 2024, 1:56 p.m. UTC | #9
On 16.01.24 14:47, Ard Biesheuvel wrote:
> On Tue, 16 Jan 2024 at 14:44, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>
>> On 16.01.24 09:36, Ard Biesheuvel wrote:
>>> On Tue, 16 Jan 2024 at 06:21, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>
>>>> On 15.01.24 18:34, Ard Biesheuvel wrote:
>>>>> On Sat, 13 Jan 2024 at 11:35, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>>
>>>>>> On 12.01.24 19:56, Palmer Dabbelt wrote:
>>>>>>> On Fri, 12 Jan 2024 10:51:16 PST (-0800), Ard Biesheuvel wrote:
>>>>>>>> Hi Jan,
>>>>>>>>
>>>>>>>> On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>>>>>
>>>>>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>>>>
>>>>>>>>> The cflags for the RISC-V efistub were missing -mno-relax, thus were
>>>>>>>>> under the risk that the compiler could use GP-relative addressing. That
>>>>>>>>> happened for _edata with binutils-2.41 and kernel 6.1, causing the
>>>>>>>>> relocation to fail due to an invalid kernel_size in handle_kernel_image.
>>>>>>>>> It was not yet observed with newer versions, but that may just be luck.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> Something like this should go to stable as well, but we will need
>>>>>>>>> rebased patches.
>>>>>>>>>
>>>>>>>>>  drivers/firmware/efi/libstub/Makefile | 2 +-
>>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/firmware/efi/libstub/Makefile
>>>>>>>>> b/drivers/firmware/efi/libstub/Makefile
>>>>>>>>> index 06964a3c130f..d561d7de46a9 100644
>>>>>>>>> --- a/drivers/firmware/efi/libstub/Makefile
>>>>>>>>> +++ b/drivers/firmware/efi/libstub/Makefile
>>>>>>>>> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN
>>>>>>>>> -DEFI_HAVE_STRNLEN \
>>>>>>>>>                                    -DEFI_HAVE_MEMCHR
>>>>>>>>> -DEFI_HAVE_STRRCHR \
>>>>>>>>>                                    -DEFI_HAVE_STRCMP -fno-builtin
>>>>>>>>> -fpic \
>>>>>>>>>                                    $(call
>>>>>>>>> cc-option,-mno-single-pic-base)
>>>>>>>>> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
>>>>>>>>> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax
>>>>>>>>
>>>>>>>> Can we detect the presence of these references (via the relocation
>>>>>>>> type)? We already do something similar for ordinary absolute
>>>>>>>> references too.
>>>>>>>
>>>>>>> If there's no `__global_pointer$` symbol then the linker won't make
>>>>>>> GP-relative relaxations (because it doesn't know where GP is).  We
>>>>>>> usually define that symbol in the linker script, but I'm not entierly
>>>>>>> sure how libstub gets its linker script...
>>>>>>>
>>>>>>
>>>>>> The stub seems to be linked together with the rest of the kernel, thus
>>>>>> the regular arch/riscv/kernel/vmlinux.lds.S is used.
>>>>>>
>>>>>
>>>>> Indeed - the EFI stub is part of the same executable as vmlinux, we
>>>>> just mangle the symbol names to ensure that only code that can be
>>>>> safely called from the EFI stub can be linked to it.
>>>>>
>>>>> If the effect of -mno-relax is to stop emitting R_RISCV_RELAX
>>>>> relocations, we should perhaps add those to the STUBCOPY_RELOC-y
>>>>> Makefile variable? (in the same file). BTW R_RISCV_HI20 doesn't seem
>>>>> like the right value there to begin with: the idea of that is to
>>>>> disallow ELF relocations that evaluate to expressions that can only be
>>>>> known at runtime (like absolute addresses for global pointer
>>>>> variables)
>>>>
>>>> How to do that best? Simply replace R_RISCV_HI20 with R_RISCV_RELAX?
>>>>
>>>
>>> We'll need to keep the HI20, in fact - I got confused between HI20 and
>>> PCREL_HI20, and the former is actually used for 32-bit absolute
>>> addresses in 32-bit code.
>>>
>>> This seems to do the trick: it disallows relaxation relocations and
>>> native word sizes absolute references. AFAICT, those are the only ones
>>> we should care about.
>>>
>>> STUBCOPY_RELOC-$(CONFIG_RISCV) := -E
>>> R_RISCV_HI20\|R_RISCV_$(BITS)\|R_RISCV_RELAX
>>
>> I would suggest to do that on top of this patch. Want me to write such a
>> patch, or will you? You can probably more fluently explain why
>> R_RISCV_32/64 is important, I would first have to understand what that
>> is exactly. :)
>>
> 
> Sure, I can take care of that.
> 

Perfect.

> For your patch,
> 
> Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
> 
> I'll queue this up as a EFI fix.

Thanks. Will you take care of stable, or should I once the commit was
merged?

Jan
  
Ard Biesheuvel Jan. 16, 2024, 1:58 p.m. UTC | #10
On Tue, 16 Jan 2024 at 14:56, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>
> On 16.01.24 14:47, Ard Biesheuvel wrote:
> > On Tue, 16 Jan 2024 at 14:44, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>
> >> On 16.01.24 09:36, Ard Biesheuvel wrote:
> >>> On Tue, 16 Jan 2024 at 06:21, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>>>
> >>>> On 15.01.24 18:34, Ard Biesheuvel wrote:
> >>>>> On Sat, 13 Jan 2024 at 11:35, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>>>>>
> >>>>>> On 12.01.24 19:56, Palmer Dabbelt wrote:
> >>>>>>> On Fri, 12 Jan 2024 10:51:16 PST (-0800), Ard Biesheuvel wrote:
> >>>>>>>> Hi Jan,
> >>>>>>>>
> >>>>>>>> On Fri, 12 Jan 2024 at 19:37, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>>>>>>>>
> >>>>>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>>>>>>
> >>>>>>>>> The cflags for the RISC-V efistub were missing -mno-relax, thus were
> >>>>>>>>> under the risk that the compiler could use GP-relative addressing. That
> >>>>>>>>> happened for _edata with binutils-2.41 and kernel 6.1, causing the
> >>>>>>>>> relocation to fail due to an invalid kernel_size in handle_kernel_image.
> >>>>>>>>> It was not yet observed with newer versions, but that may just be luck.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>>>>>> ---
> >>>>>>>>>
> >>>>>>>>> Something like this should go to stable as well, but we will need
> >>>>>>>>> rebased patches.
> >>>>>>>>>
> >>>>>>>>>  drivers/firmware/efi/libstub/Makefile | 2 +-
> >>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/firmware/efi/libstub/Makefile
> >>>>>>>>> b/drivers/firmware/efi/libstub/Makefile
> >>>>>>>>> index 06964a3c130f..d561d7de46a9 100644
> >>>>>>>>> --- a/drivers/firmware/efi/libstub/Makefile
> >>>>>>>>> +++ b/drivers/firmware/efi/libstub/Makefile
> >>>>>>>>> @@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)          += -DEFI_HAVE_STRLEN
> >>>>>>>>> -DEFI_HAVE_STRNLEN \
> >>>>>>>>>                                    -DEFI_HAVE_MEMCHR
> >>>>>>>>> -DEFI_HAVE_STRRCHR \
> >>>>>>>>>                                    -DEFI_HAVE_STRCMP -fno-builtin
> >>>>>>>>> -fpic \
> >>>>>>>>>                                    $(call
> >>>>>>>>> cc-option,-mno-single-pic-base)
> >>>>>>>>> -cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE
> >>>>>>>>> +cflags-$(CONFIG_RISCV)         += -fpic -DNO_ALTERNATIVE -mno-relax
> >>>>>>>>
> >>>>>>>> Can we detect the presence of these references (via the relocation
> >>>>>>>> type)? We already do something similar for ordinary absolute
> >>>>>>>> references too.
> >>>>>>>
> >>>>>>> If there's no `__global_pointer$` symbol then the linker won't make
> >>>>>>> GP-relative relaxations (because it doesn't know where GP is).  We
> >>>>>>> usually define that symbol in the linker script, but I'm not entierly
> >>>>>>> sure how libstub gets its linker script...
> >>>>>>>
> >>>>>>
> >>>>>> The stub seems to be linked together with the rest of the kernel, thus
> >>>>>> the regular arch/riscv/kernel/vmlinux.lds.S is used.
> >>>>>>
> >>>>>
> >>>>> Indeed - the EFI stub is part of the same executable as vmlinux, we
> >>>>> just mangle the symbol names to ensure that only code that can be
> >>>>> safely called from the EFI stub can be linked to it.
> >>>>>
> >>>>> If the effect of -mno-relax is to stop emitting R_RISCV_RELAX
> >>>>> relocations, we should perhaps add those to the STUBCOPY_RELOC-y
> >>>>> Makefile variable? (in the same file). BTW R_RISCV_HI20 doesn't seem
> >>>>> like the right value there to begin with: the idea of that is to
> >>>>> disallow ELF relocations that evaluate to expressions that can only be
> >>>>> known at runtime (like absolute addresses for global pointer
> >>>>> variables)
> >>>>
> >>>> How to do that best? Simply replace R_RISCV_HI20 with R_RISCV_RELAX?
> >>>>
> >>>
> >>> We'll need to keep the HI20, in fact - I got confused between HI20 and
> >>> PCREL_HI20, and the former is actually used for 32-bit absolute
> >>> addresses in 32-bit code.
> >>>
> >>> This seems to do the trick: it disallows relaxation relocations and
> >>> native word sizes absolute references. AFAICT, those are the only ones
> >>> we should care about.
> >>>
> >>> STUBCOPY_RELOC-$(CONFIG_RISCV) := -E
> >>> R_RISCV_HI20\|R_RISCV_$(BITS)\|R_RISCV_RELAX
> >>
> >> I would suggest to do that on top of this patch. Want me to write such a
> >> patch, or will you? You can probably more fluently explain why
> >> R_RISCV_32/64 is important, I would first have to understand what that
> >> is exactly. :)
> >>
> >
> > Sure, I can take care of that.
> >
>
> Perfect.
>
> > For your patch,
> >
> > Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
> >
> > I'll queue this up as a EFI fix.
>
> Thanks. Will you take care of stable, or should I once the commit was
> merged?
>

I'll add the cc:stable so we'll both get informed once the -stable
maintainers [attempt to] queue this up.
  

Patch

diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 06964a3c130f..d561d7de46a9 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -28,7 +28,7 @@  cflags-$(CONFIG_ARM)		+= -DEFI_HAVE_STRLEN -DEFI_HAVE_STRNLEN \
 				   -DEFI_HAVE_MEMCHR -DEFI_HAVE_STRRCHR \
 				   -DEFI_HAVE_STRCMP -fno-builtin -fpic \
 				   $(call cc-option,-mno-single-pic-base)
-cflags-$(CONFIG_RISCV)		+= -fpic -DNO_ALTERNATIVE
+cflags-$(CONFIG_RISCV)		+= -fpic -DNO_ALTERNATIVE -mno-relax
 cflags-$(CONFIG_LOONGARCH)	+= -fpie
 
 cflags-$(CONFIG_EFI_PARAMS_FROM_FDT)	+= -I$(srctree)/scripts/dtc/libfdt