Fixing ELF loader for systems with oversized pages [was: Re: [musl] Segmentation fault musl 1.2.4]

Message ID 20240130201701.GT4163@brightrain.aerifal.cx
State New
Headers
Series Fixing ELF loader for systems with oversized pages [was: Re: [musl] Segmentation fault musl 1.2.4] |

Commit Message

dalias@libc.org Jan. 30, 2024, 8:17 p.m. UTC
  On Tue, Jan 30, 2024 at 10:37:30AM -0500, Rich Felker wrote:
> On Tue, Jan 30, 2024 at 11:43:38AM +0100, Szabolcs Nagy wrote:
> > * Rich Felker <dalias@libc.org> [2024-01-16 15:45:52 -0500]:
> > 
> > > On Tue, Jan 16, 2024 at 07:29:18PM +0100, Szabolcs Nagy wrote:
> > > > * Cody Wetzel <codyawetzel@gmail.com> [2024-01-16 09:21:05 -0600]:
> > > > > Here is the output for the old
> > > > > ....
> > > > > >
> > > > > > / # /tmp/ld-musl-armhf.so.1 /usr/bin/readelf -lW /tmp/ld-musl-armhf.so.1
> > > > > >
> > > > > > Elf file type is DYN (Shared object file)
> > > > > > Entry point 0x359cd
> > > > > > There are 6 program headers, starting at offset 52
> > > > > >
> > > > > > Program Headers:
> > > > > >   Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
> > > > > >   EXIDX          0x07acec 0x0007acec 0x0007acec 0x00008 0x00008 R   0x4
> > > > > >   LOAD           0x000000 0x00000000 0x00000000 0x7acf4 0x7acf4 R E 0x10000
> > > > > >   LOAD           0x07fd6c 0x0008fd6c 0x0008fd6c 0x0054a 0x02258 RW  0x10000
> > > > 
> > > > this load segment is 64k aligned.
> > > > 
> > > > > >   DYNAMIC        0x07febc 0x0008febc 0x0008febc 0x000c0 0x000c0 RW  0x4
> > > > > >   GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
> > > > > >   GNU_RELRO      0x07fd6c 0x0008fd6c 0x0008fd6c 0x00294 0x00294 R   0x1
> > > > > >
> > > > > >  Section to Segment mapping:
> > > > > >   Segment Sections...
> > > > > >    00     .ARM.exidx
> > > > > >    01     .hash .gnu.hash .dynsym .dynstr .rel.dyn .rel.plt .plt .text
> > > > > > .rodata .ARM.exidx
> > > > > >    02     .data.rel.ro .dynamic .got .data .bss
> > > > > >    03     .dynamic
> > > > > >    04
> > > > > >    05     .data.rel.ro .dynamic .got
> > > > > >
> > > > > 
> > > > > And the new...
> > > > > 
> > > > > / # /tmp/ld-musl-armhf.so.1 /usr/bin/readelf -lW /lib/ld-musl-armhf.so.1
> > > > > >
> > > > > > Elf file type is DYN (Shared object file)
> > > > > > Entry point 0x362f1
> > > > > > There are 6 program headers, starting at offset 52
> > > > > >
> > > > > > Program Headers:
> > > > > >   Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
> > > > > >   EXIDX          0x07b81c 0x0007b81c 0x0007b81c 0x00008 0x00008 R   0x4
> > > > > >   LOAD           0x000000 0x00000000 0x00000000 0x7b824 0x7b824 R E 0x1000
> > > > > >   LOAD           0x07bd74 0x0007cd74 0x0007cd74 0x0054a 0x0225c RW  0x1000
> > > > 
> > > > this load segment is 4k aligned and offset vs addr is not congruent
> > > > modulo 64k, or 32k, so won't work on systems with such page size.
> > > > 
> > > > > >   DYNAMIC        0x07bebc 0x0007cebc 0x0007cebc 0x000c0 0x000c0 RW  0x4
> > > > > >   GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
> > > > > >   GNU_RELRO      0x07bd74 0x0007cd74 0x0007cd74 0x0028c 0x0028c R   0x1
> > > > > >
> > > > > >  Section to Segment mapping:
> > > > > >   Segment Sections...
> > > > > >    00     .ARM.exidx
> > > > > >    01     .hash .gnu.hash .dynsym .dynstr .rel.dyn .rel.plt .plt .text
> > > > > > .rodata .ARM.exidx
> > > > > >    02     .data.rel.ro .dynamic .got .data .bss
> > > > > >    03     .dynamic
> > > > > >    04
> > > > > >    05     .data.rel.ro .dynamic .got
> > > > > 
> > > > > 
> > > > > I hope that helps.
> > > > 
> > > > yes, this is a linking issue, not musl libc.
> > > > 
> > > > alpine linux links binaries for 4k pagesize only.
> > > > 
> > > > arm linkers were updated at some point to create binaries supporting
> > > > up to 64k pagesize.  i suspect some ppl ran into issues in practice
> > > > and decided the larger binaries are not worth it, if they dont work
> > > > reliably and forced 4k page size at link time.
> > > > 
> > > > you have to raise an issue with alpine linux, if you think 32k
> > > > oage size is useful and reliably supportable.
> > > 
> > > Are they using -Wl,-z,separate-code? That incurs a large
> > > binary-size-on-disk penalty when supporting oversized pages, and IIRC
> > > something was done to make the linker default to not supporting
> > > oversized pages when that's used. It might be the reason, if arm
> > > linking is normally expected to use a larger max pagesize.
> > 
> > i looked at this now, turns out they just changed the
> > pagesize back to 4k (i missed this change):
> > 
> > https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1a26a53a0dee39106ba58fcb15496c5f13074652
> 
> This doesn't help immediately, but a major ingredient to fix this
> situation would be getting the kernel to stop doing the wrong thing.
> Right now, it's ignoring the fact that the ELF program header
> constraints are incompatible with mmap given the oversized system
> pagesize, and just incorrectly mapping the executable and trying to
> run it anyway, whereby it blows up.
> 
> The right thing to do would be either to fail with ENOEXEC in this
> case, or when mmap with the required offset constraint fails, falling
> back to making an anonymous map and copying the whole content of the
> loadable segment into that (no COW sharing). The latter is really not
> all that bad for got/data/etc. mappings which you expect will be dirty
> (modified) anyway.
> 
> BTW the former choice (ENOEXEC) would allow doing the latter in
> userspace with a binfmt_misc loader.

Completely untested draft patch showing the concept is attached.

Rich
  

Patch

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index f8c7f26f1fbb..45c50f379377 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -861,6 +861,12 @@  static int load_elf_binary(struct linux_binprm *bprm)
 	if (!elf_phdata)
 		goto out;
 
+	elf_ppnt = elf_phdata;
+	for (i = 0; i < elf_ex->e_phnum; i++, elf_ppnt++) {
+		if (elf_ppnt->p_type != PT_LOAD) continue;
+		if (ELF_PAGEOFFSET(elf_ppnt->p_vaddr - elf_ppnt->p_offset))
+			goto out;
+	}
 	elf_ppnt = elf_phdata;
 	for (i = 0; i < elf_ex->e_phnum; i++, elf_ppnt++) {
 		char *elf_interpreter;
@@ -962,6 +968,13 @@  static int load_elf_binary(struct linux_binprm *bprm)
 		if (!interp_elf_phdata)
 			goto out_free_dentry;
 
+		elf_ppnt = interp_elf_phdata;
+		for (i = 0; i < elf_ex->e_phnum; i++, elf_ppnt++) {
+			if (elf_ppnt->p_type != PT_LOAD) continue;
+			if (ELF_PAGEOFFSET(elf_ppnt->p_vaddr - elf_ppnt->p_offset))
+				goto out_free_dentry;
+		}
+
 		/* Pass PT_LOPROC..PT_HIPROC headers to arch code */
 		elf_property_phdata = NULL;
 		elf_ppnt = interp_elf_phdata;