[v3] fs/binfmt_elf: Fix memsz > filesz handling
Commit Message
The old code for ELF interpreter loading could only handle
1 memsz > filesz segment. This is incorrect, as evidenced
by the elf program loading code, which could handle multiple
such segments.
Fix memsz > filesz handling for elf interpreters and refactor
interpreter/program BSS clearing into a common
codepath; also handle the case where a segment after a bss
could overwrite a bss clear.
This bug was uncovered on builds of ppc64le musl libc with
llvm lld 15.0.0, since ppc64 does not allocate file space
for its .plt.
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
---
v3:
- Minor commit message edit (make the whole commit msg imperative)
v2:
- Fixed possible problems mapping segments on top of a cleared bss
See https://github.com/heatd/elf-bug-questionmark for a repro and patch
comments for fix details.
Further refactoring brk clearing.
v1: https://lore.kernel.org/all/20221106021657.1145519-1-pedro.falcato@gmail.com/
fs/binfmt_elf.c | 277 ++++++++++++++++++++++++++++--------------------
1 file changed, 164 insertions(+), 113 deletions(-)
Comments
On Tue, Nov 08, 2022 at 11:07:15AM +0000, Pedro Falcato wrote:
> [...]
> + * This tail logic is skippable if we're the last phdr, as
> + * nothing can map an address >= our p_vaddr, since ELF phdr
> + * PT_LOAD segments are required to be sorted in an increasing
> + * order.
I'm still looking through the patch, but I do want to call this bit out
as a problem. The kernel cannot, unfortunately, make this assumption. See:
https://lore.kernel.org/linux-fsdevel/YfOooXQ2ScpZLhmD@fractal.localdomain/
"It turns out that almost all native Linux games published by the Virtual
Programming company have this kind of weird PT_LOAD ordering including
the famous Bioshock Infinite ... Someone should probably ask Virtual
Programming, what kind of tooling they use to create such convoluted
ELF binaries."
So, even though it's in violation of the spec, these binaries exist in
the real world, and we cannot break them. :(
On Fri, Nov 11, 2022 at 3:38 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Tue, Nov 08, 2022 at 11:07:15AM +0000, Pedro Falcato wrote:
> > [...]
> > + * This tail logic is skippable if we're the last phdr, as
> > + * nothing can map an address >= our p_vaddr, since ELF phdr
> > + * PT_LOAD segments are required to be sorted in an increasing
> > + * order.
>
> I'm still looking through the patch, but I do want to call this bit out
> as a problem. The kernel cannot, unfortunately, make this assumption. See:
> https://lore.kernel.org/linux-fsdevel/YfOooXQ2ScpZLhmD@fractal.localdomain/
Ugh. I guess it doesn't matter in this situation? That logic only
matters if we're trying to fix this new loading bug, and old
executables load correctly with the old behavior anyway,
which is what you get if that logic falls through.
I don't know if this makes sense, but in my (possibly naive) opinion
we have a compromise to keep loading what could already be loaded,
without actually requiring to load broken ELFs 100% correctly.
We could of course also just sort the program headers at load time,
but I assume that's unwanted overhead for most well behaved ELF
program headers :)
On Fri, Nov 11, 2022 at 03:59:08AM +0000, Pedro Falcato wrote:
> We could of course also just sort the program headers at load time,
> but I assume that's unwanted overhead for most well behaved ELF
> program headers :)
Large refactoring of the ELF loader needs proper unit testing, and we're
still a bit away from that existing. In the meantime, we'll need to make
very very small changes to fix bugs. I've sent a minimal change which I
think should fix the problem (now at v2 since right after sending it I
realized I was trading one accidentally correct state for another in the
v1):
https://lore.kernel.org/linux-hardening/20221111061315.gonna.703-kees@kernel.org/
Hi Pedro,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on kees/for-next/pstore]
[also build test ERROR on kees/for-next/kspp ebiederm-user-namespace/for-next linus/master v6.1-rc4]
[cannot apply to kees/for-next/execve next-20221111]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Pedro-Falcato/fs-binfmt_elf-Fix-memsz-filesz-handling/20221108-190918
base: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/pstore
patch link: https://lore.kernel.org/r/20221108110715.227062-1-pedro.falcato%40gmail.com
patch subject: [PATCH v3] fs/binfmt_elf: Fix memsz > filesz handling
config: i386-randconfig-c001
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/95d3dfbe2432d0980b6a71d396b1d2cebcc378b0
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Pedro-Falcato/fs-binfmt_elf-Fix-memsz-filesz-handling/20221108-190918
git checkout 95d3dfbe2432d0980b6a71d396b1d2cebcc378b0
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
fs/binfmt_elf.c: In function 'load_elf_library':
>> fs/binfmt_elf.c:1480:13: error: too few arguments to function 'padzero'
1480 | if (padzero(elf_bss)) {
| ^~~~~~~
fs/binfmt_elf.c:117:12: note: declared here
117 | static int padzero(unsigned long elf_bss, unsigned long len)
| ^~~~~~~
vim +/padzero +1480 fs/binfmt_elf.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 1415
69369a7003735d Josh Triplett 2014-04-03 1416 #ifdef CONFIG_USELIB
^1da177e4c3f41 Linus Torvalds 2005-04-16 1417 /* This is really simpleminded and specialized - we are loading an
^1da177e4c3f41 Linus Torvalds 2005-04-16 1418 a.out library that is given an ELF header. */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1419 static int load_elf_library(struct file *file)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1420 {
^1da177e4c3f41 Linus Torvalds 2005-04-16 1421 struct elf_phdr *elf_phdata;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1422 struct elf_phdr *eppnt;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1423 unsigned long elf_bss, bss, len;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1424 int retval, error, i, j;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1425 struct elfhdr elf_ex;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1426
^1da177e4c3f41 Linus Torvalds 2005-04-16 1427 error = -ENOEXEC;
658c0335651185 Alexey Dobriyan 2019-12-04 1428 retval = elf_read(file, &elf_ex, sizeof(elf_ex), 0);
658c0335651185 Alexey Dobriyan 2019-12-04 1429 if (retval < 0)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1430 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1431
^1da177e4c3f41 Linus Torvalds 2005-04-16 1432 if (memcmp(elf_ex.e_ident, ELFMAG, SELFMAG) != 0)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1433 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1434
^1da177e4c3f41 Linus Torvalds 2005-04-16 1435 /* First of all, some simple consistency checks */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1436 if (elf_ex.e_type != ET_EXEC || elf_ex.e_phnum > 2 ||
72c2d531920048 Al Viro 2013-09-22 1437 !elf_check_arch(&elf_ex) || !file->f_op->mmap)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1438 goto out;
4755200b6b116d Nicolas Pitre 2017-08-16 1439 if (elf_check_fdpic(&elf_ex))
4755200b6b116d Nicolas Pitre 2017-08-16 1440 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1441
^1da177e4c3f41 Linus Torvalds 2005-04-16 1442 /* Now read in all of the header information */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1443
^1da177e4c3f41 Linus Torvalds 2005-04-16 1444 j = sizeof(struct elf_phdr) * elf_ex.e_phnum;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1445 /* j < ELF_MIN_ALIGN because elf_ex.e_phnum <= 2 */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1446
^1da177e4c3f41 Linus Torvalds 2005-04-16 1447 error = -ENOMEM;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1448 elf_phdata = kmalloc(j, GFP_KERNEL);
^1da177e4c3f41 Linus Torvalds 2005-04-16 1449 if (!elf_phdata)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1450 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1451
^1da177e4c3f41 Linus Torvalds 2005-04-16 1452 eppnt = elf_phdata;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1453 error = -ENOEXEC;
658c0335651185 Alexey Dobriyan 2019-12-04 1454 retval = elf_read(file, eppnt, j, elf_ex.e_phoff);
658c0335651185 Alexey Dobriyan 2019-12-04 1455 if (retval < 0)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1456 goto out_free_ph;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1457
^1da177e4c3f41 Linus Torvalds 2005-04-16 1458 for (j = 0, i = 0; i<elf_ex.e_phnum; i++)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1459 if ((eppnt + i)->p_type == PT_LOAD)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1460 j++;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1461 if (j != 1)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1462 goto out_free_ph;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1463
^1da177e4c3f41 Linus Torvalds 2005-04-16 1464 while (eppnt->p_type != PT_LOAD)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1465 eppnt++;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1466
^1da177e4c3f41 Linus Torvalds 2005-04-16 1467 /* Now use mmap to map the library into memory. */
6be5ceb02e98ea Linus Torvalds 2012-04-20 1468 error = vm_mmap(file,
^1da177e4c3f41 Linus Torvalds 2005-04-16 1469 ELF_PAGESTART(eppnt->p_vaddr),
^1da177e4c3f41 Linus Torvalds 2005-04-16 1470 (eppnt->p_filesz +
^1da177e4c3f41 Linus Torvalds 2005-04-16 1471 ELF_PAGEOFFSET(eppnt->p_vaddr)),
^1da177e4c3f41 Linus Torvalds 2005-04-16 1472 PROT_READ | PROT_WRITE | PROT_EXEC,
42be8b42535183 David Hildenbrand 2021-04-22 1473 MAP_FIXED_NOREPLACE | MAP_PRIVATE,
^1da177e4c3f41 Linus Torvalds 2005-04-16 1474 (eppnt->p_offset -
^1da177e4c3f41 Linus Torvalds 2005-04-16 1475 ELF_PAGEOFFSET(eppnt->p_vaddr)));
^1da177e4c3f41 Linus Torvalds 2005-04-16 1476 if (error != ELF_PAGESTART(eppnt->p_vaddr))
^1da177e4c3f41 Linus Torvalds 2005-04-16 1477 goto out_free_ph;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1478
^1da177e4c3f41 Linus Torvalds 2005-04-16 1479 elf_bss = eppnt->p_vaddr + eppnt->p_filesz;
^1da177e4c3f41 Linus Torvalds 2005-04-16 @1480 if (padzero(elf_bss)) {
^1da177e4c3f41 Linus Torvalds 2005-04-16 1481 error = -EFAULT;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1482 goto out_free_ph;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1483 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 1484
24962af7e1041b Oscar Salvador 2018-07-13 1485 len = ELF_PAGEALIGN(eppnt->p_filesz + eppnt->p_vaddr);
24962af7e1041b Oscar Salvador 2018-07-13 1486 bss = ELF_PAGEALIGN(eppnt->p_memsz + eppnt->p_vaddr);
ecc2bc8ac03884 Michal Hocko 2016-05-23 1487 if (bss > len) {
ecc2bc8ac03884 Michal Hocko 2016-05-23 1488 error = vm_brk(len, bss - len);
5d22fc25d4fc80 Linus Torvalds 2016-05-27 1489 if (error)
ecc2bc8ac03884 Michal Hocko 2016-05-23 1490 goto out_free_ph;
ecc2bc8ac03884 Michal Hocko 2016-05-23 1491 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 1492 error = 0;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1493
^1da177e4c3f41 Linus Torvalds 2005-04-16 1494 out_free_ph:
^1da177e4c3f41 Linus Torvalds 2005-04-16 1495 kfree(elf_phdata);
^1da177e4c3f41 Linus Torvalds 2005-04-16 1496 out:
^1da177e4c3f41 Linus Torvalds 2005-04-16 1497 return error;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1498 }
69369a7003735d Josh Triplett 2014-04-03 1499 #endif /* #ifdef CONFIG_USELIB */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1500
On Fri, Nov 11, 2022 at 6:15 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Fri, Nov 11, 2022 at 03:59:08AM +0000, Pedro Falcato wrote:
> > We could of course also just sort the program headers at load time,
> > but I assume that's unwanted overhead for most well behaved ELF
> > program headers :)
>
> Large refactoring of the ELF loader needs proper unit testing, and we're
> still a bit away from that existing. In the meantime, we'll need to make
> very very small changes to fix bugs. I've sent a minimal change which I
> think should fix the problem (now at v2 since right after sending it I
> realized I was trading one accidentally correct state for another in the
> v1):
> https://lore.kernel.org/linux-hardening/20221111061315.gonna.703-kees@kernel.org/
Got it. I understand you may be a bit nervous deploying this patch ATM.
What are we missing for ELF loader kunit testing? How can one help?
Note that my -v1 is still relatively safe and was already tested, you
could just apply that.
Thanks,
Pedro
Hi Pedro,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on kees/for-next/pstore]
[also build test ERROR on kees/for-next/kspp linus/master v6.1-rc4]
[cannot apply to kees/for-next/execve next-20221111]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Pedro-Falcato/fs-binfmt_elf-Fix-memsz-filesz-handling/20221108-190918
base: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/pstore
patch link: https://lore.kernel.org/r/20221108110715.227062-1-pedro.falcato%40gmail.com
patch subject: [PATCH v3] fs/binfmt_elf: Fix memsz > filesz handling
config: x86_64-randconfig-a005
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/95d3dfbe2432d0980b6a71d396b1d2cebcc378b0
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Pedro-Falcato/fs-binfmt_elf-Fix-memsz-filesz-handling/20221108-190918
git checkout 95d3dfbe2432d0980b6a71d396b1d2cebcc378b0
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
>> fs/binfmt_elf.c:1480:21: error: too few arguments to function call, expected 2, have 1
if (padzero(elf_bss)) {
~~~~~~~ ^
fs/binfmt_elf.c:117:12: note: 'padzero' declared here
static int padzero(unsigned long elf_bss, unsigned long len)
^
1 error generated.
vim +1480 fs/binfmt_elf.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 1415
69369a7003735d Josh Triplett 2014-04-03 1416 #ifdef CONFIG_USELIB
^1da177e4c3f41 Linus Torvalds 2005-04-16 1417 /* This is really simpleminded and specialized - we are loading an
^1da177e4c3f41 Linus Torvalds 2005-04-16 1418 a.out library that is given an ELF header. */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1419 static int load_elf_library(struct file *file)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1420 {
^1da177e4c3f41 Linus Torvalds 2005-04-16 1421 struct elf_phdr *elf_phdata;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1422 struct elf_phdr *eppnt;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1423 unsigned long elf_bss, bss, len;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1424 int retval, error, i, j;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1425 struct elfhdr elf_ex;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1426
^1da177e4c3f41 Linus Torvalds 2005-04-16 1427 error = -ENOEXEC;
658c0335651185 Alexey Dobriyan 2019-12-04 1428 retval = elf_read(file, &elf_ex, sizeof(elf_ex), 0);
658c0335651185 Alexey Dobriyan 2019-12-04 1429 if (retval < 0)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1430 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1431
^1da177e4c3f41 Linus Torvalds 2005-04-16 1432 if (memcmp(elf_ex.e_ident, ELFMAG, SELFMAG) != 0)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1433 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1434
^1da177e4c3f41 Linus Torvalds 2005-04-16 1435 /* First of all, some simple consistency checks */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1436 if (elf_ex.e_type != ET_EXEC || elf_ex.e_phnum > 2 ||
72c2d531920048 Al Viro 2013-09-22 1437 !elf_check_arch(&elf_ex) || !file->f_op->mmap)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1438 goto out;
4755200b6b116d Nicolas Pitre 2017-08-16 1439 if (elf_check_fdpic(&elf_ex))
4755200b6b116d Nicolas Pitre 2017-08-16 1440 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1441
^1da177e4c3f41 Linus Torvalds 2005-04-16 1442 /* Now read in all of the header information */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1443
^1da177e4c3f41 Linus Torvalds 2005-04-16 1444 j = sizeof(struct elf_phdr) * elf_ex.e_phnum;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1445 /* j < ELF_MIN_ALIGN because elf_ex.e_phnum <= 2 */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1446
^1da177e4c3f41 Linus Torvalds 2005-04-16 1447 error = -ENOMEM;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1448 elf_phdata = kmalloc(j, GFP_KERNEL);
^1da177e4c3f41 Linus Torvalds 2005-04-16 1449 if (!elf_phdata)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1450 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1451
^1da177e4c3f41 Linus Torvalds 2005-04-16 1452 eppnt = elf_phdata;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1453 error = -ENOEXEC;
658c0335651185 Alexey Dobriyan 2019-12-04 1454 retval = elf_read(file, eppnt, j, elf_ex.e_phoff);
658c0335651185 Alexey Dobriyan 2019-12-04 1455 if (retval < 0)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1456 goto out_free_ph;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1457
^1da177e4c3f41 Linus Torvalds 2005-04-16 1458 for (j = 0, i = 0; i<elf_ex.e_phnum; i++)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1459 if ((eppnt + i)->p_type == PT_LOAD)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1460 j++;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1461 if (j != 1)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1462 goto out_free_ph;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1463
^1da177e4c3f41 Linus Torvalds 2005-04-16 1464 while (eppnt->p_type != PT_LOAD)
^1da177e4c3f41 Linus Torvalds 2005-04-16 1465 eppnt++;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1466
^1da177e4c3f41 Linus Torvalds 2005-04-16 1467 /* Now use mmap to map the library into memory. */
6be5ceb02e98ea Linus Torvalds 2012-04-20 1468 error = vm_mmap(file,
^1da177e4c3f41 Linus Torvalds 2005-04-16 1469 ELF_PAGESTART(eppnt->p_vaddr),
^1da177e4c3f41 Linus Torvalds 2005-04-16 1470 (eppnt->p_filesz +
^1da177e4c3f41 Linus Torvalds 2005-04-16 1471 ELF_PAGEOFFSET(eppnt->p_vaddr)),
^1da177e4c3f41 Linus Torvalds 2005-04-16 1472 PROT_READ | PROT_WRITE | PROT_EXEC,
42be8b42535183 David Hildenbrand 2021-04-22 1473 MAP_FIXED_NOREPLACE | MAP_PRIVATE,
^1da177e4c3f41 Linus Torvalds 2005-04-16 1474 (eppnt->p_offset -
^1da177e4c3f41 Linus Torvalds 2005-04-16 1475 ELF_PAGEOFFSET(eppnt->p_vaddr)));
^1da177e4c3f41 Linus Torvalds 2005-04-16 1476 if (error != ELF_PAGESTART(eppnt->p_vaddr))
^1da177e4c3f41 Linus Torvalds 2005-04-16 1477 goto out_free_ph;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1478
^1da177e4c3f41 Linus Torvalds 2005-04-16 1479 elf_bss = eppnt->p_vaddr + eppnt->p_filesz;
^1da177e4c3f41 Linus Torvalds 2005-04-16 @1480 if (padzero(elf_bss)) {
^1da177e4c3f41 Linus Torvalds 2005-04-16 1481 error = -EFAULT;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1482 goto out_free_ph;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1483 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 1484
24962af7e1041b Oscar Salvador 2018-07-13 1485 len = ELF_PAGEALIGN(eppnt->p_filesz + eppnt->p_vaddr);
24962af7e1041b Oscar Salvador 2018-07-13 1486 bss = ELF_PAGEALIGN(eppnt->p_memsz + eppnt->p_vaddr);
ecc2bc8ac03884 Michal Hocko 2016-05-23 1487 if (bss > len) {
ecc2bc8ac03884 Michal Hocko 2016-05-23 1488 error = vm_brk(len, bss - len);
5d22fc25d4fc80 Linus Torvalds 2016-05-27 1489 if (error)
ecc2bc8ac03884 Michal Hocko 2016-05-23 1490 goto out_free_ph;
ecc2bc8ac03884 Michal Hocko 2016-05-23 1491 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 1492 error = 0;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1493
^1da177e4c3f41 Linus Torvalds 2005-04-16 1494 out_free_ph:
^1da177e4c3f41 Linus Torvalds 2005-04-16 1495 kfree(elf_phdata);
^1da177e4c3f41 Linus Torvalds 2005-04-16 1496 out:
^1da177e4c3f41 Linus Torvalds 2005-04-16 1497 return error;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1498 }
69369a7003735d Josh Triplett 2014-04-03 1499 #endif /* #ifdef CONFIG_USELIB */
^1da177e4c3f41 Linus Torvalds 2005-04-16 1500
On Fri, Nov 11, 2022 at 05:14:47PM +0000, Pedro Falcato wrote:
> On Fri, Nov 11, 2022 at 6:15 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Fri, Nov 11, 2022 at 03:59:08AM +0000, Pedro Falcato wrote:
> > > We could of course also just sort the program headers at load time,
> > > but I assume that's unwanted overhead for most well behaved ELF
> > > program headers :)
> >
> > Large refactoring of the ELF loader needs proper unit testing, and we're
> > still a bit away from that existing. In the meantime, we'll need to make
> > very very small changes to fix bugs. I've sent a minimal change which I
> > think should fix the problem (now at v2 since right after sending it I
> > realized I was trading one accidentally correct state for another in the
> > v1):
> > https://lore.kernel.org/linux-hardening/20221111061315.gonna.703-kees@kernel.org/
> Got it. I understand you may be a bit nervous deploying this patch ATM.
>
> What are we missing for ELF loader kunit testing? How can one help?
>
> Note that my -v1 is still relatively safe and was already tested, you
> could just apply that.
Even the v1 is a LOT of refactoring. I'd like to avoid any factoring
like this as much as possible given how fragile the code has proven to
be.
As for unit testing, we need two prerequisites:
- mocking:
https://lore.kernel.org/lkml/20220910212804.670622-1-davidgow@google.com/
- userspace VMA support:
https://lore.kernel.org/lkml/202211061948.46D3F78@keescook/
@@ -109,37 +109,20 @@ static struct linux_binfmt elf_format = {
#define BAD_ADDR(x) (unlikely((unsigned long)(x) >= TASK_SIZE))
-static int set_brk(unsigned long start, unsigned long end, int prot)
-{
- start = ELF_PAGEALIGN(start);
- end = ELF_PAGEALIGN(end);
- if (end > start) {
- /*
- * Map the last of the bss segment.
- * If the header is requesting these pages to be
- * executable, honour that (ppc32 needs this).
- */
- int error = vm_brk_flags(start, end - start,
- prot & PROT_EXEC ? VM_EXEC : 0);
- if (error)
- return error;
- }
- current->mm->start_brk = current->mm->brk = end;
- return 0;
-}
-
/* We need to explicitly zero any fractional pages
after the data section (i.e. bss). This would
contain the junk from the file that should not
be in memory
*/
-static int padzero(unsigned long elf_bss)
+static int padzero(unsigned long elf_bss, unsigned long len)
{
unsigned long nbyte;
nbyte = ELF_PAGEOFFSET(elf_bss);
if (nbyte) {
nbyte = ELF_MIN_ALIGN - nbyte;
+ if (nbyte > len)
+ nbyte = len;
if (clear_user((void __user *) elf_bss, nbyte))
return -EFAULT;
}
@@ -584,6 +567,138 @@ static inline int make_prot(u32 p_flags, struct arch_elf_state *arch_state,
return arch_elf_adjust_prot(prot, arch_state, has_interp, is_interp);
}
+static int map_bss_anon(unsigned long start, unsigned long end, int prot)
+{
+ start = ELF_PAGEALIGN(start);
+ end = ELF_PAGEALIGN(end);
+
+ return (end > start ? vm_brk_flags(start, end - start,
+ prot & PROT_EXEC ? VM_EXEC : 0) : 0);
+}
+
+static void set_brk(unsigned long end)
+{
+ current->mm->start_brk = current->mm->brk = ELF_PAGEALIGN(end);
+}
+
+static int handle_bss_clearing(struct elf_phdr *elf_phdata,
+ struct elfhdr *elf_ex, unsigned long load_bias,
+ int is_interp)
+{
+ int i;
+ struct elf_phdr *elf_ppnt;
+ int retval = 0;
+ /*
+ * Do explicit bss clearing after loading everything.
+ * Why? Because we may get odd phdr layouts like:
+ * PHDR [0x1000, 0x1200] p_filesz 0 p_memsz 0x200 [.bss]
+ * PHDR [0x1200, 0x1300] p_filesz 0x100 p_memsz 0x200 [.data]
+ *
+ * If we mindlessly clear and map, we'll get loading bugs where we map
+ * anon bss and do our explicit clearing, then map [.data], which
+ * elf_map ends up mapping over our .bss, which effectively undoes the
+ * clear!
+ */
+ for (i = 0, elf_ppnt = elf_phdata;
+ i < elf_ex->e_phnum; i++, elf_ppnt++) {
+
+ unsigned long addr;
+ unsigned int n;
+ unsigned long end;
+ struct elf_phdr *next = NULL;
+ int j;
+
+ if (elf_ppnt->p_type != PT_LOAD)
+ continue;
+
+ if (elf_ppnt->p_memsz <= elf_ppnt->p_filesz)
+ continue;
+
+ addr = elf_ppnt->p_vaddr + load_bias + elf_ppnt->p_filesz;
+ n = ELF_PAGEOFFSET(addr);
+ end = elf_ppnt->p_vaddr + load_bias + elf_ppnt->p_memsz;
+
+ /*
+ * Find the next PT_LOAD in the list, if it exists.
+ */
+ for (j = i + 1; j < elf_ex->e_phnum; j++) {
+ struct elf_phdr *phdr = elf_ppnt + j - i;
+
+ if (phdr->p_type == PT_LOAD) {
+ next = phdr;
+ break;
+ }
+ }
+
+
+ /*
+ * Check if we need to zero the starting bss page, explicitly.
+ */
+ if (n != 0) {
+ unsigned long bss_len =
+ elf_ppnt->p_memsz - elf_ppnt->p_filesz;
+ unsigned long len;
+
+ if (next)
+ len = bss_len;
+ else
+ len = ELF_MIN_ALIGN;
+
+ /*
+ * Note: padzero will zero either
+ * [addr, end of addr_page] or [addr, addr + len],
+ * whatever comes first. The old bss behavior zeroed
+ * the whole last page *only* if it was the last bss
+ * segment, else it would only zero bss_len bytes.
+ * It turns out ld-linux relies on this to not crash,
+ * so we follow the old behavior.
+ */
+ retval = padzero(addr, len);
+
+ if (retval < 0)
+ return retval;
+ }
+
+ /*
+ * brk is defined with regards to the main program, not the
+ * interpreter.
+ */
+ if (!is_interp)
+ set_brk(end);
+
+ n = ELF_PAGEOFFSET(end);
+
+ /*
+ *
+ * The tail logic is a bit more complex. We need to re-clear
+ * the tail if something was mapped over that last page.
+ * How do we do it?
+ * We check if the next phdr maps this page.
+ * If so, we know there was a mmap MAP_FIXED over it, so
+ * re-clear it.
+ *
+ * This tail logic is skippable if we're the last phdr, as
+ * nothing can map an address >= our p_vaddr, since ELF phdr
+ * PT_LOAD segments are required to be sorted in an increasing
+ * order.
+ */
+ if (n == 0 || !next)
+ continue;
+
+ end = ELF_PAGESTART(end);
+
+ if (end == ELF_PAGESTART(next->p_vaddr + load_bias)) {
+ /*
+ * Re-clear the last n bytes
+ */
+ if (clear_user((void __user *) end, n))
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
/* This is much more generalized than the library routine read function,
so we keep this separate. Technically the library read function
is only provided so that we can read a.out libraries that have
@@ -597,8 +712,6 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
struct elf_phdr *eppnt;
unsigned long load_addr = 0;
int load_addr_set = 0;
- unsigned long last_bss = 0, elf_bss = 0;
- int bss_prot = 0;
unsigned long error = ~0UL;
unsigned long total_size;
int i;
@@ -662,49 +775,25 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
goto out;
}
- /*
- * Find the end of the file mapping for this phdr, and
- * keep track of the largest address we see for this.
- */
- k = load_addr + eppnt->p_vaddr + eppnt->p_filesz;
- if (k > elf_bss)
- elf_bss = k;
+ if (eppnt->p_memsz > eppnt->p_filesz) {
+ /*
+ * Handle BSS zeroing and mapping
+ */
+ unsigned long start = load_addr + vaddr + eppnt->p_filesz;
+ unsigned long end = load_addr + vaddr + eppnt->p_memsz;
- /*
- * Do the same thing for the memory mapping - between
- * elf_bss and last_bss is the bss section.
- */
- k = load_addr + eppnt->p_vaddr + eppnt->p_memsz;
- if (k > last_bss) {
- last_bss = k;
- bss_prot = elf_prot;
+ error = map_bss_anon(start, end, elf_prot);
+
+ if (error < 0)
+ goto out;
}
}
}
- /*
- * Now fill out the bss section: first pad the last page from
- * the file up to the page boundary, and zero it from elf_bss
- * up to the end of the page.
- */
- if (padzero(elf_bss)) {
- error = -EFAULT;
+ error = handle_bss_clearing(interp_elf_phdata, interp_elf_ex,
+ load_addr, 1);
+ if (error < 0)
goto out;
- }
- /*
- * Next, align both the file and mem bss up to the page size,
- * since this is where elf_bss was just zeroed up to, and where
- * last_bss will end after the vm_brk_flags() below.
- */
- elf_bss = ELF_PAGEALIGN(elf_bss);
- last_bss = ELF_PAGEALIGN(last_bss);
- /* Finally, if there is still more bss to allocate, do it. */
- if (last_bss > elf_bss) {
- error = vm_brk_flags(elf_bss, last_bss - elf_bss,
- bss_prot & PROT_EXEC ? VM_EXEC : 0);
- if (error)
- goto out;
- }
error = load_addr;
out:
@@ -829,8 +918,6 @@ static int load_elf_binary(struct linux_binprm *bprm)
unsigned long error;
struct elf_phdr *elf_ppnt, *elf_phdata, *interp_elf_phdata = NULL;
struct elf_phdr *elf_property_phdata = NULL;
- unsigned long elf_bss, elf_brk;
- int bss_prot = 0;
int retval, i;
unsigned long elf_entry;
unsigned long e_entry;
@@ -1020,9 +1107,6 @@ static int load_elf_binary(struct linux_binprm *bprm)
executable_stack);
if (retval < 0)
goto out_free_dentry;
-
- elf_bss = 0;
- elf_brk = 0;
start_code = ~0UL;
end_code = 0;
@@ -1041,33 +1125,6 @@ static int load_elf_binary(struct linux_binprm *bprm)
if (elf_ppnt->p_type != PT_LOAD)
continue;
- if (unlikely (elf_brk > elf_bss)) {
- unsigned long nbyte;
-
- /* There was a PT_LOAD segment with p_memsz > p_filesz
- before this one. Map anonymous pages, if needed,
- and clear the area. */
- retval = set_brk(elf_bss + load_bias,
- elf_brk + load_bias,
- bss_prot);
- if (retval)
- goto out_free_dentry;
- nbyte = ELF_PAGEOFFSET(elf_bss);
- if (nbyte) {
- nbyte = ELF_MIN_ALIGN - nbyte;
- if (nbyte > elf_brk - elf_bss)
- nbyte = elf_brk - elf_bss;
- if (clear_user((void __user *)elf_bss +
- load_bias, nbyte)) {
- /*
- * This bss-zeroing can fail if the ELF
- * file specifies odd protections. So
- * we don't check the return value
- */
- }
- }
- }
-
elf_prot = make_prot(elf_ppnt->p_flags, &arch_state,
!!interpreter, false);
@@ -1211,41 +1268,35 @@ static int load_elf_binary(struct linux_binprm *bprm)
k = elf_ppnt->p_vaddr + elf_ppnt->p_filesz;
- if (k > elf_bss)
- elf_bss = k;
+
+ if (elf_ppnt->p_memsz > elf_ppnt->p_filesz) {
+ unsigned long seg_end = elf_ppnt->p_vaddr +
+ elf_ppnt->p_memsz + load_bias;
+ retval = map_bss_anon(k + load_bias,
+ seg_end,
+ elf_prot);
+ if (retval)
+ goto out_free_dentry;
+ }
+
if ((elf_ppnt->p_flags & PF_X) && end_code < k)
end_code = k;
if (end_data < k)
end_data = k;
- k = elf_ppnt->p_vaddr + elf_ppnt->p_memsz;
- if (k > elf_brk) {
- bss_prot = elf_prot;
- elf_brk = k;
- }
}
+ retval = handle_bss_clearing(elf_phdata, elf_ex, load_bias, 0);
+
+ if (retval < 0)
+ goto out_free_dentry;
+
e_entry = elf_ex->e_entry + load_bias;
phdr_addr += load_bias;
- elf_bss += load_bias;
- elf_brk += load_bias;
start_code += load_bias;
end_code += load_bias;
start_data += load_bias;
end_data += load_bias;
- /* Calling set_brk effectively mmaps the pages that we need
- * for the bss and break sections. We must do this before
- * mapping in the interpreter, to make sure it doesn't wind
- * up getting placed where the bss needs to go.
- */
- retval = set_brk(elf_bss, elf_brk, bss_prot);
- if (retval)
- goto out_free_dentry;
- if (likely(elf_bss != elf_brk) && unlikely(padzero(elf_bss))) {
- retval = -EFAULT; /* Nobody gets to see this, but.. */
- goto out_free_dentry;
- }
-
if (interpreter) {
elf_entry = load_elf_interp(interp_elf_ex,
interpreter,