ld: Fix segfault in populate_publics_stream

Message ID 20221127023840.32080-1-mark@harmstone.com
State Accepted
Headers
Series ld: Fix segfault in populate_publics_stream |

Checks

Context Check Description
snail/binutils-gdb-check success Github commit url

Commit Message

Mark Harmstone Nov. 27, 2022, 2:38 a.m. UTC
  ---
 ld/pdb.c | 3 +++
 1 file changed, 3 insertions(+)
  

Comments

Jan Beulich Nov. 28, 2022, 2:54 p.m. UTC | #1
On 27.11.2022 03:38, Mark Harmstone wrote:
> --- a/ld/pdb.c
> +++ b/ld/pdb.c
> @@ -1413,6 +1413,9 @@ populate_publics_stream (bfd *stream, bfd *abfd, bfd *sym_rec_stream)

Out of curiosity - which tree was this diff generated against? The
line number here looks to be off by several hundred from what I
see in the repo right now.

>    for (bfd *in = coff_data (abfd)->link_info->input_bfds; in;
>         in = in->link.next)
>      {
> +      if (!in->outsymbols)
> +	continue;
> +
>        for (unsigned int i = 0; i < in->symcount; i++)
>  	{
>  	  struct bfd_symbol *sym = in->outsymbols[i];

Why / when would in->outsymbols be NULL but in->symcount be non-zero?
And if that was possible, why would it not also be possible that the
array is smaller than in->symcount? (This is the kind of questions
which arise when there's no description at all for a patch. Such a
description could have clarified under what special conditions a NULL
deref could happen despite it not being obviously possible.)

Jan
  
Mark Harmstone Nov. 28, 2022, 5:53 p.m. UTC | #2
> Out of curiosity - which tree was this diff generated against? The
 > line number here looks to be off by several hundred from what I
 > see in the repo right now.

This is inteded to be applied with the other patches in the series, the ones
beginning with "[PATCH v2] ld: Generate PDB string table" at
https://sourceware.org/pipermail/binutils/2022-November/thread.html.

Sorry, this probably wasn't obvious from a mail client. I've not numbered
them as I'm not sure how many more there'll be, and if I wait for the previous
patches to be accepted before submitting the next, I'll almost certainly miss
the cut-off for the code freeze.

 > Why / when would in->outsymbols be NULL but in->symcount be non-zero?

Try running the test in the DEBUG_S_LINES patch without this one - it'll fail
because ld segfaults. outsymbols doesn't get set from within generate_reloc
for the second object file, as it only has one non-loadable section. The
"symbols" come from the .equs I'm using like #defines.

 > And if that was possible, why would it not also be possible that the
 > array is smaller than in->symcount?

bfd_generic_link_read_symbols is called for each loadable section, and
allocates the outsymbols array once. It was my mistake when I submitted my
original patch for populate_publics_stream, in not realizing that it would
break for object files without any loadable sections.

Mark
  
Jan Beulich Nov. 29, 2022, 9 a.m. UTC | #3
On 28.11.2022 18:53, Mark Harmstone wrote:
>  > Out of curiosity - which tree was this diff generated against? The
>  > line number here looks to be off by several hundred from what I
>  > see in the repo right now.
> 
> This is inteded to be applied with the other patches in the series, the ones
> beginning with "[PATCH v2] ld: Generate PDB string table" at
> https://sourceware.org/pipermail/binutils/2022-November/thread.html.
> 
> Sorry, this probably wasn't obvious from a mail client. I've not numbered
> them as I'm not sure how many more there'll be, and if I wait for the previous
> patches to be accepted before submitting the next, I'll almost certainly miss
> the cut-off for the code freeze.

Such dependencies, if not otherwise obvious (like in a properly threaded
and numbered series) need calling out in a post-commit-message remark.

>  > Why / when would in->outsymbols be NULL but in->symcount be non-zero?
> 
> Try running the test in the DEBUG_S_LINES patch without this one - it'll fail
> because ld segfaults. outsymbols doesn't get set from within generate_reloc
> for the second object file, as it only has one non-loadable section. The
> "symbols" come from the .equs I'm using like #defines.

And then why would such symbols not need emitting debug info for? What
you say above and ...

>  > And if that was possible, why would it not also be possible that the
>  > array is smaller than in->symcount?
> 
> bfd_generic_link_read_symbols is called for each loadable section, and
> allocates the outsymbols array once. It was my mistake when I submitted my
> original patch for populate_publics_stream, in not realizing that it would
> break for object files without any loadable sections.

... here makes me think that assuming it is the right thing to do, it
wants not only properly describing in the patch, but should actually be
accompanied by a code comment.

Jan
  
Mark Harmstone Nov. 29, 2022, 5:47 p.m. UTC | #4
> it ... should actually be accompanied by a code comment.

Not really, as it was a straightforward mistake in my original patch. If I'd
included a NULL check to begin with, I doubt anyone would have batted an
eyelid.

I could have easily worked around this in my test, either by adding in a dummy
section or by using the preprocessor rather than .equ, but figured that any
way of causing a segfault is a bug that needs to be fixed.

Mark
  
Jan Beulich Nov. 30, 2022, 7 a.m. UTC | #5
On 29.11.2022 18:47, Mark Harmstone wrote:
>  > it ... should actually be accompanied by a code comment.
> 
> Not really, as it was a straightforward mistake in my original patch. If I'd
> included a NULL check to begin with, I doubt anyone would have batted an
> eyelid.

Maybe. Depends on how close a review would have been done.

> I could have easily worked around this in my test, either by adding in a dummy
> section or by using the preprocessor rather than .equ, but figured that any
> way of causing a segfault is a bug that needs to be fixed.

Of course. 

Jan
  

Patch

diff --git a/ld/pdb.c b/ld/pdb.c
index d133b3e1aaa..42bb1b3a91b 100644
--- a/ld/pdb.c
+++ b/ld/pdb.c
@@ -1413,6 +1413,9 @@  populate_publics_stream (bfd *stream, bfd *abfd, bfd *sym_rec_stream)
   for (bfd *in = coff_data (abfd)->link_info->input_bfds; in;
        in = in->link.next)
     {
+      if (!in->outsymbols)
+	continue;
+
       for (unsigned int i = 0; i < in->symcount; i++)
 	{
 	  struct bfd_symbol *sym = in->outsymbols[i];