Fix assertion for unwind-dw2-fde.c btree changes

Message ID 2TMB4YQOP1E1R.2QLW7HCD1NVF3@8pit.net
State Accepted
Headers
Series Fix assertion for unwind-dw2-fde.c btree changes |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Sören Tempel May 14, 2023, 4:09 p.m. UTC
  Dear Thomas Neumann,

I am contacting you as the author of the commit with commit hash
6e80a1d164d1f996ad08a512c000025a7c2ca893 [1] in the GCC repository. In
this commit, you refactored the __register_frame/__deregister_frame
implementation in GCC. The commit is part of the GCC 13 release.
While upgrading the GCC version in Alpine Linux from GCC 12 to GCC 13
we ran into a regression introduced by these changes. The regression
manifests itself in a failing assertion in __deregister_frame_info_bases.
The assertion failure was observed while using Chromium's `flatc` build
system tool. The failing assertion is:

	unwind-dw2-fde.c:281	gcc_assert (in_shutdown || ob);

The assertion fails for us because ob is a null pointer and in_shutdown
is zero. The backtrace for the assertion failure looks as follows:

	#0  __restore_sigs (set=set@entry=0x7fffffffe050) at ./arch/x86_64/syscall_arch.h:40
	#1  0x00007ffff7facea5 in raise (sig=sig@entry=6) at src/signal/raise.c:11
	#2  0x00007ffff7f7ffa8 in abort () at src/exit/abort.c:11
	#3  0x00007ffff7f3d411 in __deregister_frame_info_bases (begin=0x55555557ef58)
	    at /home/buildozer/aports/main/gcc/src/gcc-13-20230506/libgcc/unwind-dw2-fde.c:281
	#4  __deregister_frame_info_bases (begin=0x55555557ef58)
	    at /home/buildozer/aports/main/gcc/src/gcc-13-20230506/libgcc/unwind-dw2-fde.c:219
	#5  0x0000555555580072 in __do_global_dtors_aux ()
	#6  0x000000185eb03fee in ?? ()
	#7  0x00007ffff7fc1ad6 in __libc_exit_fini () at ldso/dynlink.c:1453
	#8  0x00007ffff7f78082 in exit (code=1) at src/exit/exit.c:30
	#9  0x00005555555802a6 in Error(flatbuffers::FlatCompiler const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool) ()
	#10 0x000055555560f952 in flatbuffers::FlatCompiler::ParseFromCommandLineArguments(int, char const**) ()
	#11 0x0000555555581b42 in main (

I noticed that you already pushed a fixup for the aforementioned
assertion in commit 386ebf75f4c0342b1f823f4e4aba07abda3288d1 [2].
However, I believe there is one more edge case that isn't being account
for presently: If the inserted entry has a size of 0 (i.e. if range[1] -
range[0] == 0) then the btree_insert call in __register_frame_info_bases
will not insert anything. This is not accounted for in
__deregister_frame_info_bases as it is assumed that the prior
btree_insert call in __register_frame_info_bases always inserted
something into the lookup data structure.

Based on the information contained in the backtrace shown above, this
behavior can be observed in the following gdb debug session:

	(gdb) break __register_frame_info_bases if begin==0x55555557ef58
	(gdb) run
	Breakpoint 11.1, __register_frame_info_bases (begin=0x55555557ef58, ob=0x555555907f60 <object>, tbase=0x0, dbase=0x0)
	    at /home/buildozer/aports/main/gcc/src/gcc-13-20230506/libgcc/unwind-dw2-fde.c:111
	(gdb) break btree_insert
	(gdb) cont
	Continuing.
	Breakpoint 12, btree_insert (base=0, size=0, ob=0x555555907f60 <object>, t=0x7ffff7f5d290 <registered_frames>)
		at /home/buildozer/aports/main/gcc/src/gcc-13-20230506/libgcc/unwind-dw2-btree.h:726
	726       if (!size)
	(gdb) p size
	$1 = 0
	(gdb) next
	727         return false;

From the above debug output, we can deduce that nothing was inserted into
the lookup data structure for the frame beginning at 0x55555557ef58
because the size of the range is zero. If we set at breakpoint in
__deregister_frame_info_bases for the same frame we can observe the
following:

	(gdb) break __deregister_frame_info_bases if begin==0x55555557ef58
	Continuing.
	/home/buildozer/aports/community/chromium/src/chromium-113.0.5672.92/out/bld/flatc:
	Breakpoint 13.1, __deregister_frame_info_bases (begin=0x55555557ef58)
	    at /home/buildozer/aports/main/gcc/src/gcc-13-20230506/libgcc/unwind-dw2-fde.c:220
	(gdb) break unwind-dw2-fde.c:242
	(gdb) cont
	242       ob = btree_remove (&registered_frames, range[0]);
	(gdb) p range
	$2 = {0, 0}
	(gdb) next
	(gdb) p ob
	$3 = 0x0

Naturally, since nothing was inserted into the lookup data structure for
this frame, btree_remove isn't able to remove anything and returns a
null pointer for ob. This then causes the aforementioned assertion
failure.

A git-format-patch(1) for the assertion is attached, which adds handling
for the edge case that nothing was inserted via btree_insert in
__register_frame_info_bases to __deregister_frame_info_bases.

Would be cool if this could be fixed on the GCC trunk.

Greetings
Sören Tempel

[1]: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=6e80a1d164d1f996ad08a512c000025a7c2ca893
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=386ebf75f4c0342b1f823f4e4aba07abda3288d1
From 6dc56564cad69a26595cc38956355e5be7d2c2b0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?S=C3=B6ren=20Tempel?= <soeren+git@soeren-tempel.net>
Date: Sun, 14 May 2023 19:30:21 +0200
Subject: [PATCH] fix assert in __deregister_frame_info_bases
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The assertion in __deregister_frame_info_bases assumes that for every
frame something was inserted into the lookup data structure by
__register_frame_info_bases. Unfortunately, this does not necessarily
hold true as the btree_insert call in __register_frame_info_bases will
not insert anything for empty ranges. Therefore, we need to explicitly
account for such empty ranges in the assertion as `ob` will be a null
pointer for such ranges, hence causing the assertion to fail.

Signed-off-by: Sören Tempel <soeren@soeren-tempel.net>
---
 libgcc/unwind-dw2-fde.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
  

Comments

Thomas Neumann May 14, 2023, 6:59 p.m. UTC | #1
Dear Sören,

> we ran into a regression introduced by these changes. The regression
> manifests itself in a failing assertion in __deregister_frame_info_bases.
> The assertion failure was observed while using Chromium's `flatc` build
> system tool. The failing assertion is:
> 
> 	unwind-dw2-fde.c:281	gcc_assert (in_shutdown || ob);
> [snip]
> However, I believe there is one more edge case that isn't being account
> for presently: If the inserted entry has a size of 0 (i.e. if range[1] -
> range[0] == 0) then the btree_insert call in __register_frame_info_bases
> will not insert anything. This is not accounted for in
 > [snip]
> 
> Would be cool if this could be fixed on the GCC trunk.

thanks for the details analysis and the patch, it looks obviously 
correct for me. I can apply it to trunk, but we need approval from a gcc 
maintainer first.

But independent of your patch, do you have the test case available in 
some easily accessible form, for example a docker image or an automated 
build script? I ask because something odd is happening here, somebody 
registered a non-empty EH that does not contain a single unwind range. I 
am puzzled why anybody would do that, I would like to double check that 
this is indeed the intended behavior and not a bug somewhere else. Or if 
you have the test case at hand, it would be great if you could do a 
quick step through get_pc_range for the affected frame to double-check 
that the table is indeed empty and we don't miss an entry for some 
strange reason.

Best

Thomas
  
alice May 14, 2023, 7:52 p.m. UTC | #2
On Sun May 14, 2023 at 8:59 PM CEST, Thomas Neumann wrote:
> Dear Sören,
>
> > we ran into a regression introduced by these changes. The regression
> > manifests itself in a failing assertion in __deregister_frame_info_bases.
> > The assertion failure was observed while using Chromium's `flatc` build
> > system tool. The failing assertion is:
> > 
> > 	unwind-dw2-fde.c:281	gcc_assert (in_shutdown || ob);
> > [snip]
> > However, I believe there is one more edge case that isn't being account
> > for presently: If the inserted entry has a size of 0 (i.e. if range[1] -
> > range[0] == 0) then the btree_insert call in __register_frame_info_bases
> > will not insert anything. This is not accounted for in
>  > [snip]
> > 
> > Would be cool if this could be fixed on the GCC trunk.
>
> thanks for the details analysis and the patch, it looks obviously 
> correct for me. I can apply it to trunk, but we need approval from a gcc 
> maintainer first.
>
> But independent of your patch, do you have the test case available in 
> some easily accessible form, for example a docker image or an automated 
> build script? I ask because something odd is happening here, somebody 
> registered a non-empty EH that does not contain a single unwind range. I 
> am puzzled why anybody would do that, I would like to double check that 
> this is indeed the intended behavior and not a bug somewhere else. Or if 
> you have the test case at hand, it would be great if you could do a 
> quick step through get_pc_range for the affected frame to double-check 
> that the table is indeed empty and we don't miss an entry for some 
> strange reason.

sadly, the only instance of this breakage that i found is extremely contrived,
and involves the chromium codebase, and the flatc build in it..

even building the flatbuffers repo externally using cmake at the same revision
didn't reproduce it.

that said, i have attached a dockerfile that shows you what /does/ fail, under
the specific conditions. but there is no unpatched libgcc_s, so you'll have
to do that part yourself, and build a libgcc_s.so.1 without the patch in this
thread.

>
> Best
>
> Thomas
# syntax=docker/dockerfile:1
FROM alpine:edge

RUN --mount=type=cache,target=/etc/apk/cache apk add -u alpine-sdk
RUN git clone --depth=1 https://github.com/alpinelinux/aports

WORKDIR aports/community/chromium
RUN --mount=type=cache,target=/etc/apk/cache \
  abuild -F fetch deps unpack prepare

# this does /not/ fail with
# gcc 12 libgcc
# gcc 13 libgcc installed above, which has the patch from this email thread.
# to reproduce it, you have to build a libgcc without the patch and put it in
# /usr/lib/libgcc_s.so.1
# specifically, the ./flatc invocation 'exits' 'normally', but on running
# dtors(?) it will hit the assertion
RUN cd src/chromium-113.0.5672.92/out/bld \
  && ninja flatc \
  && ./flatc --help
  
Thomas Neumann May 14, 2023, 10:35 p.m. UTC | #3
> even building the flatbuffers repo externally using cmake at the same revision
> didn't reproduce it.
> 
> that said, i have attached a dockerfile that shows you what /does/ fail, under
> the specific conditions. but there is no unpatched libgcc_s, so you'll have
> to do that part yourself, and build a libgcc_s.so.1 without the patch in this
> thread.

thanks for the dockerfile, I could reproduce the problem. For some 
strange reason the program really tried to register a frame table 
without entries. Perhaps that is caused by musl, I could not find the 
root cause for that. But nevertheless, fixing the assert is the right 
thing to do, we must ignore empty tables.

Best

Thomas
  
alice May 14, 2023, 11:09 p.m. UTC | #4
On Mon May 15, 2023 at 12:35 AM CEST, Thomas Neumann wrote:
> > even building the flatbuffers repo externally using cmake at the same revision
> > didn't reproduce it.
> > 
> > that said, i have attached a dockerfile that shows you what /does/ fail, under
> > the specific conditions. but there is no unpatched libgcc_s, so you'll have
> > to do that part yourself, and build a libgcc_s.so.1 without the patch in this
> > thread.
>
> thanks for the dockerfile, I could reproduce the problem. For some 
> strange reason the program really tried to register a frame table 
> without entries. Perhaps that is caused by musl, I could not find the 
> root cause for that. But nevertheless, fixing the assert is the right 
> thing to do, we must ignore empty tables.

yeah, this certainly seems musl related. if i had to guess, there is probably
still a subtle bug somewhere else here causing this. but sadly, that is all far
outside my knowledge.

thank you very much for having a look. it's greatly appreciated. :)

>
> Best
>
> Thomas
  
Richard Biener May 15, 2023, 7:59 a.m. UTC | #5
On Sun, May 14, 2023 at 9:00 PM Thomas Neumann via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Dear Sören,
>
> > we ran into a regression introduced by these changes. The regression
> > manifests itself in a failing assertion in __deregister_frame_info_bases.
> > The assertion failure was observed while using Chromium's `flatc` build
> > system tool. The failing assertion is:
> >
> >       unwind-dw2-fde.c:281    gcc_assert (in_shutdown || ob);
> > [snip]
> > However, I believe there is one more edge case that isn't being account
> > for presently: If the inserted entry has a size of 0 (i.e. if range[1] -
> > range[0] == 0) then the btree_insert call in __register_frame_info_bases
> > will not insert anything. This is not accounted for in
>  > [snip]
> >
> > Would be cool if this could be fixed on the GCC trunk.
>
> thanks for the details analysis and the patch, it looks obviously
> correct for me. I can apply it to trunk, but we need approval from a gcc
> maintainer first.

The patch is OK for trunk and affected branches.

Thanks,
Richard.

> But independent of your patch, do you have the test case available in
> some easily accessible form, for example a docker image or an automated
> build script? I ask because something odd is happening here, somebody
> registered a non-empty EH that does not contain a single unwind range. I
> am puzzled why anybody would do that, I would like to double check that
> this is indeed the intended behavior and not a bug somewhere else. Or if
> you have the test case at hand, it would be great if you could do a
> quick step through get_pc_range for the affected frame to double-check
> that the table is indeed empty and we don't miss an entry for some
> strange reason.
>
> Best
>
> Thomas
>
>
>
  
Kyrylo Tkachov May 15, 2023, 12:52 p.m. UTC | #6
> -----Original Message-----
> From: Gcc-patches <gcc-patches-
> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Richard Biener
> via Gcc-patches
> Sent: Monday, May 15, 2023 8:59 AM
> To: Thomas Neumann <thomas.neumann@in.tum.de>
> Cc: Sören Tempel <soeren@soeren-tempel.net>; gcc-patches@gcc.gnu.org;
> alice@ayaya.dev
> Subject: Re: [PATCH] Fix assertion for unwind-dw2-fde.c btree changes
> 
> On Sun, May 14, 2023 at 9:00 PM Thomas Neumann via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Dear Sören,
> >
> > > we ran into a regression introduced by these changes. The regression
> > > manifests itself in a failing assertion in __deregister_frame_info_bases.
> > > The assertion failure was observed while using Chromium's `flatc` build
> > > system tool. The failing assertion is:
> > >
> > >       unwind-dw2-fde.c:281    gcc_assert (in_shutdown || ob);
> > > [snip]
> > > However, I believe there is one more edge case that isn't being account
> > > for presently: If the inserted entry has a size of 0 (i.e. if range[1] -
> > > range[0] == 0) then the btree_insert call in __register_frame_info_bases
> > > will not insert anything. This is not accounted for in
> >  > [snip]
> > >
> > > Would be cool if this could be fixed on the GCC trunk.
> >
> > thanks for the details analysis and the patch, it looks obviously
> > correct for me. I can apply it to trunk, but we need approval from a gcc
> > maintainer first.
> 
> The patch is OK for trunk and affected branches.
> 

Hello, this patch breaks the build on targets where range is not declared i.e. where the #ifdef ATOMIC_FDE_FAST_PATH path is not taken.

Thanks,
Kyrill

> Thanks,
> Richard.
> 
> > But independent of your patch, do you have the test case available in
> > some easily accessible form, for example a docker image or an automated
> > build script? I ask because something odd is happening here, somebody
> > registered a non-empty EH that does not contain a single unwind range. I
> > am puzzled why anybody would do that, I would like to double check that
> > this is indeed the intended behavior and not a bug somewhere else. Or if
> > you have the test case at hand, it would be great if you could do a
> > quick step through get_pc_range for the affected frame to double-check
> > that the table is indeed empty and we don't miss an entry for some
> > strange reason.
> >
> > Best
> >
> > Thomas
> >
> >
> >
  
Thomas Neumann May 15, 2023, 1:05 p.m. UTC | #7
> Hello, this patch breaks the build on targets where range is not declared i.e. where the #ifdef ATOMIC_FDE_FAST_PATH path is not taken.

argh, I did not realize I tested the patch only on atomic fast path 
platforms. The patch below fixes that by moving the check inside the #ifdef.

I will check that everything works on atomic and non-atomic platforms 
and commit the trivial move then. Sorry for the breakage.

Best

Thomas



 From 550dc27f547a067e96137adeb85148d8a84c81a0 Mon Sep 17 00:00:00 2001
From: Thomas Neumann <tneumann@users.sourceforge.net>
Date: Mon, 15 May 2023 14:59:22 +0200
Subject: [PATCH] fix assert in non-atomic path

The non-atomic path does not have range information,
we have to adjust the assert handle that case, too.

libgcc/ChangeLog:
	* unwind-dw2-fde.c: Fix assert in non-atomic path.
---
  libgcc/unwind-dw2-fde.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
index 8683a65aa02..df461a1527d 100644
--- a/libgcc/unwind-dw2-fde.c
+++ b/libgcc/unwind-dw2-fde.c
@@ -240,6 +240,7 @@ __deregister_frame_info_bases (const void *begin)

    // And remove
    ob = btree_remove (&registered_frames, range[0]);
+  bool empty_table = (range[1] - range[0]) == 0;
  #else
    init_object_mutex_once ();
    __gthread_mutex_lock (&object_mutex);
@@ -276,11 +277,12 @@ __deregister_frame_info_bases (const void *begin)

   out:
    __gthread_mutex_unlock (&object_mutex);
+  bool empty_table = false;
  #endif

    // If we didn't find anything in the lookup data structures then they
    // were either already destroyed or we tried to remove an empty range.
-  gcc_assert (in_shutdown || ((range[1] - range[0]) == 0 || ob));
+  gcc_assert (in_shutdown || (empty_table || ob));
    return (void *) ob;
  }
  
Kyrylo Tkachov May 15, 2023, 1:28 p.m. UTC | #8
> -----Original Message-----
> From: Thomas Neumann <thomas.neumann@in.tum.de>
> Sent: Monday, May 15, 2023 2:06 PM
> To: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>; Richard Biener
> <richard.guenther@gmail.com>
> Cc: Sören Tempel <soeren@soeren-tempel.net>; gcc-patches@gcc.gnu.org;
> alice@ayaya.dev
> Subject: Re: [PATCH] Fix assertion for unwind-dw2-fde.c btree changes
> 
> > Hello, this patch breaks the build on targets where range is not declared i.e.
> where the #ifdef ATOMIC_FDE_FAST_PATH path is not taken.
> 
> argh, I did not realize I tested the patch only on atomic fast path
> platforms. The patch below fixes that by moving the check inside the #ifdef.
> 
> I will check that everything works on atomic and non-atomic platforms
> and commit the trivial move then. Sorry for the breakage.

Thanks for the quick fix. I can confirm the aarch64 build succeeds now.
Kyrill

> 
> Best
> 
> Thomas
> 
> 
> 
>  From 550dc27f547a067e96137adeb85148d8a84c81a0 Mon Sep 17 00:00:00
> 2001
> From: Thomas Neumann <tneumann@users.sourceforge.net>
> Date: Mon, 15 May 2023 14:59:22 +0200
> Subject: [PATCH] fix assert in non-atomic path
> 
> The non-atomic path does not have range information,
> we have to adjust the assert handle that case, too.
> 
> libgcc/ChangeLog:
> 	* unwind-dw2-fde.c: Fix assert in non-atomic path.
> ---
>   libgcc/unwind-dw2-fde.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
> index 8683a65aa02..df461a1527d 100644
> --- a/libgcc/unwind-dw2-fde.c
> +++ b/libgcc/unwind-dw2-fde.c
> @@ -240,6 +240,7 @@ __deregister_frame_info_bases (const void *begin)
> 
>     // And remove
>     ob = btree_remove (&registered_frames, range[0]);
> +  bool empty_table = (range[1] - range[0]) == 0;
>   #else
>     init_object_mutex_once ();
>     __gthread_mutex_lock (&object_mutex);
> @@ -276,11 +277,12 @@ __deregister_frame_info_bases (const void *begin)
> 
>    out:
>     __gthread_mutex_unlock (&object_mutex);
> +  bool empty_table = false;
>   #endif
> 
>     // If we didn't find anything in the lookup data structures then they
>     // were either already destroyed or we tried to remove an empty range.
> -  gcc_assert (in_shutdown || ((range[1] - range[0]) == 0 || ob));
> +  gcc_assert (in_shutdown || (empty_table || ob));
>     return (void *) ob;
>   }
> 
> --
> 2.39.2
>
  
Jeff Law May 15, 2023, 1:31 p.m. UTC | #9
On 5/15/23 07:05, Thomas Neumann via Gcc-patches wrote:
>> Hello, this patch breaks the build on targets where range is not 
>> declared i.e. where the #ifdef ATOMIC_FDE_FAST_PATH path is not taken.
> 
> argh, I did not realize I tested the patch only on atomic fast path 
> platforms. The patch below fixes that by moving the check inside the 
> #ifdef.
> 
> I will check that everything works on atomic and non-atomic platforms 
> and commit the trivial move then. Sorry for the breakage.
> 
> Best
> 
> Thomas
> 
> 
> 
>  From 550dc27f547a067e96137adeb85148d8a84c81a0 Mon Sep 17 00:00:00 2001
> From: Thomas Neumann <tneumann@users.sourceforge.net>
> Date: Mon, 15 May 2023 14:59:22 +0200
> Subject: [PATCH] fix assert in non-atomic path
> 
> The non-atomic path does not have range information,
> we have to adjust the assert handle that case, too.
> 
> libgcc/ChangeLog:
>      * unwind-dw2-fde.c: Fix assert in non-atomic path.
OK for the trunk.
jeff
  

Patch

diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
index 7b74c391ced..8683a65aa02 100644
--- a/libgcc/unwind-dw2-fde.c
+++ b/libgcc/unwind-dw2-fde.c
@@ -278,7 +278,9 @@  __deregister_frame_info_bases (const void *begin)
   __gthread_mutex_unlock (&object_mutex);
 #endif
 
-  gcc_assert (in_shutdown || ob);
+  // If we didn't find anything in the lookup data structures then they
+  // were either already destroyed or we tried to remove an empty range.
+  gcc_assert (in_shutdown || ((range[1] - range[0]) == 0 || ob));
   return (void *) ob;
 }