Add 'libgomp.c++/static-local-variable-1.C'

Message ID 87sf541jy9.fsf@euler.schwinge.homeip.net
State Accepted
Headers
Series Add 'libgomp.c++/static-local-variable-1.C' |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Thomas Schwinge Nov. 17, 2023, 3:24 p.m. UTC
  Hi!

I found that with GCC's '-fthreadsafe-statics' implementation (..., which
is enabled by default) instrumented as follows:

    --- libstdc++-v3/libsupc++/guard.cc
    +++ libstdc++-v3/libsupc++/guard.cc
    @@ -271,6 +273,7 @@ namespace __cxxabiv1
       extern "C"
       int __cxa_guard_acquire (__guard *g)
       {
    +    asm("int3");
     #ifdef __GTHREADS
         // If the target can reorder loads, we need to insert a read memory
         // barrier so that accesses to the guarded variable happen after the

..., there is only one single libgomp C++ test case where this triggers;
'libgomp.c++/taskloop-6.C':

    Thread 1 "a.out" received signal SIGTRAP, Trace/breakpoint trap.
    __cxxabiv1::__cxa_guard_acquire (g=0x60b228 <guard variable for f17<121>(J<int>)::i>) at [...]/source-gcc/libstdc++-v3/libsupc++/guard.cc:281
    281         if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
    (gdb) bt
    #0  __cxxabiv1::__cxa_guard_acquire (g=0x60b228 <guard variable for f17<121>(J<int>)::i>) at [...]/source-gcc/libstdc++-v3/libsupc++/guard.cc:281
    #1  0x0000000000404772 in f17<121> (j=...) at source-gcc/libgomp/testsuite/libgomp.c++/taskloop-6.C:300
    #2  0x0000000000401e11 in main () at source-gcc/libgomp/testsuite/libgomp.c++/taskloop-6.C:411

That test case however isn't per se testing behavior of a C++ static
local variable vs. OpenMP.

OK to push the attached "Add 'libgomp.c++/static-local-variable-1.C'"?
(Also, I'm happy to extend the test case to verify any additional
features you think are userful to be tested there.)

(With '-fno-threadsafe-statics', this fails, as expected.)


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  

Comments

Thomas Schwinge Nov. 17, 2023, 4:56 p.m. UTC | #1
Hi!

On 2023-11-17T16:24:46+0100, I wrote:
> [...] attached "Add 'libgomp.c++/static-local-variable-1.C'" [...]

Now, working on translating this into an OpenMP 'target' variant.  My
goal here is not necessarily to make this work now, but rather to figure
out whether '-fthreadsafe-statics' actually does or doesn't need to be
supported in offloading compilation, whether '__cxa_guard_acquire' is in
fact unreachable there.  (Currently the latter symbol isn't available in
offloading compilation; as you know I'm currently working on GPU
libstdc++ library support.)  However, GCC offloading compilation
currently fails differently, as follows:

    r.cc:16:12: error: variable ‘_ZGVZL1fvE1s’ has been referenced in offloaded code but hasn’t been marked to be included in the offloaded code
       16 |   static S s;
          |            ^
    lto1: fatal error: errors during merging of translation units
    compilation terminated.
    nvptx mkoffload: fatal error: build-gcc/gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
    [...]

... with:

    $ c++filt _ZGVZL1fvE1s
    guard variable for f()::s

Now I wonder how is that supposed to behave; is this valid OpenMP
'target' code at all?  Can you please help me find my way through the
OpenMP specification regarding this?

In OpenMP 5.2, 5.1.1 "Variables Referenced in a Construct", we have:

      - Variables with static storage duration that are declared in a scope inside the construct are shared.

Does this apply to a 'declare target'ed function 'f'?  (I was thinking:
"dynamic extend" of the scope of the 'target' construct?)  Ah, probably
that's 5.1.2 "Variables Referenced in a Region but not in a Construct":

      - Variables with static storage duration that are declared in called routines in the region are shared.

In 7.8 "Declare Target Directives", we have:

    If a variable with static storage duration is declared in a device routine then the named variable is
    treated as if it had appeared in an 'enter' clause on a declare target directive.

Similarly, in 13.8 "'target' Construct":

    If a variable with static storage duration is declared in a 'target' construct that does not specify a
    'device' clause in which the 'ancestor' _device-modifier_ appears then the named variable is
    treated as if it had appeared in a 'enter' clause on a declare target directive.

Per those occurrences, and per GCC not raising an error when encountering
a static local variable, I assume this is intended to work "as expected"?

On the other hand, NVHPC nvc++ 23.1-0 fails, too:

    NVC++-S-1062-Support procedure called within a compute region - __cxa_guard_acquire (r.cc: 16)
    [local to r_cc]::f():
         16, Accelerator restriction: unsupported call to support routine '__cxa_guard_acquire'
    NVC++/x86-64 Linux 23.1-0: compilation completed with severe errors

Hmm...


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  
Thomas Schwinge Dec. 7, 2023, 3:09 p.m. UTC | #2
Hi!

Jakub, would you please provide guidance?


Elsewhere, I wrote:

|| I'm working on implementing (some) C++ standard library support for code
|| offloading in GCC, and ran into the following issue: per
|| <https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables>,
|| "variables declared at block scope with the specifier 'static' [...] have
|| static [...] storage duration but are initialized the first time control
|| passes through their declaration".
||
|| To implement "initialized the first time [...]" in a multi-threaded
|| context, compilers typically use a guard variable and locking call to a
|| compiler-internal C++ support library function ('__cxa_guard_acquire').
|| (..., which in GCC, you may disable with '-fno-threadsafe-statics', for
|| that matter.)
||
|| In GCC, all this appears to work fine for multi-threaded host-side
|| (non-offladed) OpenMP 'parallel', for example.  However, I'm now curious
|| about the OpenMP 'target' offloading case; minimal example:
||
||     struct S
||     {
||       S() { }
||       ~S() { }
||     };
||
||     static void f()
||     {
||       // <https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables>
||       static S s;
||     }
||
||     int main()
||     {
||     #pragma omp target
||       {
||         f();
||       }
||     }
||
|| (Everything other than 'main' is meant to be implicitly OpenMP
|| 'declare target'ed here.)

On 2023-11-20T19:13:23+0100, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Nov 20, 2023 at 06:43:47PM +0100, Thomas Schwinge wrote:
>> Current GCC fails:
>>
>>     error: variable ‘_ZGVZL1fvE1s’ has been referenced in offloaded code but hasn’t been marked to be included in the offloaded code
>>
>> ... with:
>>
>>     $ c++filt _ZGVZL1fvE1s
>>     guard variable for f()::s
>>
>> That may "simply" be a bug to fix in GCC.

The conclusion was: yes.

>> (Something like implicitly
>> creating respective guard variables on the device, I suppose.)
>
> Yeah, I believe we should in the omp_discover_* sub-pass handle with
> a help of a langhook automatically mark the guard variables (possibly
> iff the guarded variable is marked?),

Looking at 'gcc/omp-offload.cc:omp_discover_implicit_declare_target' left
me confused how that would be the code that marks up 'static' variables
as implicit 'omp declare target'.  Working through a simple POD example
(say, 's%static S s%static int i') it turns out, indeed that's not where
that is happending, but instead 'gcc/gimplify.cc:gimplify_bind_expr' is
the place:

    [...]
      for (t = BIND_EXPR_VARS (bind_expr); t ; t = DECL_CHAIN (t))
    [...]
                  /* Static locals inside of target construct or offloaded
                     routines need to be "omp declare target".  */
                  if (TREE_STATIC (t))
                    for (; ctx; ctx = ctx->outer_context)
                      if ((ctx->region_type & ORT_TARGET) != 0)
                        {
                          if (!lookup_attribute ("omp declare target",
                                                 DECL_ATTRIBUTES (t)))
                            {
                              tree id = get_identifier ("omp declare target");
                              DECL_ATTRIBUTES (t)
                                = tree_cons (id, NULL_TREE, DECL_ATTRIBUTES (t));
                              varpool_node *node = varpool_node::get (t);
                              if (node)
                                {
                                  node->offloadable = 1;
                                  if (ENABLE_OFFLOADING && !DECL_EXTERNAL (t))
                                    {
                                      g->have_offload = true;
                                      if (!in_lto_p)
                                        vec_safe_push (offload_vars, t);
                                    }
                                }
                            }
                          break;
    [...]

You (Jakub) added that in
commit 211b7533bff68e5dd72e7d75249f470101759d6d (Subversion r272322)
"Make static vars inside of target regions or declare target routines implicitly declare target to (PR middle-end/90779)".

Now, the problem why that existing code doesn't trigger for C++ guard
variables is that those are not in 'BIND_EXPR_VARS', due to C++ front end
use of 'pushdecl_top_level_and_finish'.  If I change the C++ front end as
follows (WIP; not reviewed in detail):

    --- gcc/cp/decl2.cc
    +++ gcc/cp/decl2.cc
    @@ -3576,5 +3576,6 @@ get_guard (tree decl)
           DECL_IGNORED_P (guard) = 1;
           TREE_USED (guard) = 1;
    -      pushdecl_top_level_and_finish (guard, NULL_TREE);
    +      pushdecl (guard);
    +      cp_finish_decl (guard, NULL_TREE, false, NULL_TREE, 0);
         }
       return guard;

..., then we do get the expected behavior:

    --- a-r.cc.006t.gimple        2023-12-07 13:27:36.254963406 +0100
    +++ a-r.cc.006t.gimple        2023-12-07 14:10:39.352107107 +0100
    @@ -5,6 +5,7 @@
       bool retval.1;
       bool D.2966;
       static struct S s1;
    +  static long long int _ZGVZL2f1vE2s1;

       gimple_call <__atomic_load_1, _1, &_ZGVZL2f1vE2s1, 2>
       gimple_assign <eq_expr, retval.0, _1, 0, NULL>

..., and offloading compilation works down to the expected next issue:

    ld: error: undefined symbol: __cxa_guard_acquire
    >>> referenced by /tmp/ccAVyZpc.o:(f1())
    [...]
    collect2: error: ld returned 1 exit status
    gcn mkoffload: fatal error: build-gcc/gcc/x86_64-pc-linux-gnu-accel-amdgcn-amdhsa-gcc returned 1 exit status
    [...]

However: 'pushdecl_top_level_and_finish' has been used there "forever",
and I currently have no clue at all whether changing that into 'pushdecl'
would be acceptable, what effects that'd have elsewhere.

That said...  Couldn't we indeed move this gimplification-level code re
'Static locals [...] need to be "omp declare target"' into
'gcc/omp-offload.cc:omp_discover_implicit_declare_target'?

First thought: 'gcc/omp-offload.cc:omp_discover_declare_target_tgt_fn_r'
extended so that for each 'VAR_DECL' that is 'TREE_STATIC', we mark it
'omp declare target'.  (That'll need some additional conditions, but you
get the idea.)  This way, we're not restricted to only 'static's in the
current bind/block, but would also catch top-level things like C++ guard
variables (without requiring any C++ front end changes).

I suppose I'd first exclude all 'DECL_ARTIFICIAL' ones, and we then may
gradually enable those, as we add test cases and handling as necessary:

> or e.g. rtti info (_ZTS*, _ZTI*)
> and eventually figure out what we should do about virtual tables (_ZTV*).
> The last case is most complicated, as it contains function pointers, and we
> need to figure out if we mark all methods, or say replace some pointers in
> the virtual table with NULLs or something that errors or terminates if it
> isn't marked.

All those I plan to defer, for now.

> And sure, __cxa_guard_* would need to be implemented in the offloading
> libsupc++.a or libstdc++.a.

Until proper libstdc++/libsupc++ support emerges (I'm working on it...),
my idea was to add a temporary 'libgomp/config/accel/*.c' implementation
(based on 'libstdc++-v3/libsupc++/guard.cc').


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  
Jakub Jelinek Dec. 7, 2023, 3:33 p.m. UTC | #3
On Thu, Dec 07, 2023 at 04:09:04PM +0100, Thomas Schwinge wrote:
> > Yeah, I believe we should in the omp_discover_* sub-pass handle with
> > a help of a langhook automatically mark the guard variables (possibly
> > iff the guarded variable is marked?),
> 
> Looking at 'gcc/omp-offload.cc:omp_discover_implicit_declare_target' left
> me confused how that would be the code that marks up 'static' variables
> as implicit 'omp declare target'.  Working through a simple POD example
> (say, 's%static S s%static int i') it turns out, indeed that's not where
> that is happending, but instead 'gcc/gimplify.cc:gimplify_bind_expr' is
> the place:

Sure, that is for the case where those local statics should be marked
implicitly because they appear in a target function.
They can be also marked explicitly by the user through
#pragma omp declare target enter (name_of_static_var)
or
[[omp::decl (declare target)]] attribute on it etc.

> Now, the problem why that existing code doesn't trigger for C++ guard
> variables is that those are not in 'BIND_EXPR_VARS', due to C++ front end
> use of 'pushdecl_top_level_and_finish'.  If I change the C++ front end as
> follows (WIP; not reviewed in detail):
> 
>     --- gcc/cp/decl2.cc
>     +++ gcc/cp/decl2.cc
>     @@ -3576,5 +3576,6 @@ get_guard (tree decl)
>            DECL_IGNORED_P (guard) = 1;
>            TREE_USED (guard) = 1;
>     -      pushdecl_top_level_and_finish (guard, NULL_TREE);
>     +      pushdecl (guard);
>     +      cp_finish_decl (guard, NULL_TREE, false, NULL_TREE, 0);
>          }
>        return guard;

I don't think that is desirable.

> ..., then we do get the expected behavior:
> 
>     --- a-r.cc.006t.gimple        2023-12-07 13:27:36.254963406 +0100
>     +++ a-r.cc.006t.gimple        2023-12-07 14:10:39.352107107 +0100
>     @@ -5,6 +5,7 @@
>        bool retval.1;
>        bool D.2966;
>        static struct S s1;
>     +  static long long int _ZGVZL2f1vE2s1;
> 
>        gimple_call <__atomic_load_1, _1, &_ZGVZL2f1vE2s1, 2>
>        gimple_assign <eq_expr, retval.0, _1, 0, NULL>
> 
> ..., and offloading compilation works down to the expected next issue:
> 
>     ld: error: undefined symbol: __cxa_guard_acquire
>     >>> referenced by /tmp/ccAVyZpc.o:(f1())
>     [...]
>     collect2: error: ld returned 1 exit status
>     gcn mkoffload: fatal error: build-gcc/gcc/x86_64-pc-linux-gnu-accel-amdgcn-amdhsa-gcc returned 1 exit status
>     [...]
> 
> However: 'pushdecl_top_level_and_finish' has been used there "forever",
> and I currently have no clue at all whether changing that into 'pushdecl'
> would be acceptable, what effects that'd have elsewhere.

Exactly.

> That said...  Couldn't we indeed move this gimplification-level code re
> 'Static locals [...] need to be "omp declare target"' into
> 'gcc/omp-offload.cc:omp_discover_implicit_declare_target'?

The omp-offload.cc discovery stuff was added for stuff where the OpenMP
standard says something is implicitly declare target because there is
some use of it satisfying some rule.
Like, calls to functions defined in current compilation unit referenced in
target region or something similar, or such calls referenced in declare
target static var initializers.
So, that feels to me like the right spot to handle the guards as well.
Of course, the middle-end doesn't know about C++ FE's get_guard variable,
so it should be some new language hook which would take care of it.
The omp_discover_declare* functions can add further VAR_DECLs to the
worklist, so I'd probably call the new language hook in the
omp_discover_implicit_declare_target last loop.
Or maybe even better just handle that in the
cxx_omp_finish_decl_inits hook.  You can just
  FOR_EACH_VARIABLE (vnode)
    if (DECL_FUNCTION_SCOPE_P (vnode->decl)
	&& omp_declare_target_var_p (vnode->decl))
      {
	tree sname = mangle_guard_variable (decl);
	tree guard = get_global_binding (sname);
	if (guard)
	  ... mark guard as declare target if not yet marked ...
      }
because guard var initializers don't really mention anything and so
their addition doesn't need to trigger further worklist changes.

> > or e.g. rtti info (_ZTS*, _ZTI*)
> > and eventually figure out what we should do about virtual tables (_ZTV*).
> > The last case is most complicated, as it contains function pointers, and we
> > need to figure out if we mark all methods, or say replace some pointers in
> > the virtual table with NULLs or something that errors or terminates if it
> > isn't marked.
> 
> All those I plan to defer, for now.

Ok.

> > And sure, __cxa_guard_* would need to be implemented in the offloading
> > libsupc++.a or libstdc++.a.
> 
> Until proper libstdc++/libsupc++ support emerges (I'm working on it...),
> my idea was to add a temporary 'libgomp/config/accel/*.c' implementation
> (based on 'libstdc++-v3/libsupc++/guard.cc').

That looks reasonable.

	Jakub
  
Thomas Schwinge Dec. 21, 2023, 12:31 p.m. UTC | #4
Hi Jakub!

On 2023-12-07T16:33:08+0100, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Dec 07, 2023 at 04:09:04PM +0100, Thomas Schwinge wrote:
>> > Yeah, I believe we should in the omp_discover_* sub-pass handle with
>> > a help of a langhook automatically mark the guard variables (possibly
>> > iff the guarded variable is marked?),
>>
>> Looking at 'gcc/omp-offload.cc:omp_discover_implicit_declare_target' left
>> me confused how that would be the code that marks up 'static' variables
>> as implicit 'omp declare target'.  Working through a simple POD example
>> (say, 's%static S s%static int i') it turns out, indeed that's not where
>> that is happending, but instead 'gcc/gimplify.cc:gimplify_bind_expr' is
>> the place:
>
> Sure, that is for the case where those local statics should be marked
> implicitly because they appear in a target function.
> They can be also marked explicitly by the user through
> #pragma omp declare target enter (name_of_static_var)
> or
> [[omp::decl (declare target)]] attribute on it etc.

These three: implicitly, or explicit '#pragma omp declare target' etc.,
or inside '#pragma omp begin declare target' region are the only OpenMP
facilities to get things 'omp declare target'ed, right?

>> That said...  Couldn't we indeed move this gimplification-level code re
>> 'Static locals [...] need to be "omp declare target"' into
>> 'gcc/omp-offload.cc:omp_discover_implicit_declare_target'?
>
> The omp-offload.cc discovery stuff was added for stuff where the OpenMP
> standard says something is implicitly declare target because there is
> some use of it satisfying some rule.
> Like, calls to functions defined in current compilation unit referenced in
> target region or something similar, or such calls referenced in declare
> target static var initializers.
> So, that feels to me like the right spot to handle the guards as well.
> Of course, the middle-end doesn't know about C++ FE's get_guard variable,
> so it should be some new language hook which would take care of it.
> The omp_discover_declare* functions can add further VAR_DECLs to the
> worklist, so I'd probably call the new language hook in the
> omp_discover_implicit_declare_target last loop.
> Or maybe even better just handle that in the
> cxx_omp_finish_decl_inits hook.  You can just
>   FOR_EACH_VARIABLE (vnode)
>     if (DECL_FUNCTION_SCOPE_P (vnode->decl)
>       && omp_declare_target_var_p (vnode->decl))
>       {
>       tree sname = mangle_guard_variable (decl);
>       tree guard = get_global_binding (sname);
>       if (guard)
>         ... mark guard as declare target if not yet marked ...
>       }
> because guard var initializers don't really mention anything and so
> their addition doesn't need to trigger further worklist changes.

That doesn't generally work, as the gimplification-level code re
'Static locals [...] need to be "omp declare target"' runs *after*
'omp_discover_implicit_declare_target'.  Thus my "move" idea above.
However, let's defer the latter one; I've now got a simple setup where
the new language hook is invoked in all necessary places.  (Will post
later.)

>> > And sure, __cxa_guard_* would need to be implemented in the offloading
>> > libsupc++.a or libstdc++.a.
>>
>> Until proper libstdc++/libsupc++ support emerges (I'm working on it...),
>> my idea was to add a temporary 'libgomp/config/accel/*.c' implementation
>> (based on 'libstdc++-v3/libsupc++/guard.cc').
>
> That looks reasonable.

OK to push, for a start, the attached
"GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for C++ static local variables support"?
That's now in libgcc not libgomp, so that it's also usable for GCN, nvptx
target testing, where we thus see a number of FAIL -> PASS progressions.


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  
Jakub Jelinek Dec. 21, 2023, 12:58 p.m. UTC | #5
On Thu, Dec 21, 2023 at 01:31:19PM +0100, Thomas Schwinge wrote:
> These three: implicitly, or explicit '#pragma omp declare target' etc.,
> or inside '#pragma omp begin declare target' region are the only OpenMP
> facilities to get things 'omp declare target'ed, right?

I think so.
> That doesn't generally work, as the gimplification-level code re
> 'Static locals [...] need to be "omp declare target"' runs *after*
> 'omp_discover_implicit_declare_target'.  Thus my "move" idea above.

Can't we mark the static locals already during that discovery?
The addition during gimplification was probably made when we didn't have
that at all.

> OK to push, for a start, the attached
> "GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for C++ static local variables support"?
> That's now in libgcc not libgomp, so that it's also usable for GCN, nvptx
> target testing, where we thus see a number of FAIL -> PASS progressions.

> For now, for single-threaded GCN, nvptx target use only; extension for
> multi-threaded offloading use to follow later.
> 
> 	libgcc/
> 	* c++-minimal/README: New.
> 	* c++-minimal/guard.c: New.
> 	* config/gcn/t-amdgcn (LIB2ADD): Add it.
> 	* config/nvptx/t-nvptx (LIB2ADD): Likewise.

> +/* Copy'n'paste/edit from 'libstdc++-v3/libsupc++/cxxabi.h'.  */
> +
> +  int
> +  __cxa_guard_acquire(__guard*);
> +
> +  void
> +  __cxa_guard_release(__guard*);
> +
> +  void
> +  __cxa_guard_abort(__guard*);

When all this isn't inside a namespace, shouldn't it be indented by
2 spaces less?

> +
> +/* Copy'n'paste/edit from 'libstdc++-v3/libsupc++/guard.cc'.  */
> +
> +# undef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
> +# undef _GLIBCXX_GUARD_SET_AND_RELEASE
> +# define _GLIBCXX_GUARD_SET_AND_RELEASE(G) _GLIBCXX_GUARD_SET (G)

And without a space after # here?

Otherwise LGTM, but hope that one day we'll get rid of it again.

	Jakub
  
Thomas Schwinge Dec. 23, 2023, 9:17 a.m. UTC | #6
Hi!

On 2023-12-21T13:58:23+0100, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Dec 21, 2023 at 01:31:19PM +0100, Thomas Schwinge wrote:
>> [...] the gimplification-level code re
>> 'Static locals [...] need to be "omp declare target"' runs *after*
>> 'omp_discover_implicit_declare_target'.  Thus my "move" idea above.
>
> Can't we mark the static locals already during that discovery?

Well, that's precisely what I had tried to communicate, earlier on.  ;-)

I'll work on that, as a refactoring, after I've gotten the current
implementation idea working.

> The addition during gimplification was probably made when we didn't have
> that at all.


>> OK to push, for a start, the attached
>> "GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for C++ static local variables support"?
>> That's now in libgcc not libgomp, so that it's also usable for GCN, nvptx
>> target testing, where we thus see a number of FAIL -> PASS progressions.
>
>> For now, for single-threaded GCN, nvptx target use only; extension for
>> multi-threaded offloading use to follow later.
>>
>>      libgcc/
>>      * c++-minimal/README: New.
>>      * c++-minimal/guard.c: New.
>>      * config/gcn/t-amdgcn (LIB2ADD): Add it.
>>      * config/nvptx/t-nvptx (LIB2ADD): Likewise.
>
>> +/* Copy'n'paste/edit from 'libstdc++-v3/libsupc++/cxxabi.h'.  */
>> +
>> +  int
>> +  __cxa_guard_acquire(__guard*);
>> +
>> +  void
>> +  __cxa_guard_release(__guard*);
>> +
>> +  void
>> +  __cxa_guard_abort(__guard*);
>
> When all this isn't inside a namespace, shouldn't it be indented by
> 2 spaces less?
>
>> +
>> +/* Copy'n'paste/edit from 'libstdc++-v3/libsupc++/guard.cc'.  */
>> +
>> +# undef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
>> +# undef _GLIBCXX_GUARD_SET_AND_RELEASE
>> +# define _GLIBCXX_GUARD_SET_AND_RELEASE(G) _GLIBCXX_GUARD_SET (G)
>
> And without a space after # here?

Well, those were just un-edited copy'n'pastes from the original files;
now indentation/space-corrected for viewing pleasure.

> Otherwise LGTM, but hope that one day we'll get rid of it again.

Yep.

Pushed to master branch commit c0bf7ea189ecf252152fe15134f70f576bcd20b2
"GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for C++ static local variables support",
see attached.


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  

Patch

From d3140a1b4a649c5acb3735ef7fd04a4ebffe5e9a Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 17 Nov 2023 16:06:25 +0100
Subject: [PATCH] Add 'libgomp.c++/static-local-variable-1.C'

A debug run may look as follows:

    int main()
    void f()
    S::S()
    void f()
    S::S()/
    void f()
    void f()/
    void f()
    void f()/
    void f()
    void f()/
    void f()
    void f()/
    void f()
    void f()/
    void f()
    void f()/
    void f()
    void f()/
    void f()
    void f()/
    void f()
    void f()/
    void f()
    void f()/
    void f()/
    void f()/
      cSC = 1
      cf = 12
    int main()/
    S::~S()
    S::~S()/

	libgomp/
	* testsuite/libgomp.c++/static-local-variable-1.C: New.
---
 .../libgomp.c++/static-local-variable-1.C     | 95 +++++++++++++++++++
 1 file changed, 95 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.c++/static-local-variable-1.C

diff --git a/libgomp/testsuite/libgomp.c++/static-local-variable-1.C b/libgomp/testsuite/libgomp.c++/static-local-variable-1.C
new file mode 100644
index 00000000000..3169ba77d8d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/static-local-variable-1.C
@@ -0,0 +1,95 @@ 
+// Test basic behavior of a C++ static local variable vs. OpenMP.
+
+#include <omp.h>
+#include <unistd.h>
+
+#define DEBUG_PRINTF //__builtin_printf
+
+static int state;
+
+static int cSC, cSD, cf;
+
+struct S
+{
+  S()
+  {
+    DEBUG_PRINTF("%s\n", __PRETTY_FUNCTION__);
+
+    int c;
+#pragma omp atomic capture
+    c = ++cSC;
+    if (c != 1)
+      __builtin_abort();
+
+    if (state++ != 1)
+      __builtin_abort();
+
+    DEBUG_PRINTF("%s/\n", __PRETTY_FUNCTION__);
+  }
+
+  ~S()
+  {
+    DEBUG_PRINTF("%s\n", __PRETTY_FUNCTION__);
+
+    int c;
+#pragma omp atomic capture
+    c = ++cSD;
+    if (c != 1)
+      __builtin_abort();
+
+    if (state++ != 3)
+      __builtin_abort();
+
+    DEBUG_PRINTF("%s/\n", __PRETTY_FUNCTION__);
+    // Exit '0', now that we've verified all is OK.
+    _exit(0);
+  }
+};
+
+static void f()
+{
+  DEBUG_PRINTF("%s\n", __PRETTY_FUNCTION__);
+
+#pragma omp atomic
+  ++cf;
+
+  // <https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables>
+  static S s;
+
+  DEBUG_PRINTF("%s/\n", __PRETTY_FUNCTION__);
+}
+
+int main()
+{
+  DEBUG_PRINTF("%s\n", __PRETTY_FUNCTION__);
+
+  if (state++ != 0)
+    __builtin_abort();
+
+  int nthreads;
+
+#pragma omp parallel
+  {
+#pragma omp master
+    {
+      nthreads = omp_get_num_threads ();
+    }
+
+    f();
+  }
+
+  DEBUG_PRINTF("  cSC = %d\n", cSC);
+  DEBUG_PRINTF("  cf = %d\n", cf);
+  if (cSC != 1)
+    __builtin_abort();
+  if (cf != nthreads)
+    __builtin_abort();
+
+  if (state++ != 2)
+    __builtin_abort();
+
+  DEBUG_PRINTF("%s/\n", __PRETTY_FUNCTION__);
+
+  // See '_exit(0);' elsewhere.
+  return 1;
+}
-- 
2.34.1