c++: Fix up mangling of function/block scope static structured bindings [PR111069]

Message ID ZORuAJTB6/kkyFGG@tucnak
State Unresolved
Headers
Series c++: Fix up mangling of function/block scope static structured bindings [PR111069] |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Jakub Jelinek Aug. 22, 2023, 8:12 a.m. UTC
  Hi!

As can be seen on the testcase, we weren't correctly mangling
static/thread_local structured bindings (C++20 feature) at function/block
scope.  The following patch fixes that by using what write_local_name
does for those cases (note, structured binding mandling doesn't use the
standard path because it needs to pass a list of all the identifiers in
the structured binding to the mangling).  In addition to that it fixes
mangling of various helpers which use write_guarded_name (_ZGV*, _ZTH*,
_ZTW*) and kills find_decomp_unqualified_name which for the local names
would be too hard to implement and uses write_guarded_name for structured
binding related _ZGR* names as well.
All the mangled names on the testcase match now clang++ and my expectations.
Because the old mangled names were plain wrong (they mangled the same as
structured binding at global scope and resulted in assembly errors if there
was more than one static structured binding with the same identifiers in
the same (or another) function, I think we don't need to play with another
mangling ABI level which turns on/off the old broken way, unsure whether
we should backport the patch to 13 branch though.

BTW, I think we should also handle ABI tags in mangle_decomp which we
currently don't do, but guess that should be subject to another mangling ABI
version.
On
struct __attribute__((abi_tag ("foobar"))) S { int i; };
extern S a[2];
struct __attribute__((abi_tag ("qux"))) T { int i; S j; int k; };
extern T b[2];
namespace N { auto [i, j] = a; auto [k, l] = b; S c; T d; }
inline void foo () { static auto [m, n] = a; static auto [o, p] = b;
{ static auto [m, n] = a; ++n.i; } ++m.i; ++o.i; }
void (*p) () = &foo;
clang++ uses
_ZN1NDC1i1jEB6foobarE
_ZN1NDC1k1lEB3quxE
_ZZ3foovEDC1m1nEB6foobar
_ZGVZ3foovEDC1m1nEB6foobar
_ZZ3foovEDC1o1pEB3qux
_ZGVZ3foovEDC1o1pEB3qux
_ZZ3foovEDC1m1nEB6foobar_0
_ZGVZ3foovEDC1m1nEB6foobar_0
mangling while g++ (with the patch):
_ZN1NDC1i1jEE
_ZN1NDC1k1lEE
_ZZ3foovEDC1m1nE
_ZGVZ3foovEDC1m1nE
_ZZ3foovEDC1o1pE
_ZGVZ3foovEDC1o1pE
_ZZ3foovEDC1m1nE_0
_ZGVZ3foovEDC1m1nE_0

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-08-22  Jakub Jelinek  <jakub@redhat.com>

	PR c++/111069
	* cp-tree.h (determine_local_discriminator): Add NAME argument with
	NULL_TREE default.
	* decl.cc (determine_local_discriminator): Add NAME argument, use it
	if non-NULL, otherwise compute it the old way.
	(cp_maybe_mangle_decomp): Call determine_local_discriminator.
	* mangle.cc (find_decomp_unqualified_name): Remove.
	(write_unqualified_name): Don't call find_decomp_unqualified_name.
	(mangle_decomp): Handle mangling of static function/block scope
	structured bindings.  Don't call decl_mangling_context twice.
	(write_guarded_var_name): Handle structured bindings.
	(mangle_ref_init_variable): Use write_guarded_var_name for structured
	bindings.

	* g++.dg/cpp2a/decomp8.C: New test.


	Jakub
  

Comments

Jason Merrill Aug. 23, 2023, 8:23 p.m. UTC | #1
On 8/22/23 04:12, Jakub Jelinek wrote:
> As can be seen on the testcase, we weren't correctly mangling
> static/thread_local structured bindings (C++20 feature) at function/block
> scope.  The following patch fixes that by using what write_local_name
> does for those cases (note, structured binding mandling doesn't use the
> standard path because it needs to pass a list of all the identifiers in
> the structured binding to the mangling).  In addition to that it fixes
> mangling of various helpers which use write_guarded_name (_ZGV*, _ZTH*,
> _ZTW*) and kills find_decomp_unqualified_name which for the local names
> would be too hard to implement and uses write_guarded_name for structured
> binding related _ZGR* names as well.
> All the mangled names on the testcase match now clang++ and my expectations.
> Because the old mangled names were plain wrong (they mangled the same as
> structured binding at global scope and resulted in assembly errors if there
> was more than one static structured binding with the same identifiers in
> the same (or another) function, I think we don't need to play with another
> mangling ABI level which turns on/off the old broken way, unsure whether
> we should backport the patch to 13 branch though.

Probably not.

> BTW, I think we should also handle ABI tags in mangle_decomp which we
> currently don't do, but guess that should be subject to another mangling ABI
> version.

I'd be surprised if this would affect any real code, but I suppose so. 
In any case I'd like to fix this at the same time as the local statics, 
to avoid changing their mangled name twice.

> @@ -9049,6 +9050,25 @@ cp_maybe_mangle_decomp (tree decl, tree
>         tree d = first;
>         for (unsigned int i = 0; i < count; i++, d = DECL_CHAIN (d))
>   	v[count - i - 1] = d;
> +      if (DECL_FUNCTION_SCOPE_P (decl))
> +	{
> +	  size_t sz = 3;
> +	  for (unsigned int i = 0; i < count; ++i)
> +	    sz += IDENTIFIER_LENGTH (DECL_NAME (v[i])) + 1;
> +	  char *name = XALLOCAVEC (char, sz);
> +	  name[0] = 'D';
> +	  name[1] = 'C';
> +	  char *p = name + 2;
> +	  for (unsigned int i = 0; i < count; ++i)
> +	    {
> +	      size_t len = IDENTIFIER_LENGTH (DECL_NAME (v[i]));
> +	      *p++ = ' ';
> +	      memcpy (p, IDENTIFIER_POINTER (DECL_NAME (v[i])), len);
> +	      p += len;
> +	    }
> +	  *p = '\0';
> +	  determine_local_discriminator (decl, get_identifier (name));
> +	}

Maybe do this in mangle_decomp, based on the actual mangling in process 
instead of this pseudo-mangling?

> @@ -4564,6 +4519,13 @@ write_guarded_var_name (const tree varia
>       /* The name of a guard variable for a reference temporary should refer
>          to the reference, not the temporary.  */
>       write_string (IDENTIFIER_POINTER (DECL_NAME (variable)) + 4);
> +  else if (DECL_DECOMPOSITION_P (variable)
> +	   && DECL_NAME (variable) == NULL_TREE
> +	   && startswith (IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (variable)),
> +			  "_Z"))

Maybe add a startswith overload that takes an identifier?

> @@ -4630,7 +4592,10 @@ mangle_ref_init_variable (const tree var
>     start_mangling (variable);
>     write_string ("_ZGR");
>     check_abi_tags (variable);
> -  write_name (variable, /*ignore_local_scope=*/0);
> +  if (DECL_DECOMPOSITION_P (variable))
> +    write_guarded_var_name (variable);
> +  else
> +    write_name (variable, /*ignore_local_scope=*/0);

Why not use write_guarded_name unconditionally?

Jason
  
Jakub Jelinek Aug. 24, 2023, 4:39 p.m. UTC | #2
On Wed, Aug 23, 2023 at 04:23:00PM -0400, Jason Merrill wrote:
> I'd be surprised if this would affect any real code, but I suppose so. In
> any case I'd like to fix this at the same time as the local statics, to
> avoid changing their mangled name twice.

Ok.
Running now into a problem with abi tags, because cp_maybe_mangle_decomp
is called before the type of the structured binding is finalized (sequence
is cp_maybe_mangle_decomp; cp_finish_decl; cp_finish_decomp), I vaguely
remember the reason was to have the name already mangled by cp_finish_decl
time, so that it can add it into varpool etc..
Will see if I can e.g. pass the initializer expression to
cp_maybe_mangle_decomp and figure out the tags from that.

> > @@ -9049,6 +9050,25 @@ cp_maybe_mangle_decomp (tree decl, tree
> >         tree d = first;
> >         for (unsigned int i = 0; i < count; i++, d = DECL_CHAIN (d))
> >   	v[count - i - 1] = d;
> > +      if (DECL_FUNCTION_SCOPE_P (decl))
> > +	{
> > +	  size_t sz = 3;
> > +	  for (unsigned int i = 0; i < count; ++i)
> > +	    sz += IDENTIFIER_LENGTH (DECL_NAME (v[i])) + 1;
> > +	  char *name = XALLOCAVEC (char, sz);
> > +	  name[0] = 'D';
> > +	  name[1] = 'C';
> > +	  char *p = name + 2;
> > +	  for (unsigned int i = 0; i < count; ++i)
> > +	    {
> > +	      size_t len = IDENTIFIER_LENGTH (DECL_NAME (v[i]));
> > +	      *p++ = ' ';
> > +	      memcpy (p, IDENTIFIER_POINTER (DECL_NAME (v[i])), len);
> > +	      p += len;
> > +	    }
> > +	  *p = '\0';
> > +	  determine_local_discriminator (decl, get_identifier (name));
> > +	}
> 
> Maybe do this in mangle_decomp, based on the actual mangling in process
> instead of this pseudo-mangling?

Not sure that is possible, for 2 reasons:
1) determine_local_discriminator otherwise works on DECL_NAME, not mangled
   names, so if one uses (albeit implementation reserved)
   _ZZN1N3fooI1TB3bazEEivEDC1h1iEB6foobar and similar identifiers, they
   could clash with the counting of the structured bindings
2) seems the local discriminator counting shouldn't take into account
   details like abi tags, e.g. if I have:
struct [[gnu::abi_tag ("foobar")]] S { int a; };
namespace N {
  template <typename T>
  inline int
  foo ()
  {
    static int h = 42; int r = ++h;
    { static S h = { 42 }; r += ++h.a; }
    { static int h = 42; r += ++h; }
    { static S h = { 42 }; r += ++h.a; }
    return r;
  }
}
int (*p) () = N::foo<int>;
  then both GCC and Clang mangle these as
  _ZZN1N3fooIiEEivE1h
  _ZZN1N3fooIiEEivE1hB6foobar_0
  _ZZN1N3fooIiEEivE1h_1
  _ZZN1N3fooIiEEivE1hB6foobar_2
  so whether abi tags appear in the mangled name or not shouldn't result
  in separate buckets for counting.

> 
> > @@ -4564,6 +4519,13 @@ write_guarded_var_name (const tree varia
> >       /* The name of a guard variable for a reference temporary should refer
> >          to the reference, not the temporary.  */
> >       write_string (IDENTIFIER_POINTER (DECL_NAME (variable)) + 4);
> > +  else if (DECL_DECOMPOSITION_P (variable)
> > +	   && DECL_NAME (variable) == NULL_TREE
> > +	   && startswith (IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (variable)),
> > +			  "_Z"))
> 
> Maybe add a startswith overload that takes an identifier?

Ok.

> > @@ -4630,7 +4592,10 @@ mangle_ref_init_variable (const tree var
> >     start_mangling (variable);
> >     write_string ("_ZGR");
> >     check_abi_tags (variable);
> > -  write_name (variable, /*ignore_local_scope=*/0);
> > +  if (DECL_DECOMPOSITION_P (variable))
> > +    write_guarded_var_name (variable);
> > +  else
> > +    write_name (variable, /*ignore_local_scope=*/0);
> 
> Why not use write_guarded_name unconditionally?

Ok.

	Jakub
  

Patch

--- gcc/cp/cp-tree.h.jj	2023-08-18 20:11:11.013692860 +0200
+++ gcc/cp/cp-tree.h	2023-08-19 10:45:00.320543681 +0200
@@ -6858,7 +6858,7 @@  extern void pop_switch				(void);
 extern void note_break_stmt			(void);
 extern bool note_iteration_stmt_body_start	(void);
 extern void note_iteration_stmt_body_end	(bool);
-extern void determine_local_discriminator	(tree);
+extern void determine_local_discriminator	(tree, tree = NULL_TREE);
 extern bool fns_correspond			(tree, tree);
 extern int decls_match				(tree, tree, bool = true);
 extern bool maybe_version_functions		(tree, tree, bool);
--- gcc/cp/decl.cc.jj	2023-08-18 09:49:38.899871662 +0200
+++ gcc/cp/decl.cc	2023-08-19 10:52:11.748888997 +0200
@@ -913,15 +913,16 @@  static GTY((deletable)) vec<tree, va_gc>
    generally very few of these in any particular function.  */
 
 void
-determine_local_discriminator (tree decl)
+determine_local_discriminator (tree decl, tree name)
 {
   auto_cond_timevar tv (TV_NAME_LOOKUP);
   retrofit_lang_decl (decl);
   tree ctx = DECL_CONTEXT (decl);
-  tree name = (TREE_CODE (decl) == TYPE_DECL
-	       && TYPE_UNNAMED_P (TREE_TYPE (decl))
-	       ? NULL_TREE : DECL_NAME (decl));
   size_t nelts = vec_safe_length (local_entities);
+  if (name == NULL_TREE)
+    name = (TREE_CODE (decl) == TYPE_DECL
+	    && TYPE_UNNAMED_P (TREE_TYPE (decl))
+	    ? NULL_TREE : DECL_NAME (decl));
   for (size_t i = 0; i < nelts; i += 2)
     {
       tree *pair = &(*local_entities)[i];
@@ -9049,6 +9050,25 @@  cp_maybe_mangle_decomp (tree decl, tree
       tree d = first;
       for (unsigned int i = 0; i < count; i++, d = DECL_CHAIN (d))
 	v[count - i - 1] = d;
+      if (DECL_FUNCTION_SCOPE_P (decl))
+	{
+	  size_t sz = 3;
+	  for (unsigned int i = 0; i < count; ++i)
+	    sz += IDENTIFIER_LENGTH (DECL_NAME (v[i])) + 1;
+	  char *name = XALLOCAVEC (char, sz);
+	  name[0] = 'D';
+	  name[1] = 'C';
+	  char *p = name + 2;
+	  for (unsigned int i = 0; i < count; ++i)
+	    {
+	      size_t len = IDENTIFIER_LENGTH (DECL_NAME (v[i]));
+	      *p++ = ' ';
+	      memcpy (p, IDENTIFIER_POINTER (DECL_NAME (v[i])), len);
+	      p += len;
+	    }
+	  *p = '\0';
+	  determine_local_discriminator (decl, get_identifier (name));
+	}
       SET_DECL_ASSEMBLER_NAME (decl, mangle_decomp (decl, v));
       maybe_apply_pragma_weak (decl);
     }
--- gcc/cp/mangle.cc.jj	2023-08-08 15:55:06.218168490 +0200
+++ gcc/cp/mangle.cc	2023-08-21 10:39:35.209155346 +0200
@@ -1344,51 +1344,6 @@  write_template_prefix (const tree node)
   add_substitution (substitution);
 }
 
-/* As the list of identifiers for the structured binding declaration
-   DECL is likely gone, try to recover the DC <source-name>+ E portion
-   from its mangled name.  Return pointer to the DC and set len to
-   the length up to and including the terminating E.  On failure
-   return NULL.  */
-
-static const char *
-find_decomp_unqualified_name (tree decl, size_t *len)
-{
-  const char *p = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
-  const char *end = p + IDENTIFIER_LENGTH (DECL_ASSEMBLER_NAME (decl));
-  bool nested = false;
-  if (!startswith (p, "_Z"))
-    return NULL;
-  p += 2;
-  if (startswith (p, "St"))
-    p += 2;
-  else if (*p == 'N')
-    {
-      nested = true;
-      ++p;
-      while (ISDIGIT (p[0]))
-	{
-	  char *e;
-	  long num = strtol (p, &e, 10);
-	  if (num >= 1 && num < end - e)
-	    p = e + num;
-	  else
-	    break;
-	}
-    }
-  if (!startswith (p, "DC"))
-    return NULL;
-  if (nested)
-    {
-      if (end[-1] != 'E')
-	return NULL;
-      --end;
-    }
-  if (end[-1] != 'E')
-    return NULL;
-  *len = end - p;
-  return p;
-}
-
 /* "For the purposes of mangling, the name of an anonymous union is considered
    to be the name of the first named data member found by a pre-order,
    depth-first, declaration-order walk of the data members of the anonymous
@@ -1461,17 +1416,7 @@  write_unqualified_name (tree decl)
     {
       found = true;
       gcc_assert (DECL_ASSEMBLER_NAME_SET_P (decl));
-      const char *decomp_str = NULL;
-      size_t decomp_len = 0;
-      if (VAR_P (decl)
-	  && DECL_DECOMPOSITION_P (decl)
-	  && DECL_NAME (decl) == NULL_TREE
-	  && DECL_NAMESPACE_SCOPE_P (decl))
-	decomp_str = find_decomp_unqualified_name (decl, &decomp_len);
-      if (decomp_str)
-	write_chars (decomp_str, decomp_len);
-      else
-	write_source_name (DECL_ASSEMBLER_NAME (decl));
+      write_source_name (DECL_ASSEMBLER_NAME (decl));
     }
   else if (DECL_DECLARES_FUNCTION_P (decl))
     {
@@ -4370,13 +4315,21 @@  mangle_decomp (const tree decl, vec<tree
   gcc_assert (context != NULL_TREE);
 
   bool nested = false;
+  bool local = false;
   if (DECL_NAMESPACE_STD_P (context))
     write_string ("St");
+  else if (TREE_CODE (context) == FUNCTION_DECL)
+    {
+      local = true;
+      write_char ('Z');
+      write_encoding (context);
+      write_char ('E');
+    }
   else if (context != global_namespace)
     {
       nested = true;
       write_char ('N');
-      write_prefix (decl_mangling_context (decl));
+      write_prefix (context);
     }
 
   write_string ("DC");
@@ -4388,6 +4341,8 @@  mangle_decomp (const tree decl, vec<tree
 
   if (nested)
     write_char ('E');
+  else if (local && DECL_DISCRIMINATOR_P (decl))
+    write_discriminator (discriminator_for_local_entity (decl));
 
   tree id = finish_mangling_get_identifier ();
   if (DEBUG_MANGLE)
@@ -4564,6 +4519,13 @@  write_guarded_var_name (const tree varia
     /* The name of a guard variable for a reference temporary should refer
        to the reference, not the temporary.  */
     write_string (IDENTIFIER_POINTER (DECL_NAME (variable)) + 4);
+  else if (DECL_DECOMPOSITION_P (variable)
+	   && DECL_NAME (variable) == NULL_TREE
+	   && startswith (IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (variable)),
+			  "_Z"))
+    /* The name of a guard variable for a structured binding needs special
+       casing.  */
+    write_string (IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (variable)) + 2);
   else
     write_name (variable, /*ignore_local_scope=*/0);
 }
@@ -4630,7 +4592,10 @@  mangle_ref_init_variable (const tree var
   start_mangling (variable);
   write_string ("_ZGR");
   check_abi_tags (variable);
-  write_name (variable, /*ignore_local_scope=*/0);
+  if (DECL_DECOMPOSITION_P (variable))
+    write_guarded_var_name (variable);
+  else
+    write_name (variable, /*ignore_local_scope=*/0);
   /* Avoid name clashes with aggregate initialization of multiple
      references at once.  */
   write_compact_number (current_ref_temp_count++);
--- gcc/testsuite/g++.dg/cpp2a/decomp8.C.jj	2023-08-21 10:56:02.653308449 +0200
+++ gcc/testsuite/g++.dg/cpp2a/decomp8.C	2023-08-21 11:05:13.349163678 +0200
@@ -0,0 +1,70 @@ 
+// PR c++/111069
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+extern int a[2];
+struct Y { int b, c, d; };
+
+inline void
+freddy ()
+{
+  static auto [i, j] = a;		// { dg-warning "structured bindings only available with" "" { target c++14_down } }
+					// { dg-warning "structured binding declaration can be 'static' only in" "" { target c++17_down } .-1 }
+  static auto [k, l] = a;		// { dg-warning "structured bindings only available with" "" { target c++14_down } }
+					// { dg-warning "structured binding declaration can be 'static' only in" "" { target c++17_down } .-1 }
+  {
+    static auto [i, j] = a;		// { dg-warning "structured bindings only available with" "" { target c++14_down } }
+					// { dg-warning "structured binding declaration can be 'static' only in" "" { target c++17_down } .-1 }
+    static auto [k, l] = a;		// { dg-warning "structured bindings only available with" "" { target c++14_down } }
+					// { dg-warning "structured binding declaration can be 'static' only in" "" { target c++17_down } .-1 }
+  }
+  {
+    static auto [i, j] = a;		// { dg-warning "structured bindings only available with" "" { target c++14_down } }
+					// { dg-warning "structured binding declaration can be 'static' only in" "" { target c++17_down } .-1 }
+    static auto [k, l] = a;		// { dg-warning "structured bindings only available with" "" { target c++14_down } }
+					// { dg-warning "structured binding declaration can be 'static' only in" "" { target c++17_down } .-1 }
+  }
+}
+
+namespace N
+{
+  namespace M
+  {
+    template <int N>
+    inline void corge ()
+    {
+      static auto [i, j] = a;		// { dg-warning "structured bindings only available with" "" { target c++14_down } }
+					// { dg-warning "structured binding declaration can be 'static' only in" "" { target c++17_down } .-1 }
+      static auto && [u, v, w] = Y{};	// { dg-warning "structured bindings only available with" "" { target c++14_down } }
+					// { dg-warning "structured binding declaration can be 'static' only in" "" { target c++17_down } .-1 }
+      ++u;
+      {
+	static auto && [u, v, w] = Y{};	// { dg-warning "structured bindings only available with" "" { target c++14_down } }
+					// { dg-warning "structured binding declaration can be 'static' only in" "" { target c++17_down } .-1 }
+	++v;
+      }
+    }
+  }
+}
+
+void (*p) () = &freddy;
+void (*q) () = N::M::corge<3>;
+
+// { dg-final { scan-assembler "_ZZ6freddyvEDC1i1jE" } }
+// { dg-final { scan-assembler "_ZZ6freddyvEDC1i1jE_0" } }
+// { dg-final { scan-assembler "_ZZ6freddyvEDC1i1jE_1" } }
+// { dg-final { scan-assembler "_ZZ6freddyvEDC1k1lE" } }
+// { dg-final { scan-assembler "_ZZ6freddyvEDC1k1lE_0" } }
+// { dg-final { scan-assembler "_ZZ6freddyvEDC1k1lE_1" } }
+// { dg-final { scan-assembler "_ZZN1N1M5corgeILi3EEEvvEDC1i1jE" } }
+// { dg-final { scan-assembler "_ZZN1N1M5corgeILi3EEEvvEDC1u1v1wE" } }
+// { dg-final { scan-assembler "_ZZN1N1M5corgeILi3EEEvvEDC1u1v1wE_0" } }
+// { dg-final { scan-assembler "_ZGVZ6freddyvEDC1i1jE" } }
+// { dg-final { scan-assembler "_ZGVZ6freddyvEDC1i1jE_0" } }
+// { dg-final { scan-assembler "_ZGVZ6freddyvEDC1i1jE_1" } }
+// { dg-final { scan-assembler "_ZGVZ6freddyvEDC1k1lE" } }
+// { dg-final { scan-assembler "_ZGVZ6freddyvEDC1k1lE_0" } }
+// { dg-final { scan-assembler "_ZGVZ6freddyvEDC1k1lE_1" } }
+// { dg-final { scan-assembler "_ZGVZN1N1M5corgeILi3EEEvvEDC1i1jE" } }
+// { dg-final { scan-assembler "_ZGRZN1N1M5corgeILi3EEEvvEDC1u1v1wE_" } }
+// { dg-final { scan-assembler "_ZGRZN1N1M5corgeILi3EEEvvEDC1u1v1wE_0_" } }