c++: Check for indirect change of active union member in constexpr [PR101631]

Message ID ZO30PQql2TablzpJ@Thaum.localdomain
State Accepted
Headers
Series c++: Check for indirect change of active union member in constexpr [PR101631] |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Nathaniel Shead Aug. 29, 2023, 1:35 p.m. UTC
  This is an attempt to improve the constexpr machinery's handling of
union lifetime by catching more cases that cause UB. Is this approach
OK? 

I'd also like some feedback on a couple of pain points with this
implementation; in particular, is there a good way to detect if a type
has a non-deleted trivial constructor? I've used 'is_trivially_xible' in
this patch, but that also checks for a trivial destructor which by my
reading of [class.union.general]p5 is possibly incorrect. Checking for a
trivial default constructor doesn't seem too hard but I couldn't find a
good way of checking if that constructor is deleted.

I'm also generally unsatisfied with the additional complexity with the
third 'refs' argument in 'cxx_eval_store_expression' being pushed and
popped; would it be better to replace this with a vector of some
specific structure type for the data that needs to be passed on?

I've also just realised that empty unions are not well handled by this
patch either, but that seems to be a more general issue (see PR111226
I've just opened) so I'm leaving that be for now.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

-- 8< --

This patch adds checks for attempting to change the active member of a
union by methods other than a member access expression.

To be able to properly distinguish `*(&u.a) = ` from `u.a = `, this
patch redoes the solution for PR59950 to avoid extranneous *&; it seems
that the only case that needed the workaround for the PR was the copying
empty class case.

Additionally, this patch ensures that constructors for a union field
mark that field as the active member before entering the call itself;
this ensures that modifications of the field within the construction
body don't cause false positives (as these will not appear to be member
access expressions). This means that we no longer need to start the
lifetime of empty union members after the constructor body either.

	PR c++/101631

gcc/cp/ChangeLog:

	* call.cc (build_over_call): Fold indirect refs for trivial
	assignment operators when possible.
	* constexpr.cc (cxx_eval_call_expression): Start lifetime of
	union member before entering constructor call.
	(cxx_eval_store_expression): Check for accessing incorrect union
	member indirectly.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/constexpr-union2.C: New test.
	* g++.dg/cpp2a/constexpr-union3.C: New test.
	* g++.dg/cpp2a/constexpr-union4.C: New test.
	* g++.dg/cpp2a/constexpr-union5.C: New test.

Signed-Off-By: Nathaniel Shead <nathanieloshead@gmail.com>
---
 gcc/cp/call.cc                                |  10 +-
 gcc/cp/constexpr.cc                           | 103 ++++++++++++------
 gcc/testsuite/g++.dg/cpp2a/constexpr-union2.C |  30 +++++
 gcc/testsuite/g++.dg/cpp2a/constexpr-union3.C |  42 +++++++
 gcc/testsuite/g++.dg/cpp2a/constexpr-union4.C |  29 +++++
 gcc/testsuite/g++.dg/cpp2a/constexpr-union5.C |  46 ++++++++
 6 files changed, 223 insertions(+), 37 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-union2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-union3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-union4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-union5.C
  

Comments

Jason Merrill Aug. 30, 2023, 8:28 p.m. UTC | #1
On 8/29/23 09:35, Nathaniel Shead wrote:
> This is an attempt to improve the constexpr machinery's handling of
> union lifetime by catching more cases that cause UB. Is this approach
> OK?
> 
> I'd also like some feedback on a couple of pain points with this
> implementation; in particular, is there a good way to detect if a type
> has a non-deleted trivial constructor? I've used 'is_trivially_xible' in
> this patch, but that also checks for a trivial destructor which by my
> reading of [class.union.general]p5 is possibly incorrect. Checking for a
> trivial default constructor doesn't seem too hard but I couldn't find a
> good way of checking if that constructor is deleted.

I guess the simplest would be

(TYPE_HAS_TRIVIAL_DFLT (t) && locate_ctor (t))

because locate_ctor returns null for a deleted default ctor.  It would 
be good to make this a separate predicate.

> I'm also generally unsatisfied with the additional complexity with the
> third 'refs' argument in 'cxx_eval_store_expression' being pushed and
> popped; would it be better to replace this with a vector of some
> specific structure type for the data that needs to be passed on?

Perhaps, but what you have here is fine.  Another possibility would be 
to just have a vec of the refs and extract the index from the ref later 
as needed.

Jason
  

Patch

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 23e458d3252..cb85e9a6694 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -10358,10 +10358,7 @@  build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 	   && DECL_OVERLOADED_OPERATOR_IS (fn, NOP_EXPR)
 	   && trivial_fn_p (fn))
     {
-      /* Don't use cp_build_fold_indirect_ref, op= returns an lvalue even if
-	 the object argument isn't one.  */
-      tree to = cp_build_indirect_ref (input_location, argarray[0],
-				       RO_ARROW, complain);
+      tree to = cp_build_fold_indirect_ref (argarray[0]);
       tree type = TREE_TYPE (to);
       tree as_base = CLASSTYPE_AS_BASE (type);
       tree arg = argarray[1];
@@ -10369,6 +10366,11 @@  build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 
       if (is_really_empty_class (type, /*ignore_vptr*/true))
 	{
+	  /* Ensure op= returns an lvalue even if the object argument isn't one.
+	     This isn't needed in other cases since MODIFY_EXPR is always
+	     considered an lvalue.  */
+	  to = cp_build_addr_expr (to, tf_none);
+	  to = cp_build_indirect_ref (input_location, to, RO_ARROW, complain);
 	  /* Avoid copying empty classes.  */
 	  val = build2 (COMPOUND_EXPR, type, arg, to);
 	  suppress_warning (val, OPT_Wunused);
diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 8bd5c4a47f8..811534586da 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -3141,40 +3141,34 @@  cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 	    cxx_set_object_constness (ctx, new_obj, /*readonly_p=*/false,
 				      non_constant_p, overflow_p);
 
+	  /* If this is a constructor, we are beginning the lifetime of the
+	     object we are initialising.  */
+	  if (new_obj
+	      && DECL_CONSTRUCTOR_P (fun)
+	      && TREE_CODE (new_obj) == COMPONENT_REF
+	      && TREE_CODE (TREE_TYPE (TREE_OPERAND (new_obj, 0))) == UNION_TYPE)
+	    {
+	      tree activate = build2 (INIT_EXPR, TREE_TYPE (new_obj),
+				      new_obj,
+				      build_constructor (TREE_TYPE (new_obj),
+							 NULL));
+	      cxx_eval_constant_expression (ctx, activate,
+					    lval, non_constant_p, overflow_p);
+	      ggc_free (activate);
+	    }
+
 	  tree jump_target = NULL_TREE;
 	  cxx_eval_constant_expression (&call_ctx, body,
 					vc_discard, non_constant_p, overflow_p,
 					&jump_target);
 
 	  if (DECL_CONSTRUCTOR_P (fun))
-	    {
-	      /* This can be null for a subobject constructor call, in
-		 which case what we care about is the initialization
-		 side-effects rather than the value.  We could get at the
-		 value by evaluating *this, but we don't bother; there's
-		 no need to put such a call in the hash table.  */
-	      result = lval ? ctx->object : ctx->ctor;
-
-	      /* If we've just evaluated a subobject constructor call for an
-		 empty union member, it might not have produced a side effect
-		 that actually activated the union member.  So produce such a
-		 side effect now to ensure the union appears initialized.  */
-	      if (!result && new_obj
-		  && TREE_CODE (new_obj) == COMPONENT_REF
-		  && TREE_CODE (TREE_TYPE
-				(TREE_OPERAND (new_obj, 0))) == UNION_TYPE
-		  && is_really_empty_class (TREE_TYPE (new_obj),
-					    /*ignore_vptr*/false))
-		{
-		  tree activate = build2 (MODIFY_EXPR, TREE_TYPE (new_obj),
-					  new_obj,
-					  build_constructor (TREE_TYPE (new_obj),
-							     NULL));
-		  cxx_eval_constant_expression (ctx, activate, lval,
-						non_constant_p, overflow_p);
-		  ggc_free (activate);
-		}
-	    }
+	    /* This can be null for a subobject constructor call, in
+	       which case what we care about is the initialization
+	       side-effects rather than the value.  We could get at the
+	       value by evaluating *this, but we don't bother; there's
+	       no need to put such a call in the hash table.  */
+	    result = lval ? ctx->object : ctx->ctor;
 	  else if (VOID_TYPE_P (TREE_TYPE (res)))
 	    result = void_node;
 	  else
@@ -6049,6 +6043,14 @@  cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
 						  mutable_p)
 		     && const_object_being_modified == NULL_TREE)
 	      const_object_being_modified = probe;
+
+	    /* Track named member accesses for unions to validate modifications
+	       that change active member.  */
+	    if (!evaluated && TREE_CODE (probe) == COMPONENT_REF)
+	      vec_safe_push (refs, probe);
+	    else
+	      vec_safe_push (refs, NULL_TREE);
+
 	    vec_safe_push (refs, elt);
 	    vec_safe_push (refs, TREE_TYPE (probe));
 	    probe = ob;
@@ -6057,6 +6059,7 @@  cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
 
 	case REALPART_EXPR:
 	  gcc_assert (probe == target);
+	  vec_safe_push (refs, NULL_TREE);
 	  vec_safe_push (refs, probe);
 	  vec_safe_push (refs, TREE_TYPE (probe));
 	  probe = TREE_OPERAND (probe, 0);
@@ -6064,6 +6067,7 @@  cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
 
 	case IMAGPART_EXPR:
 	  gcc_assert (probe == target);
+	  vec_safe_push (refs, NULL_TREE);
 	  vec_safe_push (refs, probe);
 	  vec_safe_push (refs, TREE_TYPE (probe));
 	  probe = TREE_OPERAND (probe, 0);
@@ -6152,6 +6156,7 @@  cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
       enum tree_code code = TREE_CODE (type);
       tree reftype = refs->pop();
       tree index = refs->pop();
+      bool is_access_expr = refs->pop() != NULL_TREE;
 
       if (code == COMPLEX_TYPE)
 	{
@@ -6192,10 +6197,15 @@  cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
 
       type = reftype;
 
-      if (code == UNION_TYPE && CONSTRUCTOR_NELTS (*valp)
-	  && CONSTRUCTOR_ELT (*valp, 0)->index != index)
+      if (code == UNION_TYPE
+	  && TREE_CODE (t) == MODIFY_EXPR
+	  && (CONSTRUCTOR_NELTS (*valp) == 0
+	      || CONSTRUCTOR_ELT (*valp, 0)->index != index))
 	{
-	  if (cxx_dialect < cxx20)
+	  /* We changed the active member of a union. Ensure that this is
+	     valid.  */
+	  bool has_active_member = CONSTRUCTOR_NELTS (*valp) != 0;
+	  if (has_active_member && cxx_dialect < cxx20)
 	    {
 	      if (!ctx->quiet)
 		error_at (cp_expr_loc_or_input_loc (t),
@@ -6205,8 +6215,35 @@  cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
 			  index);
 	      *non_constant_p = true;
 	    }
-	  else if (TREE_CODE (t) == MODIFY_EXPR
-		   && CONSTRUCTOR_NO_CLEARING (*valp))
+	  else if (!is_access_expr
+		   /* FIXME: don't also check for trivial destructor.  */
+		   || !is_trivially_xible (INIT_EXPR, reftype, make_tree_vec (0)))
+	    {
+	      /* Diagnose changing active union member after initialisation
+		 without a valid member access expression, as described in
+		 [class.union.general] p5.  */
+	      if (!ctx->quiet)
+		{
+		  if (has_active_member)
+		    error_at (cp_expr_loc_or_input_loc (t),
+			      "accessing %qD member instead of initialized "
+			      "%qD member in constant expression",
+			      index, CONSTRUCTOR_ELT (*valp, 0)->index);
+		  else
+		    error_at (cp_expr_loc_or_input_loc (t),
+			      "accessing uninitialized member %qD",
+			      index);
+		  if (is_access_expr)
+		    {
+		      inform (DECL_SOURCE_LOCATION (index),
+			      "%qDs does not implicitly begin its lifetime "
+			      "because %qT does not have a trivial default "
+			      "constructor", index, reftype);
+		    }
+		}
+	      *non_constant_p = true;
+	    }
+	  else if (has_active_member && CONSTRUCTOR_NO_CLEARING (*valp))
 	    {
 	      /* Diagnose changing the active union member while the union
 		 is in the process of being initialized.  */
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-union2.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-union2.C
new file mode 100644
index 00000000000..1712395d7e7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-union2.C
@@ -0,0 +1,30 @@ 
+// PR c++/101631
+// { dg-do compile { target c++20 } }
+
+struct sso {
+  union {
+    int buf[10];
+    int* alloc;
+  };
+};
+
+constexpr bool direct() {
+  sso val;
+  val.alloc = nullptr;
+  val.buf[5] = 42;
+  return true;
+}
+constexpr bool ok = direct();
+
+
+constexpr void perform_assignment(int& left, int right) noexcept {
+  left = right;  // { dg-error "accessing .+ member instead of initialized" }
+}
+
+constexpr bool indirect() {
+  sso val;
+  val.alloc = nullptr;
+  perform_assignment(val.buf[5], 42);  // { dg-message "in .constexpr. expansion" }
+  return true;
+}
+constexpr bool err = indirect();  // { dg-message "in .constexpr. expansion" }
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-union3.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-union3.C
new file mode 100644
index 00000000000..c2f6e5fa08d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-union3.C
@@ -0,0 +1,42 @@ 
+// { dg-do compile { target c++20 } }
+
+struct S
+{
+  union {
+    char buf[8];
+    char* ptr;
+  };
+  unsigned len;
+
+  constexpr S(const char* s, unsigned n)
+  {
+    char* p;
+    if (n > 7)
+      p = ptr = new char[n];
+    else
+      p = buf;
+    for (len = 0; len < n; ++len)
+      p[len] = s[len];  // { dg-error "accessing uninitialized member" }
+    p[len] = '\0';
+  }
+};
+
+constexpr bool test()
+{
+  S s("test", 4);  // { dg-message "in .constexpr. expansion" }
+  return true;
+}
+
+constexpr bool b = test();  // { dg-message "in .constexpr. expansion" }
+
+
+union U { int a; int b; };
+
+constexpr int test2()
+{
+  U u;
+  int* p = &u.a;
+  *p = 5;  // { dg-error "accessing uninitialized member" }
+  return u.a;
+}
+constexpr int x = test2();
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-union4.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-union4.C
new file mode 100644
index 00000000000..429ab203ac2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-union4.C
@@ -0,0 +1,29 @@ 
+// { dg-do compile { target c++20 } }
+
+// from [class.union.general] p5
+
+union A { int x; int y[4]; };
+struct B { A a; };
+union C { B b; int k; };
+constexpr int f() {
+  C c;                  // does not start lifetime of any union member
+  c.b.a.y[3] = 4;       // OK, S(c.b.a.y[3]) contains c.b and c.b.a.y;
+                        // creates objects to hold union members c.b and c.b.a.y
+  return c.b.a.y[3];    // OK, c.b.a.y refers to newly created object (see [basic.life])
+}
+constexpr int a = f();
+
+struct X { const int a; int b; };
+union Y { X x; int k; };// { dg-message "does not implicitly begin its lifetime" }
+constexpr int g() {
+  Y y = { { 1, 2 } };   // OK, y.x is active union member ([class.mem])
+  int n = y.x.a;
+  y.k = 4;              // OK, ends lifetime of y.x, y.k is active member of union
+
+  y.x.b = n;            // { dg-error "accessing .* member instead of initialized .* member" }
+                        // undefined behavior: y.x.b modified outside its lifetime,
+                        // S(y.x.b) is empty because X's default constructor is deleted,
+                        // so union member y.x's lifetime does not implicitly start
+  return 0;
+}
+constexpr int b = g();  // { dg-message "in .constexpr. expansion" }
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-union5.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-union5.C
new file mode 100644
index 00000000000..e18b833db78
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-union5.C
@@ -0,0 +1,46 @@ 
+// { dg-do compile { target c++20 } }
+
+union U { int a; int b; };
+
+constexpr int test1() {
+  U u;
+  u.a = 10;
+  *&u.b = 20;  // { dg-error "accessing" }
+  return u.b;
+}
+constexpr int a = test1();  // { dg-message "in .constexpr. expansion" }
+
+constexpr int test2() {
+  U u;
+  u.a = 10;
+  (0, u.b) = 20;  // { dg-error "accessing" }
+  return u.b;
+}
+constexpr int b = test2();  // { dg-message "in .constexpr. expansion" }
+
+constexpr int test3() {
+  U u;
+  u.a = 0;
+  int* p = &u.b;
+  p[u.a] = 10;  // { dg-error "accessing" }
+  return u.b;
+}
+constexpr int c = test3();  // { dg-message "in .constexpr. expansion" }
+
+constexpr int test4() {
+  U u;
+  u.a = 0;
+  int* p = &u.b;
+  u.a[p] = 10;  // { dg-error "accessing" }
+  return u.b;
+}
+constexpr int d = test4();  // { dg-message "in .constexpr. expansion" }
+
+struct S { U u[10]; };
+constexpr int test5() {
+  S s;
+  s.u[4].a = 10;
+  6[s.u].b = 15;
+  return 4[s.u].a + s.u[6].b;
+}
+static_assert(test5() == 25);