The code coverage support uses counters to determine which edges in the control
flow graph were executed. If a counter overflows, then the code coverage
information is invalid. Therefore, the counter type should be a 64-bit integer:
a 32-bit counter can wrap after roughly four billion executions, which a hot
edge in a long-running program may well reach.
In multithreaded applications, it is important that the counter increments are
atomic. This is not the case by default. The user can enable atomic counter
increments through the -fprofile-update=atomic and
-fprofile-update=prefer-atomic options.
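For example, a multithreaded program instrumented for coverage could be built
with atomic counter updates like this (the file and program names are made up
for illustration):

  gcc --coverage -fprofile-update=atomic -pthread -o app app.c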
If the hardware supports 64-bit atomic operations, then everything is fine. If
not, and the user selected -fprofile-update=prefer-atomic, then non-atomic
counter increments will be used. However, if the hardware does not support the
required atomic operations and the user selected -fprofile-update=atomic, then
a warning was issued and a fallback to non-atomic operations was forced. This
is probably not what a user wants. There is still hardware on the market which
does not have atomic operations yet is used for multithreaded applications. A
user who selects -fprofile-update=atomic wants consistent code coverage data,
not random data.
This patch removes the fallback to non-atomic operations for
-fprofile-update=atomic. If atomic operations in hardware are not available,
then a library call to libatomic is emitted. To mitigate potential performance
issues, an optimization is provided for systems which support only 32-bit
atomic operations. Here, the edge counter increments are done like this:
low = __atomic_add_fetch_4 (&counter.low, 1, MEMMODEL_RELAXED);
high_inc = low == 0 ? 1 : 0;
__atomic_add_fetch_4 (&counter.high, high_inc, MEMMODEL_RELAXED);
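For illustration, here is a sketch of the same carry-propagation scheme in
plain C using the GCC __atomic builtins; the struct and function names are
made up for this example, the real implementation emits the equivalent GIMPLE
directly:

  /* Sketch: emulate one 64-bit counter increment with two relaxed
     32-bit atomic increments.  The low word is incremented first; if
     it wrapped around to zero, a carry of one is added to the high
     word.  */
  struct counter { unsigned int low; unsigned int high; };

  static void
  counter_increment (struct counter *c)
  {
    unsigned int low = __atomic_add_fetch (&c->low, 1, __ATOMIC_RELAXED);
    unsigned int high_inc = low == 0 ? 1 : 0;
    __atomic_add_fetch (&c->high, high_inc, __ATOMIC_RELAXED);
  }

A concurrent reader may observe the low word already wrapped while the high
word is not yet incremented; this window is the price for avoiding a library
call on the hot path and is documented in the invoke.texi change below.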
In gimple_gen_time_profiler() this split operation cannot be used, since the
updated counter value is also required. Here, a library call is emitted. This
is not a performance issue since the update is only done if counters[0] == 0.
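To make the distinction concrete, here is a sketch of the library-call form
(the function name is made up for this example). The split scheme yields the
two halves at different times and thus cannot return a consistent 64-bit
result, so a single 64-bit atomic operation is used, which becomes a libatomic
call on targets without 64-bit atomics in hardware:

  /* Sketch: increment the counter and read back the updated 64-bit
     value in one atomic operation.  */
  static unsigned long long
  counter_increment_and_read (unsigned long long *c)
  {
    return __atomic_add_fetch (c, 1, __ATOMIC_RELAXED);
  }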
gcc/ChangeLog:
* tree-profile.cc (split_atomic_increment): New.
(gimple_gen_edge_profiler): Split the atomic edge counter increment into
two 32-bit atomic operations if necessary.
(tree_profiling): Remove profile update warning and fallback. Set
split_atomic_increment if necessary.
* doc/invoke.texi (-fprofile-update): Clarify default method. Document
the atomic method behaviour.
---
v2:
* Check gcov_type_size for split_atomic_increment.
* Update documentation.
gcc/doc/invoke.texi | 16 ++++++++-
gcc/tree-profile.cc | 81 +++++++++++++++++++++++++++++++++------------
2 files changed, 74 insertions(+), 23 deletions(-)
@@ -16457,7 +16457,21 @@ while the second one prevents profile corruption by emitting thread-safe code.
Using @samp{prefer-atomic} would be transformed either to @samp{atomic},
when supported by a target, or to @samp{single} otherwise. The GCC driver
automatically selects @samp{prefer-atomic} when @option{-pthread}
-is present in the command line.
+is present in the command line; otherwise, the default method is @samp{single}.
+
+If @samp{atomic} is selected, then the profile information is updated using
+atomic operations. If the target does not support atomic operations in
+hardware, then function calls to @code{__atomic_fetch_add_4} or
+@code{__atomic_fetch_add_8} are emitted. These functions are usually provided
+by the @file{libatomic} runtime library. Not all targets provide the
+@file{libatomic} runtime library. If it is not available for the target, then
+a linker error may occur. Using function calls to update the profiling
+information may be a performance issue. For targets which use 64-bit counters
+for the profiling information and support only 32-bit atomic operations, the
+performance-critical profiling updates are done using two 32-bit atomic
+operations for each counter update. If a signal interrupts these two
+operations updating a counter, then the profiling information may be in an
+inconsistent state.
@item -fprofile-filter-files=@var{regex}
@opindex fprofile-filter-files
@@ -73,6 +73,17 @@ static GTY(()) tree ic_tuple_var;
static GTY(()) tree ic_tuple_counters_field;
static GTY(()) tree ic_tuple_callee_field;
+/* If the user selected atomic profile counter updates
+ (-fprofile-update=atomic), then the counter updates will be done atomically.
+ Ideally, this is done through atomic operations in hardware. If the
+ hardware supports only 32-bit atomic increments and gcov_type_node is a
+ 64-bit integer type, then for the profile edge counters the increment is
+ performed through two separate 32-bit atomic increments. This case is
+ indicated by the split_atomic_increment variable being true. If the
+ hardware does not support atomic operations at all, then a library call to
+ libatomic is emitted. */
+static bool split_atomic_increment;
+
/* Do initialization work for the edge profiler. */
/* Add code:
@@ -242,30 +253,59 @@ gimple_init_gcov_profiler (void)
void
gimple_gen_edge_profiler (int edgeno, edge e)
{
- tree one;
-
- one = build_int_cst (gcov_type_node, 1);
+ const char *name = "PROF_edge_counter";
+ tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
+ tree one = build_int_cst (gcov_type_node, 1);
if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
{
- /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
- tree addr = tree_coverage_counter_addr (GCOV_COUNTER_ARCS, edgeno);
- tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
- ? BUILT_IN_ATOMIC_FETCH_ADD_8:
- BUILT_IN_ATOMIC_FETCH_ADD_4);
- gcall *stmt = gimple_build_call (f, 3, addr, one,
- build_int_cst (integer_type_node,
- MEMMODEL_RELAXED));
- gsi_insert_on_edge (e, stmt);
+ tree addr = build_fold_addr_expr (ref);
+ tree relaxed = build_int_cst (integer_type_node, MEMMODEL_RELAXED);
+ if (!split_atomic_increment)
+ {
+ /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
+ tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
+ ? BUILT_IN_ATOMIC_FETCH_ADD_8:
+ BUILT_IN_ATOMIC_FETCH_ADD_4);
+ gcall *stmt = gimple_build_call (f, 3, addr, one, relaxed);
+ gsi_insert_on_edge (e, stmt);
+ }
+ else
+ {
+ /* low = __atomic_add_fetch_4 (addr, 1, MEMMODEL_RELAXED);
+ high_inc = low == 0 ? 1 : 0;
+ __atomic_add_fetch_4 (addr_high, high_inc, MEMMODEL_RELAXED); */
+ tree zero32 = build_zero_cst (uint32_type_node);
+ tree one32 = build_one_cst (uint32_type_node);
+ tree addr_high = make_temp_ssa_name (TREE_TYPE (addr), NULL, name);
+ gimple *stmt = gimple_build_assign (addr_high, POINTER_PLUS_EXPR,
+ addr,
+ build_int_cst (size_type_node,
+ 4));
+ gsi_insert_on_edge (e, stmt);
+ if (WORDS_BIG_ENDIAN)
+ std::swap (addr, addr_high);
+ tree f = builtin_decl_explicit (BUILT_IN_ATOMIC_ADD_FETCH_4);
+ stmt = gimple_build_call (f, 3, addr, one, relaxed);
+ tree low = make_temp_ssa_name (uint32_type_node, NULL, name);
+ gimple_call_set_lhs (stmt, low);
+ gsi_insert_on_edge (e, stmt);
+ tree is_zero = make_temp_ssa_name (boolean_type_node, NULL, name);
+ stmt = gimple_build_assign (is_zero, EQ_EXPR, low, zero32);
+ gsi_insert_on_edge (e, stmt);
+ tree high_inc = make_temp_ssa_name (uint32_type_node, NULL, name);
+ stmt = gimple_build_assign (high_inc, COND_EXPR, is_zero, one32,
+ zero32);
+ gsi_insert_on_edge (e, stmt);
+ stmt = gimple_build_call (f, 3, addr_high, high_inc, relaxed);
+ gsi_insert_on_edge (e, stmt);
+ }
}
else
{
- tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
- tree gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node,
- NULL, "PROF_edge_counter");
+ tree gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node, NULL, name);
gassign *stmt1 = gimple_build_assign (gcov_type_tmp_var, ref);
- gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node,
- NULL, "PROF_edge_counter");
+ gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node, NULL, name);
gassign *stmt2 = gimple_build_assign (gcov_type_tmp_var, PLUS_EXPR,
gimple_assign_lhs (stmt1), one);
gassign *stmt3 = gimple_build_assign (unshare_expr (ref),
@@ -710,11 +750,8 @@ tree_profiling (void)
if (flag_profile_update == PROFILE_UPDATE_ATOMIC
&& !can_support_atomic)
- {
- warning (0, "target does not support atomic profile update, "
- "single mode is selected");
- flag_profile_update = PROFILE_UPDATE_SINGLE;
- }
+ split_atomic_increment = gcov_type_size == 8
+ && (HAVE_sync_compare_and_swapsi || HAVE_atomic_compare_and_swapsi);
else if (flag_profile_update == PROFILE_UPDATE_PREFER_ATOMIC)
flag_profile_update = can_support_atomic
? PROFILE_UPDATE_ATOMIC : PROFILE_UPDATE_SINGLE;