From patchwork Fri Dec 16 10:25:21 2022
From: Sebastian Huber
To: gcc-patches@gcc.gnu.org
Cc: Richard Biener
Subject: [PATCH v2] gcov: Fix -fprofile-update=atomic
Date: Fri, 16 Dec 2022 11:25:21 +0100
Message-Id: <20221216102521.73271-1-sebastian.huber@embedded-brains.de>

The code coverage support uses counters to determine which edges in the
control flow graph were
executed.  If a counter overflows, then the code coverage information is
invalid.  Therefore, the counter type should be a 64-bit integer.  In
multithreaded applications, it is important that the counter increments are
atomic.  This is not the case by default.  The user can enable atomic counter
increments through the -fprofile-update=atomic and
-fprofile-update=prefer-atomic options.

If the hardware supports 64-bit atomic operations, then everything is fine.
If not and -fprofile-update=prefer-atomic was chosen by the user, then
non-atomic counter increments will be used.  However, if the hardware does not
support the required atomic operations and -fprofile-update=atomic was chosen
by the user, then a warning was issued and a forced fallback to non-atomic
operations was done.  This is probably not what a user wants.  There is still
hardware on the market which does not have atomic operations and is used for
multithreaded applications.  A user who selects -fprofile-update=atomic wants
consistent code coverage data and not random data.

This patch removes the fallback to non-atomic operations for
-fprofile-update=atomic.  If atomic operations in hardware are not available,
then a library call to libatomic is emitted.  To mitigate potential
performance issues, an optimization for systems which only support 32-bit
atomic operations is provided.  Here, the edge counter increments are done
like this:

  low = __atomic_add_fetch_4 (&counter.low, 1, MEMMODEL_RELAXED);
  high_inc = low == 0 ? 1 : 0;
  __atomic_add_fetch_4 (&counter.high, high_inc, MEMMODEL_RELAXED);

In gimple_gen_time_profiler() this split operation cannot be used, since the
updated counter value is also required.  Here, a library call is emitted.
This is not a performance issue since the update is only done if
counters[0] == 0.

gcc/ChangeLog:

	* tree-profile.cc (split_atomic_increment): New.
	(gimple_gen_edge_profiler): Split the atomic edge counter increment
	in two 32-bit atomic operations if necessary.
	(tree_profiling): Remove profile update warning and fallback.  Set
	split_atomic_increment if necessary.
	* doc/invoke.texi (-fprofile-update): Clarify default method.
	Document the atomic method behaviour.
---
v2:

* Check gcov_type_size for split_atomic_increment.

* Update documentation.

 gcc/doc/invoke.texi | 16 ++++++++-
 gcc/tree-profile.cc | 81 +++++++++++++++++++++++++++++++++------------
 2 files changed, 74 insertions(+), 23 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a50417a4ab7..6b32b659e50 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16457,7 +16457,21 @@ while the second one prevents profile corruption by emitting thread-safe code.
 Using @samp{prefer-atomic} would be transformed either to @samp{atomic},
 when supported by a target, or to @samp{single} otherwise.  The GCC driver
 automatically selects @samp{prefer-atomic} when @option{-pthread}
-is present in the command line.
+is present in the command line, otherwise the default method is @samp{single}.
+
+If @samp{atomic} is selected, then the profile information is updated using
+atomic operations.  If the target does not support atomic operations in
+hardware, then function calls to @code{__atomic_fetch_add_4} or
+@code{__atomic_fetch_add_8} are emitted.  These functions are usually provided
+by the @file{libatomic} runtime library.  Not all targets provide the
+@file{libatomic} runtime library.  If it is not available for the target, then
+a linker error may happen.  Using function calls to update the profiling
+information may be a performance issue.  For targets which use 64-bit counters
+for the profiling information and support only 32-bit atomic operations, the
+performance critical profiling updates are done using two 32-bit atomic
+operations for each counter update.  If a signal interrupts these two
+operations updating a counter, then the profiling information may be in an
+inconsistent state.
 
 @item -fprofile-filter-files=@var{regex}
 @opindex fprofile-filter-files
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 2beb49241f2..49c8caeae18 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -73,6 +73,17 @@ static GTY(()) tree ic_tuple_var;
 static GTY(()) tree ic_tuple_counters_field;
 static GTY(()) tree ic_tuple_callee_field;
 
+/* If the user selected atomic profile counter updates
+   (-fprofile-update=atomic), then the counter updates will be done atomically.
+   Ideally, this is done through atomic operations in hardware.  If the
+   hardware supports only 32-bit atomic increments and gcov_type_node is a
+   64-bit integer type, then for the profile edge counters the increment is
+   performed through two separate 32-bit atomic increments.  This case is
+   indicated by the split_atomic_increment variable being true.  If the
+   hardware does not support atomic operations at all, then a library call to
+   libatomic is emitted.  */
+static bool split_atomic_increment;
+
 /* Do initialization work for the edge profiler.  */
 
 /* Add code:
@@ -242,30 +253,59 @@ gimple_init_gcov_profiler (void)
 void
 gimple_gen_edge_profiler (int edgeno, edge e)
 {
-  tree one;
-
-  one = build_int_cst (gcov_type_node, 1);
+  const char *name = "PROF_edge_counter";
+  tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
+  tree one = build_int_cst (gcov_type_node, 1);
 
   if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
     {
-      /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
-      tree addr = tree_coverage_counter_addr (GCOV_COUNTER_ARCS, edgeno);
-      tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
-				      ? BUILT_IN_ATOMIC_FETCH_ADD_8:
-				      BUILT_IN_ATOMIC_FETCH_ADD_4);
-      gcall *stmt = gimple_build_call (f, 3, addr, one,
-				       build_int_cst (integer_type_node,
-						      MEMMODEL_RELAXED));
-      gsi_insert_on_edge (e, stmt);
+      tree addr = build_fold_addr_expr (ref);
+      tree relaxed = build_int_cst (integer_type_node, MEMMODEL_RELAXED);
+      if (!split_atomic_increment)
+	{
+	  /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
+	  tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
+					  ? BUILT_IN_ATOMIC_FETCH_ADD_8:
+					  BUILT_IN_ATOMIC_FETCH_ADD_4);
+	  gcall *stmt = gimple_build_call (f, 3, addr, one, relaxed);
+	  gsi_insert_on_edge (e, stmt);
+	}
+      else
+	{
+	  /* low = __atomic_add_fetch_4 (addr, 1, MEMMODEL_RELAXED);
+	     high_inc = low == 0 ? 1 : 0;
+	     __atomic_add_fetch_4 (addr_high, high_inc, MEMMODEL_RELAXED); */
+	  tree zero32 = build_zero_cst (uint32_type_node);
+	  tree one32 = build_one_cst (uint32_type_node);
+	  tree addr_high = make_temp_ssa_name (TREE_TYPE (addr), NULL, name);
+	  gimple *stmt = gimple_build_assign (addr_high, POINTER_PLUS_EXPR,
+					      addr,
+					      build_int_cst (size_type_node,
+							     4));
+	  gsi_insert_on_edge (e, stmt);
+	  if (WORDS_BIG_ENDIAN)
+	    std::swap (addr, addr_high);
+	  tree f = builtin_decl_explicit (BUILT_IN_ATOMIC_ADD_FETCH_4);
+	  stmt = gimple_build_call (f, 3, addr, one, relaxed);
+	  tree low = make_temp_ssa_name (uint32_type_node, NULL, name);
+	  gimple_call_set_lhs (stmt, low);
+	  gsi_insert_on_edge (e, stmt);
+	  tree is_zero = make_temp_ssa_name (boolean_type_node, NULL, name);
+	  stmt = gimple_build_assign (is_zero, EQ_EXPR, low, zero32);
+	  gsi_insert_on_edge (e, stmt);
+	  tree high_inc = make_temp_ssa_name (uint32_type_node, NULL, name);
+	  stmt = gimple_build_assign (high_inc, COND_EXPR, is_zero, one32,
+				      zero32);
+	  gsi_insert_on_edge (e, stmt);
+	  stmt = gimple_build_call (f, 3, addr_high, high_inc, relaxed);
+	  gsi_insert_on_edge (e, stmt);
+	}
     }
   else
     {
-      tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
-      tree gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node,
-						   NULL, "PROF_edge_counter");
+      tree gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node, NULL, name);
       gassign *stmt1 = gimple_build_assign (gcov_type_tmp_var, ref);
-      gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node,
-					      NULL, "PROF_edge_counter");
+      gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node, NULL, name);
       gassign *stmt2 = gimple_build_assign (gcov_type_tmp_var, PLUS_EXPR,
 					    gimple_assign_lhs (stmt1), one);
       gassign *stmt3 = gimple_build_assign (unshare_expr (ref),
@@ -710,11 +750,8 @@ tree_profiling (void)
 
   if (flag_profile_update == PROFILE_UPDATE_ATOMIC
       && !can_support_atomic)
-    {
-      warning (0, "target does not support atomic profile update, "
-	       "single mode is selected");
-      flag_profile_update = PROFILE_UPDATE_SINGLE;
-    }
+    split_atomic_increment = gcov_type_size == 8
+      && (HAVE_sync_compare_and_swapsi || HAVE_atomic_compare_and_swapsi);
   else if (flag_profile_update == PROFILE_UPDATE_PREFER_ATOMIC)
     flag_profile_update = can_support_atomic
       ? PROFILE_UPDATE_ATOMIC : PROFILE_UPDATE_SINGLE;