From patchwork Fri Dec 16 10:25:21 2022
From: Sebastian Huber
To: gcc-patches@gcc.gnu.org
Cc: Richard Biener
Subject: [PATCH v2] gcov: Fix -fprofile-update=atomic
Date: Fri, 16 Dec 2022 11:25:21 +0100
Message-Id: <20221216102521.73271-1-sebastian.huber@embedded-brains.de>

The code coverage support uses counters to determine which edges in the
control flow graph were
executed.  If a counter overflows, then the code coverage information is
invalid.  Therefore, the counter type should be a 64-bit integer.  In
multithreaded applications, it is important that the counter increments are
atomic.  This is not the case by default.  The user can enable atomic counter
increments through the -fprofile-update=atomic and
-fprofile-update=prefer-atomic options.

If the hardware supports 64-bit atomic operations, then everything is fine.
If not and -fprofile-update=prefer-atomic was chosen by the user, then
non-atomic counter increments will be used.  However, if the hardware does not
support the required atomic operations and -fprofile-update=atomic was chosen
by the user, then a warning was issued and a forced fallback to non-atomic
operations was done.  This is probably not what a user wants.  There is still
hardware on the market which does not have atomic operations and is used for
multithreaded applications.  A user who selects -fprofile-update=atomic wants
consistent code coverage data and not random data.

This patch removes the fallback to non-atomic operations for
-fprofile-update=atomic.  If atomic operations in hardware are not available,
then a library call to libatomic is emitted.  To mitigate potential
performance issues, an optimization for systems which only support 32-bit
atomic operations is provided.  Here, the edge counter increments are done
like this:

  low = __atomic_add_fetch_4 (&counter.low, 1, MEMMODEL_RELAXED);
  high_inc = low == 0 ? 1 : 0;
  __atomic_add_fetch_4 (&counter.high, high_inc, MEMMODEL_RELAXED);

In gimple_gen_time_profiler() this split operation cannot be used, since the
updated counter value is also required.  Here, a library call is emitted.
This is not a performance issue since the update is only done if
counters[0] == 0.

gcc/ChangeLog:

	* tree-profile.cc (split_atomic_increment): New.
	(gimple_gen_edge_profiler): Split the atomic edge counter increment
	in two 32-bit atomic operations if necessary.
	(tree_profiling): Remove profile update warning and fallback.  Set
	split_atomic_increment if necessary.
	* doc/invoke.texi (-fprofile-update): Clarify default method.
	Document the atomic method behaviour.
---
v2:

* Check gcov_type_size for split_atomic_increment.

* Update documentation.

 gcc/doc/invoke.texi | 16 ++++++++-
 gcc/tree-profile.cc | 81 +++++++++++++++++++++++++++++++++------------
 2 files changed, 74 insertions(+), 23 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a50417a4ab7..6b32b659e50 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16457,7 +16457,21 @@ while the second one prevents profile corruption by emitting thread-safe code.
 Using @samp{prefer-atomic} would be transformed either to @samp{atomic},
 when supported by a target, or to @samp{single} otherwise.  The GCC driver
 automatically selects @samp{prefer-atomic} when @option{-pthread}
-is present in the command line.
+is present in the command line, otherwise the default method is @samp{single}.
+
+If @samp{atomic} is selected, then the profile information is updated using
+atomic operations.  If the target does not support atomic operations in
+hardware, then function calls to @code{__atomic_fetch_add_4} or
+@code{__atomic_fetch_add_8} are emitted.  These functions are usually provided
+by the @file{libatomic} runtime library.  Not all targets provide the
+@file{libatomic} runtime library.  If it is not available for the target, then
+a linker error may happen.  Using function calls to update the profiling
+information may be a performance issue.  For targets which use 64-bit counters
+for the profiling information and support only 32-bit atomic operations, the
+performance critical profiling updates are done using two 32-bit atomic
+operations for each counter update.  If a signal interrupts these two
+operations updating a counter, then the profiling information may be in an
+inconsistent state.
 
 @item -fprofile-filter-files=@var{regex}
 @opindex fprofile-filter-files
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 2beb49241f2..49c8caeae18 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -73,6 +73,17 @@ static GTY(()) tree ic_tuple_var;
 static GTY(()) tree ic_tuple_counters_field;
 static GTY(()) tree ic_tuple_callee_field;
 
+/* If the user selected atomic profile counter updates
+   (-fprofile-update=atomic), then the counter updates will be done atomically.
+   Ideally, this is done through atomic operations in hardware.  If the
+   hardware supports only 32-bit atomic increments and gcov_type_node is a
+   64-bit integer type, then for the profile edge counters the increment is
+   performed through two separate 32-bit atomic increments.  This case is
+   indicated by the split_atomic_increment variable being true.  If the
+   hardware does not support atomic operations at all, then a library call to
+   libatomic is emitted.  */
+static bool split_atomic_increment;
+
 /* Do initialization work for the edge profiler.  */
 
 /* Add code:
@@ -242,30 +253,59 @@ gimple_init_gcov_profiler (void)
 void
 gimple_gen_edge_profiler (int edgeno, edge e)
 {
-  tree one;
-
-  one = build_int_cst (gcov_type_node, 1);
+  const char *name = "PROF_edge_counter";
+  tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
+  tree one = build_int_cst (gcov_type_node, 1);
 
   if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
     {
-      /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
-      tree addr = tree_coverage_counter_addr (GCOV_COUNTER_ARCS, edgeno);
-      tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
-				      ? BUILT_IN_ATOMIC_FETCH_ADD_8:
-				      BUILT_IN_ATOMIC_FETCH_ADD_4);
-      gcall *stmt = gimple_build_call (f, 3, addr, one,
-				       build_int_cst (integer_type_node,
-						      MEMMODEL_RELAXED));
-      gsi_insert_on_edge (e, stmt);
+      tree addr = build_fold_addr_expr (ref);
+      tree relaxed = build_int_cst (integer_type_node, MEMMODEL_RELAXED);
+      if (!split_atomic_increment)
+	{
+	  /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
+	  tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
+					  ? BUILT_IN_ATOMIC_FETCH_ADD_8:
+					  BUILT_IN_ATOMIC_FETCH_ADD_4);
+	  gcall *stmt = gimple_build_call (f, 3, addr, one, relaxed);
+	  gsi_insert_on_edge (e, stmt);
+	}
+      else
+	{
+	  /* low = __atomic_add_fetch_4 (addr, 1, MEMMODEL_RELAXED);
+	     high_inc = low == 0 ? 1 : 0;
+	     __atomic_add_fetch_4 (addr_high, high_inc, MEMMODEL_RELAXED); */
+	  tree zero32 = build_zero_cst (uint32_type_node);
+	  tree one32 = build_one_cst (uint32_type_node);
+	  tree addr_high = make_temp_ssa_name (TREE_TYPE (addr), NULL, name);
+	  gimple *stmt = gimple_build_assign (addr_high, POINTER_PLUS_EXPR,
+					      addr,
+					      build_int_cst (size_type_node,
+							     4));
+	  gsi_insert_on_edge (e, stmt);
+	  if (WORDS_BIG_ENDIAN)
+	    std::swap (addr, addr_high);
+	  tree f = builtin_decl_explicit (BUILT_IN_ATOMIC_ADD_FETCH_4);
+	  stmt = gimple_build_call (f, 3, addr, one, relaxed);
+	  tree low = make_temp_ssa_name (uint32_type_node, NULL, name);
+	  gimple_call_set_lhs (stmt, low);
+	  gsi_insert_on_edge (e, stmt);
+	  tree is_zero = make_temp_ssa_name (boolean_type_node, NULL, name);
+	  stmt = gimple_build_assign (is_zero, EQ_EXPR, low, zero32);
+	  gsi_insert_on_edge (e, stmt);
+	  tree high_inc = make_temp_ssa_name (uint32_type_node, NULL, name);
+	  stmt = gimple_build_assign (high_inc, COND_EXPR, is_zero, one32,
+				      zero32);
+	  gsi_insert_on_edge (e, stmt);
+	  stmt = gimple_build_call (f, 3, addr_high, high_inc, relaxed);
+	  gsi_insert_on_edge (e, stmt);
+	}
     }
   else
     {
-      tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
-      tree gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node,
-						   NULL, "PROF_edge_counter");
+      tree gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node, NULL, name);
       gassign *stmt1 = gimple_build_assign (gcov_type_tmp_var, ref);
-      gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node,
-					      NULL, "PROF_edge_counter");
+      gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node, NULL, name);
       gassign *stmt2 = gimple_build_assign (gcov_type_tmp_var, PLUS_EXPR,
 					    gimple_assign_lhs (stmt1), one);
       gassign *stmt3 = gimple_build_assign (unshare_expr (ref),
@@ -710,11 +750,8 @@ tree_profiling (void)
 
   if (flag_profile_update == PROFILE_UPDATE_ATOMIC
       && !can_support_atomic)
-    {
-      warning (0, "target does not support atomic profile update, "
-	       "single mode is selected");
-      flag_profile_update = PROFILE_UPDATE_SINGLE;
-    }
+    split_atomic_increment = gcov_type_size == 8
+      && (HAVE_sync_compare_and_swapsi || HAVE_atomic_compare_and_swapsi);
   else if (flag_profile_update == PROFILE_UPDATE_PREFER_ATOMIC)
     flag_profile_update = can_support_atomic
       ? PROFILE_UPDATE_ATOMIC : PROFILE_UPDATE_SINGLE;