From patchwork Wed Oct 4 12:39:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?J=C3=B8rgen_Kvalsvik?= X-Patchwork-Id: 148289 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:254a:b0:403:3b70:6f57 with SMTP id hf10csp105752vqb; Wed, 4 Oct 2023 05:47:53 -0700 (PDT) X-Google-Smtp-Source: AGHT+IG/vxpmYD95ErqZtm+SL+vVjBFMRhYuTNiO/vBZAm2lZsQoi/xVHskgC/9zh4+uqqkFF1Ox X-Received: by 2002:a17:907:980f:b0:9aa:206d:b052 with SMTP id ji15-20020a170907980f00b009aa206db052mr4351793ejc.27.1696423673796; Wed, 04 Oct 2023 05:47:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696423673; cv=none; d=google.com; s=arc-20160816; b=hxNZ1wVM5FELOvHiJiyTkS2GQWhBuAyKnTQidacTsTh66vbRl0GIN3y6pOMWHlsBEI zC/UQwfvbBp4Spe/Bsgr0VgqI/clVgCV71Fw7UXu1lNTuxfRQkTL0JFKIRr8Z9HWmZOS P2DxHOTqm3webZelOmIk4QJDTKhc6Tsr9+77HbTs0OjSGwhMKvQAG4fPWYxmAHrcwnXW vNUQC23DzBgZJHbyAJK43DHP6zxJTrl+3dciPsTe66S+xwhyS9/1Koxk7gLCw656GKSz RG9mTqtV32EOcRvQ6PXe5B6i07tmBoiWZ/pTmqDRxscqNSc6kt6b7sM1vAd9ZI+RbYG+ xU5A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; bh=p/rfwVI1mLjOsjZEzkhiGqda0o4vTso799sIafKFQ2Q=; fh=9YvKuYzPOH4H8p6ZxADoseEyE0mG6135swhZOYPib+Q=; b=ocZb7NC14WIWTf8Oo3JFN4kKLDqP5tu5KYaUDrAcX1gLBJ2TfVtE1s3zWgoKrJLWFJ 4YPSUCUfw0D0AtKwMwHIGCW6ewFhYt4fL+ERftHsOBVJR6MxQ15BAq5nktoLtgOPvdCc YK0y95xOovKRYJFlUIb+1b0dwpt2PGRwLswln5I0WrgsS0tOe6Ni20P0Z/tXCIjp2uI3 nxJGseCM5r9pGfijdisfuC5c6qxQXi6DDW4elrEbzXNn5Law5RQPBbi8bzRRdizLGwK1 gJkSf/cLo7cEXCJEtns5AKeK0mODBBjvW3uBX7Q4B6il0ecCuS1tFTvsHAbChpZ5eT5E wOow== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kolabnow.com header.s=dkim20160331 header.b=wTz12PfJ; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id j12-20020a170906474c00b009b2f680d77bsi1663384ejs.111.2023.10.04.05.47.53 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Oct 2023 05:47:53 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@kolabnow.com header.s=dkim20160331 header.b=wTz12PfJ; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 15B4638AA25F for ; Wed, 4 Oct 2023 12:42:43 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx.kolabnow.com (mx.kolabnow.com [212.103.80.155]) by sourceware.org (Postfix) with ESMTPS id E6AF8385B524 for ; Wed, 4 Oct 2023 12:40:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E6AF8385B524 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=lambda.is Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=lambda.is Received: from localhost (unknown [127.0.0.1]) by mx.kolabnow.com (Postfix) with ESMTP id 2292B20C81C4; Wed, 4 Oct 2023 14:40:51 +0200 (CEST) Authentication-Results: ext-mx-out011.mykolab.com (amavis); dkim=pass (4096-bit key) reason="pass (just generated, assumed good)" header.d=kolabnow.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kolabnow.com; h= content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:date:subject:subject:from:from:received :received:received; s=dkim20160331; t=1696423250; x=1698237651; bh=p/rfwVI1mLjOsjZEzkhiGqda0o4vTso799sIafKFQ2Q=; b=wTz12PfJtI/6 wklAfGVqZ9G/dS3KU3LoaXsKAmKVjF5OS6q3OziM77miM1WlBdnkBv1fJv/jE1GH Xbr8NMcAX0nbxKlaNLZjNl8G9XAWoB+TZfuRwp8FU0/9elk3bbDDH9r0L/HS+0Ki JDmuDldpFiRaPbnsft5GZa4AunxUK+jJO8wt8M2EBFlzyskf7jKnCjLpP6p8rmiv ARqS3hmHBkV9ZDuM17mfuFFiEf2jGWYgFA/Af/eHKTSW/CWbNq/LUsLcClGOr8sn wJIHCeQRyNIiHBTF88e8QnnxtXzOoe1IQrKmTYxw74bb5xrivkSHYUAsGjtncU75 inEj17wsY3mbFY3NOBQXpWQieF4ObxErVDKrJKaUlan2PXYAi/MlXQTkhVIf7r4w lR4xv9FoSeR24t6tqvOiHoJDF26CARVqI97O+wbRFmWGmAgHRaOtYf8ZTsXB7YmV UEaAxekhKhd0JdP82wiW3Yk3eDywFnKKQjXQ8H83zWLl/VIyrg6RcDMAjo76AWlC cfnDAoRE+EmqJ4t/RvVykdj/nUQTVkWHEB59zg9Y0lW5TL17YUq9T7JKom8oMEIK Yq0OqsLLQYK3mBYFJ/up6BC8E++EiGvAnr9FC9UUv7700/qTcWzHLvQiFjTmDfzQ U3b0XYh2Wg+BpTLpHWt3akrJZYWlXS4= X-Virus-Scanned: amavis at mykolab.com X-Spam-Score: -0.999 X-Spam-Level: X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 Received: from mx.kolabnow.com ([127.0.0.1]) by localhost (ext-mx-out011.mykolab.com [127.0.0.1]) (amavis, port 10024) with ESMTP id lQFtf86rpDYO; Wed, 4 Oct 2023 14:40:50 +0200 (CEST) Received: from int-mx009.mykolab.com (unknown [10.9.13.9]) by mx.kolabnow.com (Postfix) with ESMTPS id 0864B20AB2E2; Wed, 4 Oct 2023 14:40:50 +0200 (CEST) Received: from ext-subm010.mykolab.com (unknown [10.9.6.10]) by int-mx009.mykolab.com (Postfix) with ESMTPS id E7FC720D6AEA; Wed, 4 Oct 2023 14:40:49 +0200 (CEST) From: =?utf-8?q?J=C3=B8rgen_Kvalsvik?= To: gcc-patches@gcc.gnu.org Cc: mliska@suse.cz, jh@suse.cz, =?utf-8?q?J=C3=B8rgen_Kvalsvik?= Subject: [PATCH 12/22] Do two-phase filtering in expr isolation Date: Wed, 4 Oct 2023 21:39:12 +0900 Message-Id: <20231004123921.634024-13-j@lambda.is> In-Reply-To: <20231004123921.634024-1-j@lambda.is> References: <20231004123921.634024-1-j@lambda.is> MIME-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1778829150158642703 X-GMAIL-MSGID: 1778829150158642703 In complex control flow graphs where gotos and returns escape the inner then-blocks and the right else-blocks are omitted, inner conditions would erroneously be considered a part of the outer. By using the observation that the neighborhood of a proper boolean expression should always be the two outcome nodes we can repeat the search and ancestor filtering to pinpoint the proper expression. This is helped by replacing every pair of nodes that share some dominator in the candidate set that is not the leading term, as that dominator must be the leading term in some other boolean expression than the one we are looking for. --- gcc/testsuite/gcc.misc-tests/gcov-19.c | 125 +++++++++++++++++++++++++ gcc/testsuite/gcc.misc-tests/gcov-22.c | 31 ++++++ gcc/tree-profile.cc | 96 ++++++++++++++++++- 3 files changed, 251 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.misc-tests/gcov-19.c b/gcc/testsuite/gcc.misc-tests/gcov-19.c index 5b5c1c275b0..8d6eb610af2 100644 --- a/gcc/testsuite/gcc.misc-tests/gcov-19.c +++ b/gcc/testsuite/gcc.misc-tests/gcov-19.c @@ -1036,6 +1036,10 @@ mcdc023g (int a, int b, int c, int d, int e, int f, int g, int h, int i, int k, x = b + c; } +/* Gotos, return, labels can make odd graphs. It is important that conditions + are assigned to the right expression, and that there are no miscounts. In + these tests values may be re-used, as checking things like masking an + independence is done in other test cases and not so useful here. */ void mcdc024a (int a, int b) { @@ -1117,6 +1121,123 @@ label4: } } +int +mcdc024d (int a, int b, int c) +{ + /* Graphs can get complicated with the innermost returns and else-less if, + so we must make sure these conditions are counted correctly. */ + if (a) /* conditions(1/2) true(0) */ + /* conditions(end) */ + { + if (b) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + { + if (c) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + return 1; + else + return 2; + } + + if (a) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + return 3; + } + + return 5; +} + +/* Nested else-less ifs with inner returns needs to be counted right, which + puts some pressure on the expression isolation. The fallthrough from inner + expressions into the next if cause the cfg to have edges from deeper in + subexpression to the next block sequence, which can confuse the expression + isolation. */ +int +mcdc024e (int a, int b, int c) +{ + if (a) /* conditions(1/2) true(0) */ + /* conditions(end) */ + { + if (b) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + { + if (c) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + { + if (a) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + return 1; + else + return 2; + } + + if (a) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + return 3; + } + + if (b) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + return 4; + } + return 5; +} + +int +mcdc024f (int a, int b, int c) +{ + if (b) /* conditions(1/2) true(0) */ + /* conditions(end) */ + return 0; + + if (a) /* conditions(1/2) true(0) */ + /* conditions(end) */ + { + if (b) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + { + b += 2; + if (b & 0xFF) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + c++; + + return c; + } + c += 10; + } +} + + +int +mcdc024g (int a, int b, int c) +{ + if (b) /* conditions(1/2) true(0) */ + /* conditions(end) */ + goto inner; + + if (a) /* conditions(1/2) true(0) */ + /* conditions(end) */ + ++a; + + + if (a) /* conditions(1/2) true(0) */ + /* conditions(end) */ + { + if (b) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + { +inner: + b += 2; + if (b & 0xFF) /* conditions(0/2) true(0) false(0) */ + /* conditions(end) */ + c++; + + return c; + } + c += 10; + } +} + int main () { mcdc001a (0, 1); @@ -1313,6 +1434,10 @@ int main () mcdc024a (0, 0); mcdc024b (0, 0); mcdc024c (0, 0); + mcdc024d (0, 0, 0); + mcdc024e (0, 0, 0); + mcdc024f (0, 0, 0); + mcdc024g (0, 0, 0); } /* { dg-final { run-gcov conditions { --conditions gcov-19.c } } } */ diff --git a/gcc/testsuite/gcc.misc-tests/gcov-22.c b/gcc/testsuite/gcc.misc-tests/gcov-22.c index 8aa8867dbf1..28b7de66022 100644 --- a/gcc/testsuite/gcc.misc-tests/gcov-22.c +++ b/gcc/testsuite/gcc.misc-tests/gcov-22.c @@ -5,6 +5,7 @@ jmp_buf buf; void noop() {} +int identity(int x) { return x; } /* This function is a test to verify that the expression isolation does not break on a CFG with the right set of complex edges. The (_ && setjmp) @@ -30,10 +31,40 @@ pathological001 (int a, int b, int c) noop (); } +///* Adapted from freetype-2.13.0 gxvalid/gxvmod.c classic_kern_validate */ +int +pathological002 (int a) +{ + int error = identity(a); + + if (error) /* conditions(1/2) true(0) */ + /* conditions(end) */ + goto Exit; + + if (a+1) /* conditions(1/2) false(0) */ + /* conditions(end) */ + { + noop (); + if (setjmp (buf)) /* conditions(1/2) true(0) */ + /* conditions(end) */ + noop (); + + if (error) /* conditions(1/2) true(0) */ + /* conditions(end) */ + noop (); + } + + error--; + +Exit: + return error; +} + int main () { pathological001 (0, 1, 0); + pathological002 (0); } diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc index f98bb1f76c8..ab96b872db3 100644 --- a/gcc/tree-profile.cc +++ b/gcc/tree-profile.cc @@ -292,6 +292,8 @@ contract_edge (edge e) return source; if (block_conditional_p (dest)) return e; + if (dest->index == EXIT_BLOCK) + return source; e = single_edge (dest->succs); if (!e) @@ -360,6 +362,8 @@ merge_split_outcome (basic_block b) edge e = single_edge (b->succs); for (edge pred : e->dest->preds) { + if (!(pred->flags & EDGE_FALLTHRU)) + return b; if (!single (pred->src->preds)) return b; if (!(single_edge (pred->src->preds)->flags & flag)) @@ -727,7 +731,13 @@ neighborhood (const vec& blocks, sbitmap G, vec& out) the expression B, but the walk may include expressions from the then/else blocks if they start with conditions. Only the subgraph B is the ancestor of *both* the then/else outcome, which means B is the intersection of the - ancestors of the nodes in the neighborhood N(G). */ + ancestors of the nodes in the neighborhood N(G). + + In complex graphs this may capture more than the expression proper. In that + case, perform the algorithm again but on the neighborhood N(B) rather than + N(G). Unless there is complex control flow with deep early returns, gotos + and else-less ifs N(B) will be the two outcome nodes, otherwise search again + after replacing nodes with their common dominator. */ void isolate_expression (conds_ctx &ctx, basic_block p, vec& out) { @@ -765,6 +775,90 @@ isolate_expression (conds_ctx &ctx, basic_block p, vec& out) if (bitmap_bit_p (expr, b->index)) out.safe_push (b); out.sort (cmp_index_map, &ctx.index_map); + + + /* In a lot of programs we would be done now, but in complex CFGs with + gotos or returns from nested then-blocks we need some post processing. + + Essentially, perform another step of the algorithm - find the + neighborhood NG' of the candidate expression B. If |B| == 2 we have the + two outcome nodes and are done. If not, we know the previous step must + have captured something in a nested expression (probably because of a + fallthrough from the then-block being merged into the next block. If + two nodes are dominated by some node lower (according to topsort) in the + CFG then the dominating node is the first term in some conditional + expression, which means it cannot be a part of the expression we're + searching for. We redo the ancestor filtering, but now use a more + precise neighborhood that is unaffected. */ + + NG.truncate (0); + neighborhood (out, expr, NG); + if (NG.length () == 2) + return; + + /* Names are reused in this section and generally mean the same thing. */ + bitmap_clear (reachable); + for (const basic_block b : NG) + bitmap_set_bit (reachable, b->index); + + /* If a pair of nodes has a shared dominator *that is not the root* + then that dominator must be the first term of another boolean + expression or some other structure outside the expression. */ + gcc_assert (!NG.is_empty ()); + for (unsigned i = 0; i != NG.length () - 1; i++) + { + for (unsigned j = i+1; j != NG.length (); j++) + { + auto b = nearest_common_dominator (CDI_DOMINATORS, NG[i], NG[j]); + if (ctx.index_map[b->index] > ctx.index_map[p->index]) + { + bitmap_clear_bit (reachable, NG[i]->index); + bitmap_clear_bit (reachable, NG[j]->index); + bitmap_set_bit (reachable, b->index); + } + } + } + + NG.truncate (0); + for (basic_block b : G) + if (bitmap_bit_p (reachable, b->index)) + NG.safe_push (b); + + bitmap_copy (reachable, expr); + for (const basic_block neighbor : NG) + { + bitmap_clear (ancestors); + for (edge e : neighbor->preds) + { + /* The e edge might have been contracted past when doing the + first scan, so we must make sure its bit is set to not think it + is outside of our candidate set. */ + bitmap_set_bit (reachable, e->src->index); + ancestors_of (e->src, p, reachable, ancestors); + } + bitmap_and (expr, expr, ancestors); + } + + out.truncate (0); + for (const basic_block b : G) + if (bitmap_bit_p (expr, b->index)) + out.safe_push (b); + out.sort (cmp_index_map, &ctx.index_map); + + bitmap_clear (expr); + for (const basic_block b : out) + bitmap_set_bit (expr, b->index); + + /* Perform a final step as a sanity check - if the neighborhood is still + larger than the outcome set (|NG| == 2) the graph is too complex and we + give up instrumentation. Any graph that exhibit this behaviour should + be studied as it might reveal an implementation error. */ + NG.truncate (0); + neighborhood (out, expr, NG); + bitmap_clear (expr); + for (const basic_block b : NG) + bitmap_set_bit (expr, b->index); + gcc_assert (bitmap_count_bits (expr) == 2); } /* Emit lhs = op1 op2 on edges. This emits non-atomic instructions and