From patchwork Thu Jul 20 07:09:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 123030 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp2936317vqt; Thu, 20 Jul 2023 00:10:23 -0700 (PDT) X-Google-Smtp-Source: APBJJlHLwIOlUttVoDhtrExKixwjewBCRMi9iCfJrbI+ozzsTpm39Nexqp8ck55ng6yUCe8KzXM6 X-Received: by 2002:a17:906:2813:b0:99b:415f:2e4f with SMTP id r19-20020a170906281300b0099b415f2e4fmr1712417ejc.57.1689837022853; Thu, 20 Jul 2023 00:10:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689837022; cv=none; d=google.com; s=arc-20160816; b=bBG9dcTeRgsKXqfOn5c4uk1rXO7ARvWLhB0UpajudevjIb2cnRoYXR7vFY+/ml05dv ShqUFBjtSjIHHR1SxLtjAfwaNGIskq4UoyAgi++6r84lt1gKvX/JOmtrshLacNR0JB9+ IS68k04bhunUcte4BqsHgF692hrsjPFFj4zgeOTwowaWFpsVv4tG/5UYYZie8cJq9Nau EczskWez+4juBb8GPFoTdTEzsLVDk4HSz1H4uQbo6badSjUAKsR/zNNY4+gBP1SSOzxn gmpcdWjUa6aDg+lp18/98CtFaPJaW7RkdxzvehyWs81W1nRLC9CgMqiU3dUnnYt51M7N 03tg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-disposition:mime-version:message-id:subject:to:date :dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=Eyw17bX5LYyYGdZxf0V0GG7UmkLdQKRko6hoPv4JBEw=; fh=KxPrluukuxmHUzYWntxWAHNXePa4vWvGreVeug6Wy5k=; b=vceKoAmpW2bsDzyYzFGV3JD2IJoxDbw9TuBNGmSzGRdXO1RtwP3SPLkhtmfNprXfAC TlKAO8dfDHUn/5Wj1XpJHJ13nGhc+UkKeo2g//CrXea4icc0ei8koO+U8LSwGAIcDXFG tvOApRmaTm4fDynttrGFVwnr7hR9B2OcoU45bag6hTxVicewTngUitFqophIR5g27UJT 6p0SETC2/m874lIlCCZMrmIydHMKtlMasFchk6oQ4dq/a/l5RFqVvpfNpc8BYxWnoM4M U7ey1g/sViwOz6M4R7WX+Zdv/zSpeK2WOn/jfvbUeSmtrik7DTZDpE66B7TcHk4OC6YR Y99w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ycCP4Saw; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id f5-20020a170906824500b00996d0a9b50csi278117ejx.234.2023.07.20.00.10.22 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 20 Jul 2023 00:10:22 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ycCP4Saw; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 62457385771D for ; Thu, 20 Jul 2023 07:10:21 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 62457385771D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689837021; bh=Eyw17bX5LYyYGdZxf0V0GG7UmkLdQKRko6hoPv4JBEw=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=ycCP4SawNxr3KEZhVdndkwxcRFMhihimjFXdT5qoc71vp7P68Yvvrgzu2sS2zd1/f 5oMxvKAP4XUkq6ipSvqXgxXxIwtqm26YTu3sNrCj0bbxIiiTTzzt/HZcs6zqr92VDK 1nEJDGmNeI/hWDFGkEPBF0l8OTBTnxJKEUnj3lj8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id 90B7E3858C2B for ; Thu, 20 Jul 2023 07:09:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 90B7E3858C2B Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id A4F232825A6; Thu, 20 Jul 2023 09:09:25 +0200 (CEST) Date: Thu, 20 Jul 2023 09:09:25 +0200 To: gcc-patches@gcc.gnu.org Subject: loop-ch improvements, part 3 Message-ID: MIME-Version: 1.0 Content-Disposition: inline X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_NUMSUBJECT, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jan Hubicka via Gcc-patches From: Jan Hubicka Reply-To: Jan Hubicka Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771922546272095195 X-GMAIL-MSGID: 1771922546272095195 Hi, this patch makes tree-ssa-loop-ch to understand if-combined conditionals (which are quite common) and remove the IV-derived heuristics. That heuristics is quite dubious because every variable with PHI in header of integral or pointer type is seen as IV, so in the first basic block we match all loop invariants as invariants and everything that chagnes in loop as IV-like. I think the heuristics was mostly there to make header duplication happen when the exit conditional is constant false in the first iteration and with ranger we can work this out in good enough precision. The patch adds notion of "combined exit" which has conditional that is and/or/xor of loop invariant exit and exit known to be false in first iteration. Copying these is a win since the loop conditional will simplify in both copies. It seems that those are usual bit or/and/xor and the code size accounting is true only when the values have at most one bit set or when the static constant and invariant versions are simple (such as all zeros). I am not testing this, so the code may be optimistic here. I think it is not common enough to matter and I can not think of correct condition that is not quite complex. I also improved code size estimate not accounting non-conditionals that are know to be constant in peeled copy and improved debug output. This requires testsuite compensaiton. uninit-pred-loop-1.c.C does: /* { dg-do compile } */ /* { dg-options "-Wuninitialized -O2 -std=c++98" } */ extern int bar(); int foo(int n, int m) { for (;;) { int err = ({int _err; for (int i = 0; i < 16; ++i) { if (m+i > n) break; _err = 17; _err = bar(); } _err; }); if (err == 0) return 17; } Before path we duplicate if (m+i > n) which makes maybe-uninitialized warning to not be output. I do not quite see why copying this out would be a win, since it won't simlify. Also I think the warning is correct. if m>n the loop will bail out before initializing _err and it will be used unitialized. I think it is bug elsewhere that header duplication supresses this. copy headers does: int is_sorted(int *a, int n, int m, int k) { for (int i = 0; i < n - 1 && m && k > i; i++) if (a[i] > a[i + 1]) return 0; return 1; } it tests that all three for statement conditionals are duplicaed. With patch we no longer do k>i since it is not going to simplify. So I added test ensuring that k is positive. Also the tests requires disabling if-combining and vrp to avoid conditionals becoming combined ones. So I aded new version of test that we now behave correctly aslo with if-combine. ivopt_mult_2.c and ivopt_mult_1.c seems to require loop header duplication for ivopts to behave particular way, so I also ensured by value range that the header is duplicated. Bootstrapped/regtested x86_64-linux, OK? gcc/ChangeLog: * tree-ssa-loop-ch.cc (edge_range_query): Rename to ... (get_range_query): ... this one; do (static_loop_exit): Add query parametr, turn ranger to reference. (loop_static_stmt_p): New function. (loop_static_op_p): New function. (loop_iv_derived_p): Remove. (loop_combined_static_and_iv_p): New function. (should_duplicate_loop_header_p): Discover combined onditionals; do not track iv derived; improve dumps. (pass_ch::execute): Fix whitespace. gcc/testsuite/ChangeLog: * g++.dg/uninit-pred-loop-1_c.C: Allow warning. * gcc.dg/tree-ssa/copy-headers-7.c: Add tests so exit conditition is static; update template. * gcc.dg/tree-ssa/ivopt_mult_1.c: Add test so exit condition is static. * gcc.dg/tree-ssa/ivopt_mult_2.c: Add test so exit condition is static. * gcc.dg/tree-ssa/copy-headers-8.c: New test. diff --git a/gcc/testsuite/g++.dg/uninit-pred-loop-1_c.C b/gcc/testsuite/g++.dg/uninit-pred-loop-1_c.C index 711812aae1b..1ee1615526f 100644 --- a/gcc/testsuite/g++.dg/uninit-pred-loop-1_c.C +++ b/gcc/testsuite/g++.dg/uninit-pred-loop-1_c.C @@ -15,7 +15,7 @@ int foo(int n, int m) _err; }); - if (err == 0) return 17; + if (err == 0) return 17; /* { dg-warning "uninitialized" "warning" } */ } return 18; diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c index 3c9b3807041..e2a6c75f2e9 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c @@ -3,9 +3,10 @@ int is_sorted(int *a, int n, int m, int k) { - for (int i = 0; i < n - 1 && m && k > i; i++) - if (a[i] > a[i + 1]) - return 0; + if (k > 0) + for (int i = 0; i < n - 1 && m && k > i; i++) + if (a[i] > a[i + 1]) + return 0; return 1; } @@ -13,4 +14,8 @@ int is_sorted(int *a, int n, int m, int k) the invariant test, not the alternate exit test. */ /* { dg-final { scan-tree-dump "is now do-while loop" "ch2" } } */ +/* { dg-final { scan-tree-dump-times "Conditional combines static and invariant" 0 "ch2" } } */ +/* { dg-final { scan-tree-dump-times "Will elliminate invariant exit" 1 "ch2" } } */ +/* { dg-final { scan-tree-dump-times "Will eliminate peeled conditional" 1 "ch2" } } */ +/* { dg-final { scan-tree-dump-times "Not duplicating bb .: condition based on non-IV loop variant." 1 "ch2" } } */ /* { dg-final { scan-tree-dump-times "Will duplicate bb" 3 "ch2" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-8.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-8.c new file mode 100644 index 00000000000..8b4b5e7ea81 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-8.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ch2-details" } */ + +int is_sorted(int *a, int n, int m, int k) +{ + if (k > 0) + for (int i = 0; i < n - 1 && m && k > i; i++) + if (a[i] > a[i + 1]) + return 0; + return 1; +} + +/* Verify we apply loop header copying but only copy the IV tests and + the invariant test, not the alternate exit test. */ + +/* { dg-final { scan-tree-dump "is now do-while loop" "ch2" } } */ +/* { dg-final { scan-tree-dump-times "Conditional combines static and invariant" 1 "ch2" } } */ +/* { dg-final { scan-tree-dump-times "Will duplicate bb" 2 "ch2" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c index adfe371c7ce..faed9114f6f 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c @@ -9,6 +9,7 @@ long foo(long* p, long* p2, int N1, int N2) long* p_limit = p + N1; long* p_limit2 = p2 + N2; long s = 0; + if (p2 <= p_limit2) while (p <= p_limit) { p++; diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c index 50d0cc5d2ae..6a82aeb0268 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c @@ -8,15 +8,16 @@ long foo(long* p, long* p2, int N1, int N2) int i = 0; long* p_limit2 = p2 + N2; long s = 0; - while (i < N1) - { - p++; - p2++; - i++; - if (p2 > p_limit2) - break; - s += (*p); - } + if (p2 <= p_limit2) + while (i < N1) + { + p++; + p2++; + i++; + if (p2 > p_limit2) + break; + s += (*p); + } return s; } diff --git a/gcc/tree-ssa-loop-ch.cc b/gcc/tree-ssa-loop-ch.cc index e0139cb432c..f3dc3d998e3 100644 --- a/gcc/tree-ssa-loop-ch.cc +++ b/gcc/tree-ssa-loop-ch.cc @@ -38,34 +38,33 @@ along with GCC; see the file COPYING3. If not see #include "value-range.h" #include "gimple-range.h" #include "gimple-range-path.h" +#include "gimple-pretty-print.h" #include "cfganal.h" -/* Duplicates headers of loops if they are small enough, so that the statements - in the loop body are always executed when the loop is entered. This - increases effectiveness of code motion optimizations, and reduces the need - for loop preconditioning. */ +/* Return path query insteance for testing ranges of statements + in headers of LOOP contained in basic block BB. + Use RANGER instance. */ -/* Given a path through edge E, whose last statement is COND, return - the range of the solved conditional in R. */ - -static void -edge_range_query (irange &r, class loop *loop, gcond *cond, gimple_ranger &ranger) +static path_range_query * +get_range_query (class loop *loop, + basic_block bb, + gimple_ranger &ranger) { auto_vec path; - for (basic_block bb = gimple_bb (cond); bb != loop->header; bb = single_pred_edge (bb)->src) + for (; bb != loop->header; bb = single_pred_edge (bb)->src) path.safe_push (bb); path.safe_push (loop->header); path.safe_push (loop_preheader_edge (loop)->src); - path_range_query query (ranger, path); - if (!query.range_of_stmt (r, cond)) - r.set_varying (boolean_type_node); + return new path_range_query (ranger, path); } /* Return edge that is true in the first iteration of the loop - and NULL otherwise. */ + and NULL otherwise. + Formulate corrent ranger query to RANGER. */ static edge -static_loop_exit (class loop *l, basic_block bb, gimple_ranger *ranger) +static_loop_exit (class loop *l, basic_block bb, gimple_ranger &ranger, + path_range_query *&query) { gcond *last = safe_dyn_cast (*gsi_last_bb (bb)); edge ret_e; @@ -83,21 +82,48 @@ static_loop_exit (class loop *l, basic_block bb, gimple_ranger *ranger) int_range<1> desired_static_range; if (loop_exit_edge_p (l, true_e)) - { + { desired_static_range = range_false (); ret_e = true_e; - } + } else - { - desired_static_range = range_true (); - ret_e = false_e; - } + { + desired_static_range = range_true (); + ret_e = false_e; + } + + if (!query) + query = get_range_query (l, gimple_bb (last), ranger); int_range<2> r; - edge_range_query (r, l, last, *ranger); + if (!query->range_of_stmt (r, last)) + return NULL; return r == desired_static_range ? ret_e : NULL; } +/* Return true if STMT is static in LOOP. This means that its value + is constant in the first iteration. + Use RANGER and formulate query cached in QUERY. */ + +static bool +loop_static_stmt_p (class loop *loop, + gimple_ranger &ranger, + path_range_query *&query, + gimple *stmt) +{ + tree type = gimple_range_type (stmt); + if (!type || !Value_Range::supports_type_p (type)) + return false; + + if (!query) + query = get_range_query (loop, gimple_bb (stmt), ranger); + + Value_Range r (gimple_range_type (stmt)); + if (!query->range_of_stmt (r, stmt)) + return false; + return r.singleton_p (); +} + /* Return true if OP is invariant. */ static bool @@ -109,21 +135,37 @@ loop_invariant_op_p (class loop *loop, if (SSA_NAME_IS_DEFAULT_DEF (op) || !flow_bb_inside_loop_p (loop, gimple_bb (SSA_NAME_DEF_STMT (op)))) return true; + return gimple_uid (SSA_NAME_DEF_STMT (op)) & 1; +} + +/* Return true if OP combines outcome of static and + loop invariant conditional. */ + +static bool +loop_static_op_p (class loop *loop, tree op) +{ + /* Always check for invariant first. */ + gcc_checking_assert (!is_gimple_min_invariant (op) + && !SSA_NAME_IS_DEFAULT_DEF (op) + && flow_bb_inside_loop_p (loop, + gimple_bb (SSA_NAME_DEF_STMT (op)))); return gimple_uid (SSA_NAME_DEF_STMT (op)) & 2; } -/* Return true if OP looks like it is derived from IV. */ + +/* Return true if OP combines outcome of static and + loop invariant conditional. */ static bool -loop_iv_derived_p (class loop *loop, - tree op) +loop_combined_static_and_iv_p (class loop *loop, + tree op) { /* Always check for invariant first. */ gcc_checking_assert (!is_gimple_min_invariant (op) && !SSA_NAME_IS_DEFAULT_DEF (op) && flow_bb_inside_loop_p (loop, gimple_bb (SSA_NAME_DEF_STMT (op)))); - return gimple_uid (SSA_NAME_DEF_STMT (op)) & 1; + return gimple_uid (SSA_NAME_DEF_STMT (op)) & 4; } /* Check whether we should duplicate HEADER of LOOP. At most *LIMIT @@ -182,25 +224,18 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop, return false; } + path_range_query *query = NULL; for (gphi_iterator psi = gsi_start_phis (header); !gsi_end_p (psi); gsi_next (&psi)) - /* If this is actual loop header PHIs indicate that the SSA_NAME - may be IV. Otherwise just give up. */ - if (header == loop->header) + if (!virtual_operand_p (gimple_phi_result (psi.phi ()))) { - gphi *phi = psi.phi (); - tree res = gimple_phi_result (phi); - if (INTEGRAL_TYPE_P (TREE_TYPE (res)) - || POINTER_TYPE_P (TREE_TYPE (res))) - gimple_set_uid (phi, 1 /* IV */); - else - gimple_set_uid (phi, 0); + bool static_p = loop_static_stmt_p (loop, *ranger, + query, psi.phi ()); + gimple_set_uid (psi.phi (), static_p ? 2 : 0); } - else - gimple_set_uid (psi.phi (), 0); /* Count number of instructions and punt on calls. - Populate stmts INV/IV flag to later apply heuristics to the + Populate stmts INV flag to later apply heuristics to the kind of conditions we want to copy. */ for (bsi = gsi_start_bb (header); !gsi_end_p (bsi); gsi_next (&bsi)) { @@ -215,6 +250,12 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop, if (gimple_code (last) == GIMPLE_COND) break; + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, " Analyzing: "); + print_gimple_stmt (dump_file, last, 0, TDF_SLIM); + } + if (gimple_code (last) == GIMPLE_CALL && (!gimple_inexpensive_call_p (as_a (last)) /* IFN_LOOP_DIST_ALIAS means that inner loop is distributed @@ -225,53 +266,152 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop, fprintf (dump_file, " Not duplicating bb %i: it contains call.\n", header->index); + if (query) + delete query; return false; } - /* Classify the stmt based on whether its computation is based - on a IV or whether it is invariant in the loop. */ + /* Classify the stmt is invariant in the loop. */ gimple_set_uid (last, 0); if (!gimple_vuse (last) && gimple_code (last) != GIMPLE_ASM && (gimple_code (last) != GIMPLE_CALL || gimple_call_flags (last) & ECF_CONST)) { - bool inv = true; - bool iv = true; + bool inv = true, static_p = false; ssa_op_iter i; tree op; FOR_EACH_SSA_TREE_OPERAND (op, last, i, SSA_OP_USE) if (!loop_invariant_op_p (loop, op)) - { - if (!loop_iv_derived_p (loop, op)) - { - inv = false; - iv = false; - break; - } - else - inv = false; - } - gimple_set_uid (last, (iv ? 1 : 0) | (inv ? 2 : 0)); + inv = false; + /* Assume that code is reasonably optimized and invariant + constants are already identified. */ + if (!inv) + static_p = loop_static_stmt_p (loop, *ranger, query, last); + gimple_set_uid (last, (inv ? 1 : 0) | (static_p ? 2 : 0)); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + if (inv) + fprintf (dump_file, " Stmt is loop invariant\n"); + if (static_p) + fprintf (dump_file, " Stmt is static " + "(constant in the first iteration)\n"); + } /* Loop invariants will be optimized out in loop body after duplication; do not account invariant computation in code - size costs. */ - if (inv) + size costs. + + Similarly static computations will be optimized out in the + duplicatd header. */ + if (inv || static_p) continue; + + /* Match the following: + _1 = i_1 < 10 <- static condtion + _2 = n != 0 <- loop invariant condition + _3 = _1 & _2 <- combined static and iv statement. */ + if (gimple_code (last) == GIMPLE_ASSIGN + && (gimple_assign_rhs_code (last) == TRUTH_AND_EXPR + || gimple_assign_rhs_code (last) == TRUTH_OR_EXPR + || gimple_assign_rhs_code (last) == TRUTH_XOR_EXPR + || gimple_assign_rhs_code (last) == BIT_AND_EXPR + || gimple_assign_rhs_code (last) == BIT_IOR_EXPR + || gimple_assign_rhs_code (last) == BIT_XOR_EXPR)) + { + tree op1 = gimple_assign_rhs1 (last); + tree op2 = gimple_assign_rhs2 (last); + + if ((loop_invariant_op_p (loop, op1) + || loop_combined_static_and_iv_p (loop, op1) + || loop_static_op_p (loop, op1)) + && (loop_invariant_op_p (loop, op2) + || loop_combined_static_and_iv_p (loop, op2) + || loop_static_op_p (loop, op2))) + { + /* Duplicating loop header with combned conditional will + remove this statement in each copy. But we account for + that later when seeing that condition. + + Note that this may be overly optimistic for bit operations + where the static parameter may still result in non-trivial + bit operation. */ + gimple_set_uid (last, 4); + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, + " Stmt combines static and invariant op\n"); + continue; + } + } } - *limit -= estimate_num_insns (last, &eni_size_weights); + int insns = estimate_num_insns (last, &eni_size_weights); + *limit -= insns; + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, + " Acconting stmt as %i insns\n", insns); if (*limit < 0) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, " Not duplicating bb %i contains too many insns.\n", header->index); + if (query) + delete query; return false; } } - edge static_exit = static_loop_exit (loop, header, ranger); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, " Analyzing: "); + print_gimple_stmt (dump_file, last, 0, TDF_SLIM); + } + + /* If the condition tests a non-IV loop variant we do not want to rotate + the loop further. Unless this is the original loop header. */ + tree lhs = gimple_cond_lhs (last); + tree rhs = gimple_cond_rhs (last); + bool lhs_invariant = loop_invariant_op_p (loop, lhs); + bool rhs_invariant = loop_invariant_op_p (loop, rhs); + + /* Combined conditional is a result of if combining: + + _1 = i_1 < 10 <- static condtion + _2 = n != 0 <- loop invariant condition + _3 = _1 & _2 <- combined static and iv statement + if (_3 != 0) <- combined conditional + + Combined conditionals will not be optimized out in either copy. + However duplicaed header simplifies to: + + if (n < 10) + + while loop body to + + if (i_1 < 10) + + So effectively the resulting code sequence will be of same length as + the original code. + + Combined conditionals are never static or invariant, so save some work + below. */ + if (lhs_invariant != rhs_invariant + && (lhs_invariant + || loop_combined_static_and_iv_p (loop, lhs)) + && (rhs_invariant + || loop_combined_static_and_iv_p (loop, rhs))) + { + if (query) + delete query; + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, + " Conditional combines static and invariant op.\n"); + return true; + } + + edge static_exit = static_loop_exit (loop, header, *ranger, query); + if (query) + delete query; if (static_exit && static_exits) { @@ -282,13 +422,6 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop, static_exit->src->index); /* Still look for invariant exits; exit may be both. */ } - - /* If the condition tests a non-IV loop variant we do not want to rotate - the loop further. Unless this is the original loop header. */ - tree lhs = gimple_cond_lhs (last); - tree rhs = gimple_cond_rhs (last); - bool lhs_invariant = loop_invariant_op_p (loop, lhs); - bool rhs_invariant = loop_invariant_op_p (loop, rhs); if (lhs_invariant && rhs_invariant) { if (invariant_exits) @@ -312,7 +445,11 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop, return true; /* We was not able to prove that conditional will be eliminated. */ - *limit -= estimate_num_insns (last, &eni_size_weights); + int insns = estimate_num_insns (last, &eni_size_weights); + *limit -= insns; + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, + " Acconting stmt as %i insns\n", insns); if (*limit < 0) { if (dump_file && (dump_flags & TDF_DETAILS)) @@ -322,12 +459,6 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop, return false; } - /* TODO: This is heuristics that claims that IV based ocnditionals will - likely be optimized out in duplicated header. We could use ranger - query instead to tell this more precisely. */ - if ((lhs_invariant || loop_iv_derived_p (loop, lhs)) - && (rhs_invariant || loop_iv_derived_p (loop, rhs))) - return true; if (header != loop->header) { if (dump_file && (dump_flags & TDF_DETAILS)) @@ -550,7 +681,7 @@ public: /* opt_pass methods: */ bool gate (function *) final override { return flag_tree_ch != 0; } - + /* Initialize and finalize loop structures, copying headers inbetween. */ unsigned int execute (function *) final override; @@ -590,7 +721,7 @@ public: return flag_tree_ch != 0 && (flag_tree_loop_vectorize != 0 || fun->has_force_vectorize_loops); } - + /* Just copy headers, no initialization/finalization of loop structures. */ unsigned int execute (function *) final override; @@ -973,7 +1104,7 @@ pass_ch::execute (function *fun) /* Assume an earlier phase has already initialized all the loop structures that we need here (and perhaps others too), and that these will be finalized by a later phase. */ - + unsigned int pass_ch_vect::execute (function *fun) {