From patchwork Sat Jul 29 06:38:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 128016 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:918b:0:b0:3e4:2afc:c1 with SMTP id s11csp880593vqg; Fri, 28 Jul 2023 23:39:36 -0700 (PDT) X-Google-Smtp-Source: APBJJlHpLfrcGROkbcrCBx15x/1ysfHtO3byVSSzDLqqqqTTWAxq2g+oNGR+izqJi3UCONAIuykX X-Received: by 2002:a50:e602:0:b0:521:d368:1628 with SMTP id y2-20020a50e602000000b00521d3681628mr5412383edm.16.1690612775835; Fri, 28 Jul 2023 23:39:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690612775; cv=none; d=google.com; s=arc-20160816; b=ht5OZkCNJfcb9k7Bf5VAjjpbq918n1aI/aGE2ywnW6jm12p+CQqRFM9BuHm458Pj50 v8aFOo8VbKJfkg6D+tz8permUDHCyfNXI9ucaWKhKb9gRAGr/sTYHx+eAfYOnWMCRbzv q3r8Kvdsx/bOBCqCWHuGpdS8LziDCGQFUJ643fDuZiw/xebrmIUBFumJQ6No0qjqGjqH y+gjK8uG/Ypj7ndRPPfcZ0nVy72BB/Uj9I7EwA5QnQogDMNgNBLQr2kfc6hqFSFJavEE gYs8JHDNM9UBwCte+OLo4wfuiHGSUXSMA297ZtSK03VB/vDQuDzlI2kr+lkSsYTWbwGh VnmA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-disposition:mime-version:message-id:subject:to:date :resent-to:resent-message-id:resent-date:resent-from:dmarc-filter :delivered-to:dkim-signature:dkim-filter; bh=bgGezyMxFrVjCKHkgEoiCsSYAZWx3AYuftXcNCP6Mfo=; fh=61dbR2UA6QTohWan4232QJy2ONzCey2vu3ydhWnGdEg=; b=A5FHEQAqhMpOnZ+/XQ5AM8cmMyauNwlCZDY4vM0AWDczEqE9fO516XRmk7eONg5Kjg lkOL5qW9AKf2HT2TsmMRQbSkyr3CREpOuxKX5ZbSjymeNy5vds93i5Wootb41Hd6yorc rqr+/+fcGxT/j29aO/NXF8b7O5S5jzx6BIuE70OOZXmIkiQfvCa4uYxY0qGPOwYnu2nU sStrK0v5ZwNgPDsknZZq7GCjPuFXhSPO8lc+SZqkIcumdo2kqLE/VPcB2EdBBhkl7IIa TIDXEEOAqLIWe7wutgoyQTsY5zA3Tdt7eOwL/82P1qiCsUC+34PWn/sqNFVMZpZH3b+h XQmQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=Ky4rh+D4; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id z11-20020aa7d40b000000b005225a5526d6si820846edq.284.2023.07.28.23.39.35 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 28 Jul 2023 23:39:35 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=Ky4rh+D4; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AFE48385840A for ; Sat, 29 Jul 2023 06:39:34 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AFE48385840A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1690612774; bh=bgGezyMxFrVjCKHkgEoiCsSYAZWx3AYuftXcNCP6Mfo=; h=Resent-From:Resent-Date:Resent-To:Date:To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=Ky4rh+D4Kkq/y5XbKeSZ/cKX6lIdMs4TpkkyDUw4I5aap7E630yvgi6y9PaEXEMI0 9B6xOcHUZQV4vLWpihkFDPmUZzcm2FcRW7xLy/g2/M9aR68ICpy4Novz1gJvcrkORP jMUUikWAA4MyGWFDWBv2SF7D/nN0RlDwDlY+NnP0= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id 5D8CE3858D39 for ; Sat, 29 Jul 2023 06:38:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5D8CE3858D39 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 530E12829BC; Sat, 29 Jul 2023 08:38:50 +0200 (CEST) Resent-From: Jan Hubicka Resent-Date: Sat, 29 Jul 2023 08:38:50 +0200 Resent-Message-ID: Resent-To: gcc-patches@gcc.gnu.org Date: Sat, 29 Jul 2023 08:38:35 +0200 To: ngcc-patches@gcc.gnu.org Subject: Fix profile update after loop versioning in vectorizer Message-ID: MIME-Version: 1.0 Content-Disposition: inline X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, JMQ_SPF_NEUTRAL, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jan Hubicka via Gcc-patches From: Jan Hubicka Reply-To: Jan Hubicka Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1772735981878635406 X-GMAIL-MSGID: 1772735981878635406 Hi, Vectorizer while loop versioning produces a versioned loop guarded with two conditionals of the form if (cond1) goto scalar_loop else goto next_bb next_bb: if (cond2) godo scalar_loop else goto vector_loop It wants the combined test to be prob (whch is set to likely) and uses profile_probability::split to determine probability of cond1 and cond2. However spliting is turning: if (cond) goto lab; // ORIG probability into if (cond1) goto lab; // FIRST = ORIG * CPROB probability if (cond2) goto lab; // SECOND probability Which is or instead of and. As a result we get pretty low probabiility of entering vectorized loop. The fixes this by introducing sqrt to profile probability (which is correct way to split this) and also adding pow that is needed elsewhere. While loop versioning I now produce code as if there was only one combined conditional and then update probability of conditional produced (containig cond1). Later edge is split and new conditional is added. At that time it is necessary to update probability of the BB containing second conditional so everything matches. Bootstrapped/regtested x86_64-linux, comitted. gcc/ChangeLog: * profile-count.cc (profile_probability::sqrt): New member function. (profile_probability::pow): Likewise. * profile-count.h: (profile_probability::sqrt): Declare (profile_probability::pow): Likewise. * tree-vect-loop-manip.cc (vect_loop_versioning): Fix profile update. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/vect-profile-upate-2.c: New test. diff --git a/gcc/profile-count.cc b/gcc/profile-count.cc index eaf0f0d787e..e63c9432388 100644 --- a/gcc/profile-count.cc +++ b/gcc/profile-count.cc @@ -471,3 +471,60 @@ profile_probability::to_sreal () const gcc_checking_assert (initialized_p ()); return ((sreal)m_val) >> (n_bits - 2); } + +/* Compute square root. */ + +profile_probability +profile_probability::sqrt () const +{ + if (!initialized_p () || *this == never () || *this == always ()) + return *this; + profile_probability ret = *this; + ret.m_quality = MIN (ret.m_quality, ADJUSTED); + uint32_t min_range = m_val; + uint32_t max_range = max_probability; + if (!m_val) + max_range = 0; + if (m_val == max_probability) + min_range = max_probability; + while (min_range != max_range) + { + uint32_t val = (min_range + max_range) / 2; + uint32_t val2 = RDIV ((uint64_t)val * val, max_probability); + if (val2 == m_val) + min_range = max_range = m_val; + else if (val2 > m_val) + max_range = val - 1; + else if (val2 < m_val) + min_range = val + 1; + } + ret.m_val = min_range; + return ret; +} + +/* Compute n-th power of THIS. */ + +profile_probability +profile_probability::pow (int n) const +{ + if (n == 1 || !initialized_p ()) + return *this; + if (!n) + return profile_probability::always (); + if (!nonzero_p () + || !(profile_probability::always () - *this).nonzero_p ()) + return *this; + profile_probability ret = profile_probability::always (); + profile_probability v = *this; + int p = 1; + while (true) + { + if (n & p) + ret = ret * v; + p <<= 1; + if (p > n) + break; + v = v * v; + } + return ret; +} diff --git a/gcc/profile-count.h b/gcc/profile-count.h index 88a6431c21a..002bcb83481 100644 --- a/gcc/profile-count.h +++ b/gcc/profile-count.h @@ -650,6 +650,12 @@ public: return *this; } + /* Compute n-th power. */ + profile_probability pow (int) const; + + /* Compute sware root. */ + profile_probability sqrt () const; + /* Get the value of the count. */ uint32_t value () const { return m_val; } diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vect-profile-upate-2.c b/gcc/testsuite/gcc.dg/tree-ssa/vect-profile-upate-2.c new file mode 100644 index 00000000000..4a5f6bc4e23 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/vect-profile-upate-2.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized-details-blocks" } */ +void +test (int *a, int *b, int n) +{ + for (int i = 0; i < n; i++) + a[i]+=b[i]; +} +/* { dg-final { scan-tree-dump-not "Invalid sum" "optimized"} } */ diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc index 30baac6db44..e53a99e7c3c 100644 --- a/gcc/tree-vect-loop-manip.cc +++ b/gcc/tree-vect-loop-manip.cc @@ -3784,7 +3784,7 @@ vect_loop_versioning (loop_vec_info loop_vinfo, } tree cost_name = NULL_TREE; - profile_probability prob2 = profile_probability::uninitialized (); + profile_probability prob2 = profile_probability::always (); if (cond_expr && EXPR_P (cond_expr) && (version_niter @@ -3797,7 +3797,7 @@ vect_loop_versioning (loop_vec_info loop_vinfo, is_gimple_val, NULL_TREE); /* Split prob () into two so that the overall probability of passing both the cost-model and versioning checks is the orig prob. */ - prob2 = prob.split (prob); + prob2 = prob = prob.sqrt (); } if (version_niter) @@ -3941,7 +3941,15 @@ vect_loop_versioning (loop_vec_info loop_vinfo, initialize_original_copy_tables (); nloop = loop_version (loop_to_version, cond_expr, &condition_bb, - prob, prob.invert (), prob, prob.invert (), true); + prob * prob2, (prob * prob2).invert (), + prob * prob2, (prob * prob2).invert (), + true); + /* We will later insert second conditional so overall outcome of + both is prob * prob2. */ + edge true_e, false_e; + extract_true_false_edges_from_block (condition_bb, &true_e, &false_e); + true_e->probability = prob; + false_e->probability = prob.invert (); gcc_assert (nloop); nloop = get_loop_copy (loop); @@ -4037,6 +4045,7 @@ vect_loop_versioning (loop_vec_info loop_vinfo, edge e2 = make_edge (e->src, false_e->dest, EDGE_FALSE_VALUE); e->probability = prob2; e2->probability = prob2.invert (); + e->dest->count = e->count (); set_immediate_dominator (CDI_DOMINATORS, false_e->dest, e->src); auto_vec adj; for (basic_block son = first_dom_son (CDI_DOMINATORS, e->dest);