From patchwork Wed Dec 13 12:30:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 177944
Date: Wed, 13 Dec 2023 13:30:59 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 2/6] Set LOOP_VINFO_VECT_FACTOR only when it is final
List-Id: Gcc-patches mailing list
Message-Id: <20231213123257.BA2D6384DEF7@sourceware.org>

The following makes sure to keep LOOP_VINFO_VECT_FACTOR at the
indeterminate value zero until it is final, making LOOP_VINFO_VECT_FACTOR
an rvalue and changing some direct references to use the macro.

	* tree-vectorizer.h (LOOP_VINFO_VECT_FACTOR): Make an rvalue.
	* tree-vect-loop.cc (vect_determine_vectorization_factor): Do
	not set LOOP_VINFO_VECT_FACTOR, return value via reference.
	(vect_update_vf_for_slp): Likewise.
	(vect_analyze_loop_2): Set LOOP_VINFO_VECT_FACTOR only ever
	to its final value.  Perform SLP optimization after setting
	the vectorization factor.
	* tree-vect-slp.cc (vect_slp_analyze_node_operations_1): Use
	LOOP_VINFO_VECT_FACTOR.
	(vect_slp_analyze_node_operations): Likewise.
	* tree-vectorizer.cc (vect_transform_loops): Likewise.
---
 gcc/tree-vect-loop.cc  | 43 +++++++++++++++++++++++-------------------
 gcc/tree-vect-slp.cc   |  4 ++--
 gcc/tree-vectorizer.cc |  2 +-
 gcc/tree-vectorizer.h  |  2 +-
 4 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 3a0731f3bea..3af4160426b 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -282,12 +282,12 @@ vect_determine_vf_for_stmt (vec_info *vinfo,
 */

 static opt_result
-vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
+vect_determine_vectorization_factor (loop_vec_info loop_vinfo,
+                                     poly_uint64 &vectorization_factor)
 {
   class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
   unsigned nbbs = loop->num_nodes;
-  poly_uint64 vectorization_factor = 1;
   tree scalar_type = NULL_TREE;
   gphi *phi;
   tree vectype;
@@ -296,6 +296,8 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)

   DUMP_VECT_SCOPE ("vect_determine_vectorization_factor");

+  vectorization_factor = 1;
+
   for (i = 0; i < nbbs; i++)
     {
       basic_block bb = bbs[i];
@@ -370,7 +372,6 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
   if (known_le (vectorization_factor, 1U))
     return opt_result::failure_at (vect_location,
                                    "not vectorized: unsupported data-type\n");
-  LOOP_VINFO_VECT_FACTOR (loop_vinfo) = vectorization_factor;
   return opt_result::success ();
 }

@@ -1937,17 +1938,16 @@ vect_create_loop_vinfo (class loop *loop, vec_info_shared *shared,
    statements update the vectorization factor.  */

 static void
-vect_update_vf_for_slp (loop_vec_info loop_vinfo)
+vect_update_vf_for_slp (loop_vec_info loop_vinfo,
+                        poly_uint64 &vectorization_factor)
 {
   class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
   int nbbs = loop->num_nodes;
-  poly_uint64 vectorization_factor;
   int i;

   DUMP_VECT_SCOPE ("vect_update_vf_for_slp");

-  vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
   gcc_assert (known_ne (vectorization_factor, 0U));

   /* If all the stmts in the loop can be SLPed, we perform only SLP, and
@@ -2006,7 +2006,6 @@ vect_update_vf_for_slp (loop_vec_info loop_vinfo)
                                  LOOP_VINFO_SLP_UNROLLING_FACTOR (loop_vinfo));
     }

-  LOOP_VINFO_VECT_FACTOR (loop_vinfo) = vectorization_factor;
   if (dump_enabled_p ())
     {
       dump_printf_loc (MSG_NOTE, vect_location,
@@ -2809,7 +2808,8 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo, bool &fatal,
     return opt_result::failure_at (vect_location, "bad data dependence.\n");
   LOOP_VINFO_MAX_VECT_FACTOR (loop_vinfo) = max_vf;

-  ok = vect_determine_vectorization_factor (loop_vinfo);
+  poly_uint64 vectorization_factor;
+  ok = vect_determine_vectorization_factor (loop_vinfo, vectorization_factor);
   if (!ok)
     {
       if (dump_enabled_p ())
@@ -2821,7 +2821,7 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo, bool &fatal,
   /* Compute the scalar iteration cost.  */
   vect_compute_single_scalar_iteration_cost (loop_vinfo);

-  poly_uint64 saved_vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+  poly_uint64 saved_vectorization_factor = vectorization_factor;

   if (slp)
     {
@@ -2839,13 +2839,7 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo, bool &fatal,
           vect_detect_hybrid_slp (loop_vinfo);

           /* Update the vectorization factor based on the SLP decision.  */
-          vect_update_vf_for_slp (loop_vinfo);
-
-          /* Optimize the SLP graph with the vectorization factor fixed.  */
-          vect_optimize_slp (loop_vinfo);
-
-          /* Gather the loads reachable from the SLP graph entries.  */
-          vect_gather_slp_loads (loop_vinfo);
+          vect_update_vf_for_slp (loop_vinfo, vectorization_factor);
         }
     }

@@ -2863,11 +2857,12 @@ start_over:
      during finish_cost the first time we ran the analyzis for this
      vector mode.  */
   if (applying_suggested_uf)
-    LOOP_VINFO_VECT_FACTOR (loop_vinfo) *= loop_vinfo->suggested_unroll_factor;
+    vectorization_factor *= loop_vinfo->suggested_unroll_factor;

   /* Now the vectorization factor is final.  */
-  poly_uint64 vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
   gcc_assert (known_ne (vectorization_factor, 0U));
+  gcc_assert (known_eq (LOOP_VINFO_VECT_FACTOR (loop_vinfo), 0U));
+  loop_vinfo->vectorization_factor = vectorization_factor;

   if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) && dump_enabled_p ())
     {
@@ -2882,6 +2877,15 @@ start_over:
       && maybe_lt (max_vf, LOOP_VINFO_VECT_FACTOR (loop_vinfo)))
     return opt_result::failure_at (vect_location, "bad data dependence.\n");

+  if (slp)
+    {
+      /* Optimize the SLP graph with the vectorization factor fixed.  */
+      vect_optimize_slp (loop_vinfo);
+
+      /* Gather the loads reachable from the SLP graph entries.  */
+      vect_gather_slp_loads (loop_vinfo);
+    }
+
   loop_vinfo->vector_costs = init_cost (loop_vinfo, false);

   /* Analyze the alignment of the data-refs in the loop.
@@ -3303,7 +3307,8 @@ again:
   /* Roll back state appropriately.  No SLP this time.  */
   slp = false;
   /* Restore vectorization factor as it were without SLP.  */
-  LOOP_VINFO_VECT_FACTOR (loop_vinfo) = saved_vectorization_factor;
+  vectorization_factor = saved_vectorization_factor;
+  loop_vinfo->vectorization_factor = 0;
   /* Free the SLP instances.  */
   FOR_EACH_VEC_ELT (LOOP_VINFO_SLP_INSTANCES (loop_vinfo), j, instance)
     vect_free_slp_instance (instance);
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 6799b9375ae..efda358a7f6 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6126,7 +6126,7 @@ vect_slp_analyze_node_operations_1 (vec_info *vinfo, slp_tree node,
     {
       poly_uint64 vf;
       if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
-        vf = loop_vinfo->vectorization_factor;
+        vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
       else
         vf = 1;
       unsigned int group_size = SLP_TREE_LANES (node);
@@ -6399,7 +6399,7 @@ vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node,
               unsigned group_size = SLP_TREE_LANES (child);
               poly_uint64 vf = 1;
               if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
-                vf = loop_vinfo->vectorization_factor;
+                vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
               SLP_TREE_NUMBER_OF_VEC_STMTS (child)
                 = vect_get_num_vectors (vf * group_size, vector_type);
               /* And cost them.  */
diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc
index d97e2b54c25..08ff932fb53 100644
--- a/gcc/tree-vectorizer.cc
+++ b/gcc/tree-vectorizer.cc
@@ -1015,7 +1015,7 @@ vect_transform_loops (hash_table<simduid_to_vf> *&simduid_to_vf_htab,
       if (!simduid_to_vf_htab)
         simduid_to_vf_htab = new hash_table<simduid_to_vf> (15);
       simduid_to_vf_data->simduid = DECL_UID (loop->simduid);
-      simduid_to_vf_data->vf = loop_vinfo->vectorization_factor;
+      simduid_to_vf_data->vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
       *simduid_to_vf_htab->find_slot (simduid_to_vf_data, INSERT)
         = simduid_to_vf_data;
     }
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 1810833a324..a2bab8676af 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -967,7 +967,7 @@ public:
 #define LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P(L) \
   (L)->epil_using_partial_vectors_p
 #define LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS(L) (L)->partial_load_store_bias
-#define LOOP_VINFO_VECT_FACTOR(L) (L)->vectorization_factor
+#define LOOP_VINFO_VECT_FACTOR(L) ((L)->vectorization_factor + 0)
 #define LOOP_VINFO_MAX_VECT_FACTOR(L) (L)->max_vectorization_factor
 #define LOOP_VINFO_MASKS(L) (L)->masks
 #define LOOP_VINFO_LENS(L) (L)->lens
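
For readers unfamiliar with the "+ 0" idiom used in the tree-vectorizer.h hunk, the following is a minimal standalone sketch of the same pattern, not code from GCC. It assumes a plain 64-bit integer in place of poly_uint64, and the names loop_info, LOOP_INFO_VECT_FACTOR, determine_factor and analyze are hypothetical stand-ins. Adding zero turns the accessor into a prvalue, so it can be read but not assigned through, and the factor is computed in a caller-owned variable and stored into the structure exactly once.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for loop_vec_info; the real GCC class has many
// more members and stores a poly_uint64, not a plain integer.
struct loop_info
{
  std::uint64_t vectorization_factor = 0;  // 0 means "not yet final"
};

// Same shape as the patched macro: "+ 0" yields a prvalue, so writing
// LOOP_INFO_VECT_FACTOR (p) = x; no longer compiles.
#define LOOP_INFO_VECT_FACTOR(L) ((L)->vectorization_factor + 0)

// Compile-time check: the bare member access is an lvalue, the macro
// expansion is not.
static_assert (std::is_lvalue_reference<
                 decltype ((std::declval<loop_info *> ()
                            ->vectorization_factor))>::value,
               "direct member access is assignable");
static_assert (!std::is_lvalue_reference<
                 decltype ((LOOP_INFO_VECT_FACTOR
                            (std::declval<loop_info *> ())))>::value,
               "macro expansion is an rvalue");

// Computes a candidate factor into a caller-owned variable rather than
// into the shared structure.  The value 4 is made up for illustration.
inline bool
determine_factor (const loop_info *, std::uint64_t &factor)
{
  factor = 4;
  return factor > 1;
}

// Adjusts the local candidate, then commits it with one direct store.
// Returns 0 when no usable factor was found.
inline std::uint64_t
analyze (loop_info *info)
{
  std::uint64_t factor;
  if (!determine_factor (info, factor))
    return 0;
  factor *= 2;  // e.g. applying an unroll factor before committing
  assert (LOOP_INFO_VECT_FACTOR (info) == 0u);  // still undetermined
  info->vectorization_factor = factor;          // the only store
  return LOOP_INFO_VECT_FACTOR (info);
}
```

In this sketch any reader that consults the macro before the commit sees zero, which is what lets an assertion catch premature uses; resetting the field to zero on a retry path restores that state.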