From patchwork Wed Dec 13 12:31:27 2023
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 177947
Date: Wed, 13 Dec 2023 13:31:27 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 6/6] Defer assigning vector types until after VF is determined
Message-Id: <20231213123336.4544D3858D39@sourceware.org>

The following defers, for non-gather/scatter and non-pattern stmts,
setting of STMT_VINFO_VECTYPE until after we have computed the desired
vectorization factor.  This allows us to use larger vector types when
the vectorization factor and the preferred vector mode allow, reducing
the number of vector stmt copies and enabling vectorization in the
first place if ncopies restrictions require the use of different size
vector types like for PR65947.

vectorizable_operation handles some of the required vector type
inference.

        * tree-vect-data-refs.cc (vect_analyze_data_refs): Do not set
        STMT_VINFO_VECTYPE unless this is a gather/scatter.
        * tree-vect-loop.cc (vect_determine_vf_for_stmt_1): Do not set
        STMT_VINFO_VECTYPE, only determine the VF.
        (vect_determine_vectorization_factor): Likewise.
        (vect_analyze_loop_2): Set STMT_VINFO_VECTYPE where missing
        and non-mask.  Choose larger vectors to reduce the number
        of stmt copies.
        * tree-vect-stmts.cc (vect_analyze_stmt): Allow not specified
        vector type for mask producers.
        (vectorizable_operation): Refactor to handle STMT_VINFO_VECTYPE
        inference from operands.

        * gcc.dg/vect/pr65947-7.c: Adjust.
        * gcc.target/i386/vect-multi-size-1.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr65947-7.c         |   2 +-
 .../gcc.target/i386/vect-multi-size-1.c       |  17 ++
 gcc/tree-vect-data-refs.cc                    |  11 +-
 gcc/tree-vect-loop.cc                         | 148 +++++++++++++++---
 gcc/tree-vect-stmts.cc                        | 121 +++++++-------
 5 files changed, 202 insertions(+), 97 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/vect-multi-size-1.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr65947-7.c b/gcc/testsuite/gcc.dg/vect/pr65947-7.c
index 58c46df5c54..8f8adce3d91 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65947-7.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-7.c
@@ -53,4 +53,4 @@ main (void)
 }
 
 /* { dg-final { scan-tree-dump "optimizing condition reduction with FOLD_EXTRACT_LAST" "vect" { target vect_fold_extract_last } } } */
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target aarch64*-*-* } } } */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { aarch64*-*-* } || { vect_multiple_sizes } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/vect-multi-size-1.c b/gcc/testsuite/gcc.target/i386/vect-multi-size-1.c
new file mode 100644
index 00000000000..a0dd3cf9801
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vect-multi-size-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=znver4 -fdump-tree-vect" } */
+
+double x[1024];
+char y[1024];
+void foo ()
+{
+  for (int i = 0 ; i < 16; ++i)
+    {
+      x[i] = i;
+      y[i] = i;
+    }
+}
+
+/* We expect to see AVX512 vectors for x[] and a SSE vector for y[].  */
+/* { dg-final { scan-tree-dump-times "MEM " 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "MEM " 1 "vect" } } */
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 59e296e7976..80057474af9 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -4716,18 +4716,19 @@ vect_analyze_data_refs (vec_info *vinfo, poly_uint64 *min_vf, bool *fatal)
       vf = TYPE_VECTOR_SUBPARTS (vectype);
       *min_vf = upper_bound (*min_vf, vf);
 
-      /* Leave the BB vectorizer to pick the vector type later, based on
-         the final dataref group size and SLP node size.  */
-      if (is_a <loop_vec_info> (vinfo))
-        STMT_VINFO_VECTYPE (stmt_info) = vectype;
-
       if (gatherscatter != SG_NONE)
         {
+          /* ??? We should perform a coarser check here, or none at all.
+             We're checking this again later, in particular during
+             relevancy analysis where we hook on the discovered offset
+             operand.  */
+          STMT_VINFO_VECTYPE (stmt_info) = vectype;
           gather_scatter_info gs_info;
           if (!vect_check_gather_scatter (stmt_info,
                                           as_a <loop_vec_info> (vinfo),
                                           &gs_info))
             {
+              STMT_VINFO_VECTYPE (stmt_info) = NULL_TREE;
               if (fatal)
                 *fatal = false;
               return opt_result::failure_at
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 9e531921e29..f226135cb1d 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -189,22 +189,19 @@ vect_determine_vf_for_stmt_1 (vec_info *vinfo, stmt_vec_info stmt_info,
   if (!res)
     return res;
 
-  if (stmt_vectype)
+  if (nunits_vectype)
     {
-      if (STMT_VINFO_VECTYPE (stmt_info))
-        /* The only case when a vectype had been already set is for stmts
-           that contain a data ref, or for "pattern-stmts" (stmts generated
-           by the vectorizer to represent/replace a certain idiom).  */
-        gcc_assert ((STMT_VINFO_DATA_REF (stmt_info)
-                     || vectype_maybe_set_p)
-                    && STMT_VINFO_VECTYPE (stmt_info) == stmt_vectype);
-      else
-        STMT_VINFO_VECTYPE (stmt_info) = stmt_vectype;
+      poly_uint64 saved_vf = *vf;
+      vect_update_max_nunits (vf, nunits_vectype);
+      if (maybe_ne (*vf, saved_vf) && dump_enabled_p ())
+        {
+          dump_printf_loc (MSG_NOTE, vect_location, "updated "
+                           "vectorization factor to ");
+          dump_dec (MSG_NOTE, *vf);
+          dump_printf (MSG_NOTE, "\n");
+        }
     }
 
-  if (nunits_vectype)
-    vect_update_max_nunits (vf, nunits_vectype);
-
   return opt_result::success ();
 }
 
@@ -330,20 +327,17 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo,
                                  "not vectorized: unsupported "
                                  "data-type %T\n",
                                  scalar_type);
-              STMT_VINFO_VECTYPE (stmt_info) = vectype;
-
-              if (dump_enabled_p ())
-                dump_printf_loc (MSG_NOTE, vect_location, "vectype: %T\n",
-                                 vectype);
 
-              if (dump_enabled_p ())
+              poly_uint64 saved_vectorization_factor = vectorization_factor;
+              vect_update_max_nunits (&vectorization_factor, vectype);
+              if (maybe_ne (vectorization_factor, saved_vectorization_factor)
+                  && dump_enabled_p ())
                 {
-                  dump_printf_loc (MSG_NOTE, vect_location, "nunits = ");
-                  dump_dec (MSG_NOTE, TYPE_VECTOR_SUBPARTS (vectype));
+                  dump_printf_loc (MSG_NOTE, vect_location, "updated "
+                                   "vectorization factor to ");
+                  dump_dec (MSG_NOTE, vectorization_factor);
                   dump_printf (MSG_NOTE, "\n");
                 }
-
-              vect_update_max_nunits (&vectorization_factor, vectype);
             }
         }
 
@@ -2864,6 +2858,114 @@ start_over:
   gcc_assert (known_eq (LOOP_VINFO_VECT_FACTOR (loop_vinfo), 0U));
   loop_vinfo->vectorization_factor = vectorization_factor;
 
+  /* At this point we have the vectorization factor that should determine
+     the vector types to use decided.  The unrolling factor should not
+     influence that since otherwise we'd eventually use larger vectors
+     rather than doing actual effective unrolling.
+
+     Note that with re-starting without SLP we actually will have the
+     original loop VF so we're off here - but then non-SLP should go
+     away ...  */
+  /* Check that nothing set STMT_VINFO_VECTYPE so nothing could have
+     relied on it.  ???  Same for SLP.  ???  That also catches pattern
+     stmts which might be more difficult to "fix".  */
+  for (stmt_vec_info stmt_info : loop_vinfo->stmt_vec_infos)
+    {
+      if (!stmt_info
+          || gimple_clobber_p (stmt_info->stmt))
+        continue;
+
+      if (!STMT_VINFO_RELEVANT_P (stmt_info)
+          && !STMT_VINFO_LIVE_P (stmt_info))
+        continue;
+
+      if (STMT_VINFO_VECTYPE (stmt_info))
+        {
+          /* Pattern stmts and gather/scatter may have a precomputed
+             vector type.  */
+          gcc_assert (STMT_VINFO_RELATED_STMT (stmt_info)
+                      || STMT_VINFO_GATHER_SCATTER_P (stmt_info));
+          continue;
+        }
+
+      /* ???  This is still a coarse vector type decision.  Multiple
+         up/down passes over use-def chains should be used to set
+         vector types from within vectorizable_* itself, in a new
+         special mode.  Possibly identifying the responsible worker early.
+         Not worth spending much time on this in the non-SLP path.  */
+      tree stmt_vectype, nunits_vectype;
+      opt_result res
+        = vect_get_vector_types_for_stmt (loop_vinfo, stmt_info, &stmt_vectype,
+                                          &nunits_vectype);
+      gcc_assert (res);
+      if (!stmt_vectype)
+        /* OMP SIMD calls without LHS.  */
+        continue;
+
+      tree scalar_type = NULL_TREE;
+      if (vect_use_mask_type_p (stmt_info))
+        {
+          if (is_a <gphi *> (stmt_info->stmt))
+            {
+              /* Only with BB vectorization or as PHI in a nested cycle.  */
+              gcc_assert (flow_bb_inside_loop_p (LOOP_VINFO_LOOP (loop_vinfo),
+                                                 gimple_bb (stmt_info->stmt)));
+              /* ??? vectorizable_* should set the vector type.  */
+              continue;
+            }
+          else
+            {
+              tree_code code = gimple_assign_rhs_code (stmt_info->stmt);
+              if (is_gimple_assign (stmt_info->stmt)
+                  && TREE_CODE_CLASS (code) == tcc_comparison)
+                scalar_type = TREE_TYPE (gimple_assign_rhs1 (stmt_info->stmt));
+              else
+                /* ??? vectorizable_* should set the vector type.  */
+                continue;
+            }
+        }
+      else
+        scalar_type = TREE_TYPE (stmt_vectype);
+
+      /* Try to use a larger vector type when the above one has less lanes
+         than the chosen VF, up to the one recommended by the preferred vector
+         mode hook.  This keeps ncopies down, generating more efficient code
+         and in some cases enables vectorizing in the first place.  */
+      tree preferred_vectype = get_related_vectype_for_scalar_type (VOIDmode,
+                                                                    scalar_type,
+                                                                    0);
+      if (known_lt (TYPE_VECTOR_SUBPARTS (stmt_vectype),
+                    LOOP_VINFO_VECT_FACTOR (loop_vinfo))
+          && known_lt (TYPE_VECTOR_SUBPARTS (stmt_vectype),
+                       TYPE_VECTOR_SUBPARTS (preferred_vectype))
+          && ordered_p (TYPE_VECTOR_SUBPARTS (preferred_vectype),
+                        LOOP_VINFO_VECT_FACTOR (loop_vinfo)))
+        {
+          /* ???  Could try all nunits between stmt_vectype and MIN.  */
+          poly_uint64 nunits
+            = ordered_min (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
+                           TYPE_VECTOR_SUBPARTS (preferred_vectype));
+          tree cand = get_related_vectype_for_scalar_type
+              (TYPE_MODE (preferred_vectype), scalar_type, nunits);
+          if (cand)
+            {
+              if (VECTOR_BOOLEAN_TYPE_P (stmt_vectype))
+                cand = truth_type_for (cand);
+              stmt_vectype = cand;
+            }
+        }
+
+      if (dump_enabled_p ())
+        {
+          dump_printf_loc (MSG_NOTE, vect_location,
+                           "==> examining statement: %G", stmt_info->stmt);
+          dump_printf_loc (MSG_NOTE, vect_location, "vectype: %T\n",
+                           stmt_vectype);
+        }
+
+      STMT_VINFO_VECTYPE (stmt_info) = stmt_vectype;
+    }
+
   if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) && dump_enabled_p ())
     {
       dump_printf_loc (MSG_NOTE, vect_location,
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a5e26b746fb..da27404aadb 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -6490,7 +6490,6 @@ vectorizable_operation (vec_info *vinfo,
   int ndts = 3;
   poly_uint64 nunits_in;
   poly_uint64 nunits_out;
-  tree vectype_out;
   int ncopies, vec_num;
   int i;
   vec<tree> vec_oprnds0 = vNULL;
@@ -6550,25 +6549,6 @@ vectorizable_operation (vec_info *vinfo,
       return false;
     }
 
-  scalar_dest = gimple_assign_lhs (stmt);
-  vectype_out = STMT_VINFO_VECTYPE (stmt_info);
-
-  /* Most operations cannot handle bit-precision types without extra
-     truncations.  */
-  bool mask_op_p = VECTOR_BOOLEAN_TYPE_P (vectype_out);
-  if (!mask_op_p
-      && !type_has_mode_precision_p (TREE_TYPE (scalar_dest))
-      /* Exception are bitwise binary operations.  */
-      && code != BIT_IOR_EXPR
-      && code != BIT_XOR_EXPR
-      && code != BIT_AND_EXPR)
-    {
-      if (dump_enabled_p ())
-        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "bit-precision arithmetic not supported.\n");
-      return false;
-    }
-
   slp_tree slp_op0;
   if (!vect_is_simple_use (vinfo, stmt_info, slp_node,
                            0, &op0, &slp_op0, &dt[0], &vectype))
@@ -6580,47 +6560,6 @@ vectorizable_operation (vec_info *vinfo,
     }
   bool is_invariant = (dt[0] == vect_external_def
                        || dt[0] == vect_constant_def);
-  /* If op0 is an external or constant def, infer the vector type
-     from the scalar type.  */
-  if (!vectype)
-    {
-      /* For boolean type we cannot determine vectype by
-         invariant value (don't know whether it is a vector
-         of booleans or vector of integers).  We use output
-         vectype because operations on boolean don't change
-         type.  */
-      if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op0)))
-        {
-          if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (scalar_dest)))
-            {
-              if (dump_enabled_p ())
-                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                                 "not supported operation on bool value.\n");
-              return false;
-            }
-          vectype = vectype_out;
-        }
-      else
-        vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0),
-                                               slp_node);
-    }
-  if (vec_stmt)
-    gcc_assert (vectype);
-  if (!vectype)
-    {
-      if (dump_enabled_p ())
-        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "no vectype for scalar type %T\n",
-                         TREE_TYPE (op0));
-
-      return false;
-    }
-
-  nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
-  nunits_in = TYPE_VECTOR_SUBPARTS (vectype);
-  if (maybe_ne (nunits_out, nunits_in))
-    return false;
-
   tree vectype2 = NULL_TREE, vectype3 = NULL_TREE;
   slp_tree slp_op1 = NULL, slp_op2 = NULL;
   if (op_type == binary_op || op_type == ternary_op)
@@ -6635,9 +6574,8 @@ vectorizable_operation (vec_info *vinfo,
         }
       is_invariant &= (dt[1] == vect_external_def
                        || dt[1] == vect_constant_def);
-      if (vectype2
-          && maybe_ne (nunits_out, TYPE_VECTOR_SUBPARTS (vectype2)))
-        return false;
+      if (!vectype)
+        vectype = vectype2;
     }
   if (op_type == ternary_op)
     {
@@ -6651,9 +6589,52 @@ vectorizable_operation (vec_info *vinfo,
         }
       is_invariant &= (dt[2] == vect_external_def
                        || dt[2] == vect_constant_def);
-      if (vectype3
-          && maybe_ne (nunits_out, TYPE_VECTOR_SUBPARTS (vectype3)))
-        return false;
+      if (!vectype)
+        vectype = vectype3;
+    }
+
+  if (!vectype)
+    vectype = STMT_VINFO_VECTYPE (stmt_info);
+  if (!vectype)
+    {
+      /* We want to pre-assign sth here.  */
+      gcc_assert (!vec_stmt
+                  && is_invariant
+                  && !vect_use_mask_type_p (stmt_info));
+      vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0), slp_node);
+    }
+
+  tree vectype_out = vectype;
+  nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
+  nunits_in = TYPE_VECTOR_SUBPARTS (vectype);
+  if (maybe_ne (nunits_out, nunits_in))
+    return false;
+  /* ???  Isn't the constraint the types are the same apart from
+     signedness (ABSU_EXPR for example)?  The rest suggests this as
+     we are using 'vectype' for constants/invariants.  */
+  if (vectype2
+      && maybe_ne (nunits_out, TYPE_VECTOR_SUBPARTS (vectype2)))
+    return false;
+  if (vectype3
+      && maybe_ne (nunits_out, TYPE_VECTOR_SUBPARTS (vectype3)))
+    return false;
+
+  scalar_dest = gimple_assign_lhs (stmt);
+
+  /* Most operations cannot handle bit-precision types without extra
+     truncations.  */
+  bool mask_op_p = VECTOR_BOOLEAN_TYPE_P (vectype_out);
+  if (!mask_op_p
+      && !type_has_mode_precision_p (TREE_TYPE (scalar_dest))
+      /* Exception are bitwise binary operations.  */
+      && code != BIT_IOR_EXPR
+      && code != BIT_XOR_EXPR
+      && code != BIT_AND_EXPR)
+    {
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                         "bit-precision arithmetic not supported.\n");
+      return false;
     }
 
   /* Multiple types in SLP are handled by creating the appropriate number of
@@ -6788,6 +6769,8 @@ vectorizable_operation (vec_info *vinfo,
           return false;
         }
 
+      if (!STMT_VINFO_VECTYPE (stmt_info))
+        STMT_VINFO_VECTYPE (stmt_info) = vectype;
       STMT_VINFO_TYPE (stmt_info) = op_vec_info_type;
       DUMP_VECT_SCOPE ("vectorizable_operation");
       vect_model_simple_cost (vinfo, stmt_info,
@@ -12890,7 +12873,9 @@ vect_analyze_stmt (vec_info *vinfo,
     {
       gcall *call = dyn_cast <gcall *> (stmt_info->stmt);
       gcc_assert (STMT_VINFO_VECTYPE (stmt_info)
-                  || (call && gimple_call_lhs (call) == NULL_TREE));
+                  || (call && gimple_call_lhs (call) == NULL_TREE)
+                  /* ???  Inconsistently so.  */
+                  || vect_use_mask_type_p (stmt_info));
       *need_to_vectorize = true;
     }