From patchwork Fri Oct 13 09:52:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 152469 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2908:b0:403:3b70:6f57 with SMTP id ib8csp1774110vqb; Fri, 13 Oct 2023 02:53:18 -0700 (PDT) X-Google-Smtp-Source: AGHT+IH7H7kz7CYkYKwMwlnLndpfZcgFlu4zn/ByDWoOnNKFtJCut6AP4Hd8fjtbZkcPiu9hiD8q X-Received: by 2002:a05:622a:14d:b0:419:5bf1:f627 with SMTP id v13-20020a05622a014d00b004195bf1f627mr32029172qtw.37.1697190798771; Fri, 13 Oct 2023 02:53:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697190798; cv=none; d=google.com; s=arc-20160816; b=Qr1QvyJyt8oV0JEdQsi71ouGaggas3nHn0SvFGAZEPNkHB/2KHLsTDMGJzt48am9yV /0yWDr0BiEmDoWLQsUxS4gQ7bUpBjDUtbDkG02GOhqFBnS3qpniG4ZfMqM9YaNi4SClT XudKNhYPjhCccG1ylKqcTZG3HXk19M9Qftzk5tk5pHh6z3nw+R48tPiCbV/EAp2IO65N hW5uC50HseksyXY4CWxoEJ8hV2jhtCHiQYDmxObK++QqY5SNLyqNt3ddGZTdFw/wlISB tM9pwCZXa+ujrmviuKuCglLoqdeLBGGvQeBBPXvsgwN6hx4yvNHK+iDWHxmegK3XgKJj LIXA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:message-id:mime-version:subject :to:from:date:dkim-signature:dkim-signature:dmarc-filter :delivered-to; bh=+ko8eDOrazo+cXkYB8/v7TsJZDViMXA+NF1AqydWBcQ=; fh=hPrbWPhweUx4V0GV9uXJqbyAzg2ABmTz7kczrAQqMmM=; b=AY+VonwHz9mhEL9C8pfrNb5XYPGNb6Ms71hAZv+Z12WXxsj4q+qj5u4alFdexyRbOP ULDUQqa304AskN11cuwb4avjy7bYtPou/dg06GYkTVjMC24meO6O7ed2pfgAe6Bu32Vn XiLkDTPrwiQ3MUVJKBWeeROkXwG8P/TciSf3FYKECQ6SOUwEhBLkRQ7MWzXWg50O9nyL bJfKXMCuv8cf/bbr+kwLeptq0xVu9dVp2HGNZ0yY2gKTsA781RlhJuWhQiwb960ne4oo Lv0WnBcRfrHalywfDipxC/umxNtnseKeVNpmCJ2pAHmtcfRj+ogEq7+FDMwhIW7Y1L2Q xuIA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b=PdLbbVJM; dkim=neutral (no key) header.i=@suse.de header.s=susede2_ed25519; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=suse.de Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id b8-20020ac85bc8000000b004151e5a9a15si941896qtb.66.2023.10.13.02.53.18 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Oct 2023 02:53:18 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b=PdLbbVJM; dkim=neutral (no key) header.i=@suse.de header.s=susede2_ed25519; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=suse.de Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8B7503858D28 for ; Fri, 13 Oct 2023 09:53:18 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2001:67c:2178:6::1c]) by sourceware.org (Postfix) with ESMTPS id BC37D3858D1E for ; Fri, 13 Oct 2023 09:52:50 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BC37D3858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id D0DEE21891 for ; Fri, 13 Oct 2023 09:52:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1697190767; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version:content-type:content-type; bh=+ko8eDOrazo+cXkYB8/v7TsJZDViMXA+NF1AqydWBcQ=; b=PdLbbVJM37ULc50yXzv7wKiio13S+Vs62JvNijW+ex0saE0nCc8ILZRCQFFQSjE0naHX1K Xt7cNu6gK/HCx0kM8A0763XS7Qjlb1jpF7Sa0ZJZ84IwG2sY4HiU6Rz0uA5c/shvlYO1OZ 38YgfEhI3TVH+cguzvow8DPvHKPYsMk= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1697190767; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version:content-type:content-type; bh=+ko8eDOrazo+cXkYB8/v7TsJZDViMXA+NF1AqydWBcQ=; b=xPgC+/apQwzfJ7r+3pnuk6XKFYjIGSlUVsQ9gYfYQznSrG12s4bPT8eJGdAryILR6q1oRp eF6eJTm5xuCsrrDA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id BEA251358F for ; Fri, 13 Oct 2023 09:52:47 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id s0U7LW8TKWVIIAAAMHmgww (envelope-from ) for ; Fri, 13 Oct 2023 09:52:47 +0000 Date: Fri, 13 Oct 2023 11:52:37 +0200 (CEST) From: Richard Biener To: gcc-patches@gcc.gnu.org Subject: [PATCH] Add support for SLP vectorization of OpenMP SIMD clone calls MIME-Version: 1.0 Message-Id: <20231013095247.BEA251358F@imap2.suse-dmz.suse.de> Authentication-Results: smtp-out1.suse.de; none X-Spam-Level: X-Spam-Score: -7.10 X-Spamd-Result: default: False [-7.10 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_HAM_LONG(-3.00)[-1.000]; MIME_GOOD(-0.10)[text/plain]; TO_DN_NONE(0.00)[]; PREVIOUSLY_DELIVERED(0.00)[gcc-patches@gcc.gnu.org]; RCPT_COUNT_ONE(0.00)[1]; MID_RHS_MATCH_FROMTLD(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%] X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1779633539039393509 X-GMAIL-MSGID: 1779633539039393509 This adds support for SLP vectorization of OpenMP SIMD clone calls. There's a complication when vectorizing calls involving virtual operands since this is now for the first time not only leafs (loads or stores). With SLP this runs into the issue that placement of the vectorized stmts is not necessarily at one of the original scalar stmts which leads to the magic updating virtual operands in vect_finish_stmt_generation not working. So we run into the assert that updating virtual operands isn't necessary. I've papered over this similar to how we do for mismatched const/pure attribution by setting vinfo->any_known_not_updated_vssa. I've added two basic testcases with multi-lane SLP and verified that with single-lane SLP enabled the rest of the existing testcases pass. Bootstrapped and tested on x86_64-unknown-linux-gnu, will push later today. Richard. * tree-vect-slp.cc (mask_call_maps): New. (vect_get_operand_map): Handle IFN_MASK_CALL. (vect_build_slp_tree_1): Likewise. * tree-vect-stmts.cc (vectorizable_simd_clone_call): Handle SLP. * gcc.dg/vect/slp-simd-clone-1.c: New testcase. * gcc.dg/vect/slp-simd-clone-2.c: Likewise. --- gcc/testsuite/gcc.dg/vect/slp-simd-clone-1.c | 46 +++++++++ gcc/testsuite/gcc.dg/vect/slp-simd-clone-2.c | 57 +++++++++++ gcc/tree-vect-slp.cc | 20 +++- gcc/tree-vect-stmts.cc | 102 ++++++++++++++----- 4 files changed, 196 insertions(+), 29 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/vect/slp-simd-clone-1.c create mode 100644 gcc/testsuite/gcc.dg/vect/slp-simd-clone-2.c diff --git a/gcc/testsuite/gcc.dg/vect/slp-simd-clone-1.c b/gcc/testsuite/gcc.dg/vect/slp-simd-clone-1.c new file mode 100644 index 00000000000..6ccbb39b567 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/slp-simd-clone-1.c @@ -0,0 +1,46 @@ +/* { dg-require-effective-target vect_simd_clones } */ +/* { dg-additional-options "-fopenmp-simd" } */ + +#include "tree-vect.h" + +int x[1024]; + +#pragma omp declare simd simdlen(4) notinbranch +__attribute__((noinline)) int +foo (int a, int b) +{ + return a + b; +} + +void __attribute__((noipa)) +bar (void) +{ +#pragma omp simd + for (int i = 0; i < 512; i++) + { + x[2*i+0] = foo (x[2*i+0], x[2*i+0]); + x[2*i+1] = foo (x[2*i+1], x[2*i+1]); + } +} + +int +main () +{ + int i; + check_vect (); + +#pragma GCC novector + for (i = 0; i < 1024; i++) + x[i] = i; + + bar (); + +#pragma GCC novector + for (i = 0; i < 1024; i++) + if (x[i] != i + i) + abort (); + + return 0; +} + +/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/slp-simd-clone-2.c b/gcc/testsuite/gcc.dg/vect/slp-simd-clone-2.c new file mode 100644 index 00000000000..98387c92486 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/slp-simd-clone-2.c @@ -0,0 +1,57 @@ +/* { dg-require-effective-target vect_simd_clones } */ +/* { dg-additional-options "-fopenmp-simd" } */ +/* { dg-additional-options "-mavx2" { target avx2_runtime } } */ + +#include "tree-vect.h" + +int x[1024]; + +#pragma omp declare simd simdlen(4) inbranch +__attribute__((noinline)) int +foo (int a, int b) +{ + return a + b; +} + +void __attribute__((noipa)) +bar (void) +{ +#pragma omp simd + for (int i = 0; i < 512; i++) + { + if (x[2*i+0] < 10) + x[2*i+0] = foo (x[2*i+0], x[2*i+0]); + if (x[2*i+1] < 20) + x[2*i+1] = foo (x[2*i+1], x[2*i+1]); + } +} + +int +main () +{ + int i; + check_vect (); + +#pragma GCC novector + for (i = 0; i < 1024; i++) + x[i] = i; + + bar (); + +#pragma GCC novector + for (i = 0; i < 1024; i++) + { + if (((i & 1) && i < 20) + || (!(i & 1) && i < 10)) + { + if (x[i] != i + i) + abort (); + } + else if (x[i] != i) + abort (); + } + + return 0; +} + +/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" { target avx2_runtime } } } */ diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc index 4ff8cbaec04..436efdd4807 100644 --- a/gcc/tree-vect-slp.cc +++ b/gcc/tree-vect-slp.cc @@ -505,6 +505,14 @@ static const int arg2_map[] = { 1, 2 }; static const int arg1_arg4_map[] = { 2, 1, 4 }; static const int arg3_arg2_map[] = { 2, 3, 2 }; static const int op1_op0_map[] = { 2, 1, 0 }; +static const int mask_call_maps[6][7] = { + { 1, 1, }, + { 2, 1, 2, }, + { 3, 1, 2, 3, }, + { 4, 1, 2, 3, 4, }, + { 5, 1, 2, 3, 4, 5, }, + { 6, 1, 2, 3, 4, 5, 6 }, +}; /* For most SLP statements, there is a one-to-one mapping between gimple arguments and child nodes. If that is not true for STMT, @@ -547,6 +555,15 @@ vect_get_operand_map (const gimple *stmt, unsigned char swap = 0) case IFN_MASK_STORE: return arg3_arg2_map; + case IFN_MASK_CALL: + { + unsigned nargs = gimple_call_num_args (call); + if (nargs >= 2 && nargs <= 7) + return mask_call_maps[nargs-2]; + else + return nullptr; + } + default: break; } @@ -1070,7 +1087,7 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap, if (call_stmt) { combined_fn cfn = gimple_call_combined_fn (call_stmt); - if (cfn != CFN_LAST) + if (cfn != CFN_LAST && cfn != CFN_MASK_CALL) rhs_code = cfn; else rhs_code = CALL_EXPR; @@ -1085,6 +1102,7 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap, rhs_code = CFN_MASK_STORE; } else if ((cfn != CFN_LAST + && cfn != CFN_MASK_CALL && internal_fn_p (cfn) && !vectorizable_internal_fn_p (as_internal_fn (cfn))) || gimple_call_tail_p (call_stmt) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index f29ee9f19bf..0fb6fc3394a 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4315,10 +4315,6 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (loop_vinfo && nested_in_vect_loop_p (loop, stmt_info)) return false; - /* FORNOW */ - if (slp_node) - return false; - /* Process function arguments. */ nargs = gimple_call_num_args (stmt) - arg_offset; @@ -4327,6 +4323,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, return false; arginfo.reserve (nargs, true); + auto_vec slp_op; + slp_op.safe_grow_cleared (nargs); for (i = 0; i < nargs; i++) { @@ -4338,9 +4336,12 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, thisarginfo.op = NULL_TREE; thisarginfo.simd_lane_linear = false; - op = gimple_call_arg (stmt, i + arg_offset); - if (!vect_is_simple_use (op, vinfo, &thisarginfo.dt, - &thisarginfo.vectype) + int op_no = i + arg_offset; + if (slp_node) + op_no = vect_slp_child_index_for_operand (stmt, op_no); + if (!vect_is_simple_use (vinfo, stmt_info, slp_node, + op_no, &op, &slp_op[i], + &thisarginfo.dt, &thisarginfo.vectype) || thisarginfo.dt == vect_uninitialized_def) { if (dump_enabled_p ()) @@ -4351,7 +4352,13 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (thisarginfo.dt == vect_constant_def || thisarginfo.dt == vect_external_def) - gcc_assert (thisarginfo.vectype == NULL_TREE); + { + gcc_assert (vec_stmt || thisarginfo.vectype == NULL_TREE); + if (!vec_stmt) + thisarginfo.vectype = get_vectype_for_scalar_type (vinfo, + TREE_TYPE (op), + slp_node); + } else gcc_assert (thisarginfo.vectype != NULL_TREE); @@ -4408,15 +4415,14 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, && thisarginfo.dt != vect_constant_def && thisarginfo.dt != vect_external_def && loop_vinfo - && !slp_node && TREE_CODE (op) == SSA_NAME) vect_simd_lane_linear (op, loop, &thisarginfo); arginfo.quick_push (thisarginfo); } - poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo); - if (!vf.is_constant ()) + if (loop_vinfo + && !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ()) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -4425,6 +4431,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, return false; } + poly_uint64 vf = loop_vinfo ? LOOP_VINFO_VECT_FACTOR (loop_vinfo) : 1; + unsigned group_size = slp_node ? SLP_TREE_LANES (slp_node) : 1; unsigned int badness = 0; struct cgraph_node *bestn = NULL; if (STMT_VINFO_SIMD_CLONE_INFO (stmt_info).exists ()) @@ -4435,7 +4443,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, { unsigned int this_badness = 0; unsigned int num_calls; - if (!constant_multiple_p (vf, n->simdclone->simdlen, &num_calls) + if (!constant_multiple_p (vf * group_size, + n->simdclone->simdlen, &num_calls) || n->simdclone->nargs != nargs) continue; if (num_calls != 1) @@ -4561,7 +4570,10 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, fndecl = bestn->decl; nunits = bestn->simdclone->simdlen; - ncopies = vector_unroll_factor (vf, nunits); + if (slp_node) + ncopies = vector_unroll_factor (vf * group_size, nunits); + else + ncopies = vector_unroll_factor (vf, nunits); /* If the function isn't const, only allow it in simd loops where user has asserted that at least nunits consecutive iterations can be @@ -4576,6 +4588,15 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (!vec_stmt) /* transformation not required. */ { + if (slp_node) + for (unsigned i = 0; i < nargs; ++i) + if (!vect_maybe_update_slp_op_vectype (slp_op[i], arginfo[i].vectype)) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "incompatible vector types for invariants\n"); + return false; + } /* When the original call is pure or const but the SIMD ABI dictates an aggregate return we will have to use a virtual definition and in a loop eventually even need to add a virtual PHI. That's @@ -4584,6 +4605,11 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, && !gimple_vdef (stmt) && TREE_CODE (TREE_TYPE (TREE_TYPE (bestn->decl))) == ARRAY_TYPE) vinfo->any_known_not_updated_vssa = true; + /* ??? For SLP code-gen we end up inserting after the last + vector argument def rather than at the original call position + so automagic virtual operand updating doesn't work. */ + if (gimple_vuse (stmt) && slp_node) + vinfo->any_known_not_updated_vssa = true; STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (bestn->decl); for (i = 0; i < nargs; i++) if ((bestn->simdclone->args[i].arg_type @@ -4633,8 +4659,14 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, auto_vec > vec_oprnds; auto_vec vec_oprnds_i; - vec_oprnds.safe_grow_cleared (nargs, true); vec_oprnds_i.safe_grow_cleared (nargs, true); + if (slp_node) + { + vec_oprnds.reserve_exact (nargs); + vect_get_slp_defs (vinfo, slp_node, &vec_oprnds); + } + else + vec_oprnds.safe_grow_cleared (nargs, true); for (j = 0; j < ncopies; ++j) { /* Build argument list for the vectorized call. */ @@ -4665,9 +4697,10 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, gcc_assert ((k & (k - 1)) == 0); if (m == 0) { - vect_get_vec_defs_for_operand (vinfo, stmt_info, - ncopies * o / k, op, - &vec_oprnds[i]); + if (!slp_node) + vect_get_vec_defs_for_operand (vinfo, stmt_info, + ncopies * o / k, op, + &vec_oprnds[i]); vec_oprnds_i[i] = 0; vec_oprnd0 = vec_oprnds[i][vec_oprnds_i[i]++]; } @@ -4703,10 +4736,11 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, { if (m == 0 && l == 0) { - vect_get_vec_defs_for_operand (vinfo, stmt_info, - k * o * ncopies, - op, - &vec_oprnds[i]); + if (!slp_node) + vect_get_vec_defs_for_operand (vinfo, stmt_info, + k * o * ncopies, + op, + &vec_oprnds[i]); vec_oprnds_i[i] = 0; vec_oprnd0 = vec_oprnds[i][vec_oprnds_i[i]++]; } @@ -4777,10 +4811,11 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, elements as the current function. */ if (m == 0) { - vect_get_vec_defs_for_operand (vinfo, stmt_info, - o * ncopies, - op, - &vec_oprnds[i]); + if (!slp_node) + vect_get_vec_defs_for_operand (vinfo, stmt_info, + o * ncopies, + op, + &vec_oprnds[i]); vec_oprnds_i[i] = 0; } vec_oprnd0 = vec_oprnds[i][vec_oprnds_i[i]++]; @@ -4924,7 +4959,11 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (j == 0 && l == 0) *vec_stmt = new_stmt; - STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); + if (slp_node) + SLP_TREE_VEC_DEFS (slp_node) + .quick_push (gimple_assign_lhs (new_stmt)); + else + STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); } if (ratype) @@ -4967,7 +5006,11 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if ((unsigned) j == k - 1) *vec_stmt = new_stmt; - STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); + if (slp_node) + SLP_TREE_VEC_DEFS (slp_node) + .quick_push (gimple_assign_lhs (new_stmt)); + else + STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); continue; } else if (ratype) @@ -4990,7 +5033,10 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (j == 0) *vec_stmt = new_stmt; - STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); + if (slp_node) + SLP_TREE_VEC_DEFS (slp_node).quick_push (gimple_get_lhs (new_stmt)); + else + STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); } for (i = 0; i < nargs; ++i)