From patchwork Mon Aug 14 08:59:11 2023
Message-ID: <7314a4eb-26d0-e33e-94c2-31daca9f490e@linux.ibm.com>
Date: Mon, 14 Aug 2023 16:59:11 +0800
Subject: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest
From: "Kewen.Lin"
To: GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Richard Biener, Richard Sandiford

Hi,

Following Richi's suggestion [1], this patch moves the handling of
VMAT_GATHER_SCATTER in the final loop nest of function vectorizable_load
to its own loop.  Basically it duplicates the final loop nest, cleans up
some useless setup code for the VMAT_GATHER_SCATTER case, and removes
some unreachable code.  It also removes the corresponding handling from
the final loop nest.

Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu
and powerpc64{,le}-linux-gnu.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623329.html

Is it ok for trunk?

BR,
Kewen
-----

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_load): Move the handling of
	VMAT_GATHER_SCATTER in the final loop nest to its own loop,
	and update the final nest accordingly.
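As context for the hunks that follow: the patch peels the
VMAT_GATHER_SCATTER case off into its own early copy of the ncopies
loop, which returns before the general loop nest is reached, so that
nest can drop its per-iteration STMT_VINFO_GATHER_SCATTER_P checks.
Below is a minimal standalone sketch of that hoisting pattern; every
name in it (process_all, process_gather, process_contiguous,
access_kind) is a hypothetical illustration only, not GCC code.

#include <cstdio>

enum access_kind { CONTIGUOUS, GATHER_SCATTER };

/* Hypothetical per-copy workers; the real code emits gimple.  */
static void process_gather (int j)     { std::printf ("gather copy %d\n", j); }
static void process_contiguous (int j) { std::printf ("contig copy %d\n", j); }

static bool
process_all (access_kind kind, int ncopies)
{
  if (kind == GATHER_SCATTER)
    {
      /* Dedicated loop, analogous to the new early
	 "if (memory_access_type == VMAT_GATHER_SCATTER)" block.  */
      for (int j = 0; j < ncopies; j++)
	process_gather (j);
      return true;  /* Mirrors the early "return true;" in the patch.  */
    }

  /* General loop nest, now free of special-case branches.  */
  for (int j = 0; j < ncopies; j++)
    process_contiguous (j);
  return true;
}

int
main ()
{
  process_all (GATHER_SCATTER, 2);
  process_all (CONTIGUOUS, 2);
  return 0;
}

The trade-off, visible in the removal hunks, is some duplicated
pointer/mask setup between the two loops in exchange for a simpler
general path.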
---
 gcc/tree-vect-stmts.cc | 361 +++++++++++++++++++++++++----------------
 1 file changed, 219 insertions(+), 142 deletions(-)

--
2.39.1

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index c361e16cb7b..5e514eca19b 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -10455,6 +10455,218 @@ vectorizable_load (vec_info *vinfo,
       return true;
     }
 
+  if (memory_access_type == VMAT_GATHER_SCATTER)
+    {
+      gcc_assert (alignment_support_scheme == dr_aligned
+                  || alignment_support_scheme == dr_unaligned_supported);
+      gcc_assert (!grouped_load && !slp_perm);
+
+      unsigned int inside_cost = 0, prologue_cost = 0;
+      for (j = 0; j < ncopies; j++)
+        {
+          /* 1. Create the vector or array pointer update chain.  */
+          if (j == 0 && !costing_p)
+            {
+              if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+                vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info,
+                                             slp_node, &gs_info, &dataref_ptr,
+                                             &vec_offsets);
+              else
+                dataref_ptr
+                  = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type,
+                                              at_loop, offset, &dummy, gsi,
+                                              &ptr_incr, false, bump);
+            }
+          else if (!costing_p)
+            {
+              gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo));
+              if (!STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+                dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr,
+                                               gsi, stmt_info, bump);
+            }
+
+          if (mask && !costing_p)
+            vec_mask = vec_masks[j];
+
+          gimple *new_stmt = NULL;
+          for (i = 0; i < vec_num; i++)
+            {
+              tree final_mask = NULL_TREE;
+              tree final_len = NULL_TREE;
+              tree bias = NULL_TREE;
+              if (!costing_p)
+                {
+                  if (loop_masks)
+                    final_mask
+                      = vect_get_loop_mask (loop_vinfo, gsi, loop_masks,
+                                            vec_num * ncopies, vectype,
+                                            vec_num * j + i);
+                  if (vec_mask)
+                    final_mask = prepare_vec_mask (loop_vinfo, mask_vectype,
+                                                   final_mask, vec_mask, gsi);
+
+                  if (i > 0 && !STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+                    dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr,
+                                                   gsi, stmt_info, bump);
+                }
+
+              /* 2. Create the vector-load in the loop.  */
+              unsigned HOST_WIDE_INT align;
+              if (gs_info.ifn != IFN_LAST)
+                {
+                  if (costing_p)
+                    {
+                      unsigned int cnunits = vect_nunits_for_cost (vectype);
+                      inside_cost
+                        = record_stmt_cost (cost_vec, cnunits, scalar_load,
+                                            stmt_info, 0, vect_body);
+                      continue;
+                    }
+                  if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+                    vec_offset = vec_offsets[vec_num * j + i];
+                  tree zero = build_zero_cst (vectype);
+                  tree scale = size_int (gs_info.scale);
+
+                  if (gs_info.ifn == IFN_MASK_LEN_GATHER_LOAD)
+                    {
+                      if (loop_lens)
+                        final_len
+                          = vect_get_loop_len (loop_vinfo, gsi, loop_lens,
+                                               vec_num * ncopies, vectype,
+                                               vec_num * j + i, 1);
+                      else
+                        final_len
+                          = build_int_cst (sizetype,
+                                           TYPE_VECTOR_SUBPARTS (vectype));
+                      signed char biasval
+                        = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo);
+                      bias = build_int_cst (intQI_type_node, biasval);
+                      if (!final_mask)
+                        {
+                          mask_vectype = truth_type_for (vectype);
+                          final_mask = build_minus_one_cst (mask_vectype);
+                        }
+                    }
+
+                  gcall *call;
+                  if (final_len && final_mask)
+                    call
+                      = gimple_build_call_internal (IFN_MASK_LEN_GATHER_LOAD, 7,
+                                                    dataref_ptr, vec_offset,
+                                                    scale, zero, final_mask,
+                                                    final_len, bias);
+                  else if (final_mask)
+                    call = gimple_build_call_internal (IFN_MASK_GATHER_LOAD, 5,
+                                                       dataref_ptr, vec_offset,
+                                                       scale, zero, final_mask);
+                  else
+                    call = gimple_build_call_internal (IFN_GATHER_LOAD, 4,
+                                                       dataref_ptr, vec_offset,
+                                                       scale, zero);
+                  gimple_call_set_nothrow (call, true);
+                  new_stmt = call;
+                  data_ref = NULL_TREE;
+                }
+              else
+                {
+                  /* Emulated gather-scatter.  */
+                  gcc_assert (!final_mask);
+                  unsigned HOST_WIDE_INT const_nunits = nunits.to_constant ();
+                  if (costing_p)
+                    {
+                      /* For emulated gathers N offset vector element
+                         extracts (we assume the scalar scaling and ptr
+                         + offset add is consumed by the load).  */
+                      inside_cost = record_stmt_cost (cost_vec, const_nunits,
+                                                      vec_to_scalar, stmt_info,
+                                                      0, vect_body);
+                      /* N scalar loads plus gathering them into a
+                         vector.  */
+                      inside_cost
+                        = record_stmt_cost (cost_vec, const_nunits, scalar_load,
+                                            stmt_info, 0, vect_body);
+                      inside_cost
+                        = record_stmt_cost (cost_vec, 1, vec_construct,
+                                            stmt_info, 0, vect_body);
+                      continue;
+                    }
+                  unsigned HOST_WIDE_INT const_offset_nunits
+                    = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype)
+                        .to_constant ();
+                  vec<constructor_elt, va_gc> *ctor_elts;
+                  vec_alloc (ctor_elts, const_nunits);
+                  gimple_seq stmts = NULL;
+                  /* We support offset vectors with more elements
+                     than the data vector for now.  */
+                  unsigned HOST_WIDE_INT factor
+                    = const_offset_nunits / const_nunits;
+                  vec_offset = vec_offsets[j / factor];
+                  unsigned elt_offset = (j % factor) * const_nunits;
+                  tree idx_type = TREE_TYPE (TREE_TYPE (vec_offset));
+                  tree scale = size_int (gs_info.scale);
+                  align = get_object_alignment (DR_REF (first_dr_info->dr));
+                  tree ltype = build_aligned_type (TREE_TYPE (vectype), align);
+                  for (unsigned k = 0; k < const_nunits; ++k)
+                    {
+                      tree boff = size_binop (MULT_EXPR, TYPE_SIZE (idx_type),
+                                              bitsize_int (k + elt_offset));
+                      tree idx
+                        = gimple_build (&stmts, BIT_FIELD_REF, idx_type,
+                                        vec_offset, TYPE_SIZE (idx_type), boff);
+                      idx = gimple_convert (&stmts, sizetype, idx);
+                      idx = gimple_build (&stmts, MULT_EXPR, sizetype, idx,
+                                          scale);
+                      tree ptr = gimple_build (&stmts, PLUS_EXPR,
+                                               TREE_TYPE (dataref_ptr),
+                                               dataref_ptr, idx);
+                      ptr = gimple_convert (&stmts, ptr_type_node, ptr);
+                      tree elt = make_ssa_name (TREE_TYPE (vectype));
+                      tree ref = build2 (MEM_REF, ltype, ptr,
+                                         build_int_cst (ref_type, 0));
+                      new_stmt = gimple_build_assign (elt, ref);
+                      gimple_set_vuse (new_stmt, gimple_vuse (gsi_stmt (*gsi)));
+                      gimple_seq_add_stmt (&stmts, new_stmt);
+                      CONSTRUCTOR_APPEND_ELT (ctor_elts, NULL_TREE, elt);
+                    }
+                  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+                  new_stmt = gimple_build_assign (
+                    NULL_TREE, build_constructor (vectype, ctor_elts));
+                  data_ref = NULL_TREE;
+                }
+
+              vec_dest = vect_create_destination_var (scalar_dest, vectype);
+              /* DATA_REF is null if we've already built the statement.  */
+              if (data_ref)
+                {
+                  vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr));
+                  new_stmt = gimple_build_assign (vec_dest, data_ref);
+                }
+              new_temp = make_ssa_name (vec_dest, new_stmt);
+              gimple_set_lhs (new_stmt, new_temp);
+              vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+
+              /* Store vector loads in the corresponding SLP_NODE.  */
+              if (slp)
+                slp_node->push_vec_def (new_stmt);
+            }
+
+          if (!slp && !costing_p)
+            STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
+        }
+
+      if (!slp && !costing_p)
+        *vec_stmt = STMT_VINFO_VEC_STMTS (stmt_info)[0];
+
+      if (costing_p)
+        {
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "vect_model_load_cost: inside_cost = %u, "
+                             "prologue_cost = %u .\n",
+                             inside_cost, prologue_cost);
+        }
+      return true;
+    }
+
   poly_uint64 group_elt = 0;
   unsigned int inside_cost = 0, prologue_cost = 0;
   for (j = 0; j < ncopies; j++)
@@ -10504,12 +10716,6 @@ vectorizable_load (vec_info *vinfo,
               gcc_assert (!compute_in_loop);
             }
         }
-      else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
-        {
-          vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info,
-                                       slp_node, &gs_info, &dataref_ptr,
-                                       &vec_offsets);
-        }
       else
         dataref_ptr
           = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type,
@@ -10525,7 +10731,7 @@ vectorizable_load (vec_info *vinfo,
       if (dataref_offset)
         dataref_offset = int_const_binop (PLUS_EXPR, dataref_offset,
                                           bump);
-      else if (!STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+      else
         dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi,
                                        stmt_info, bump);
       if (mask)
@@ -10551,7 +10757,7 @@ vectorizable_load (vec_info *vinfo,
             final_mask = prepare_vec_mask (loop_vinfo, mask_vectype,
                                            final_mask, vec_mask, gsi);
 
-          if (i > 0 && !STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+          if (i > 0)
             dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr,
                                            gsi, stmt_info, bump);
         }
@@ -10562,139 +10768,11 @@ vectorizable_load (vec_info *vinfo,
           case dr_aligned:
           case dr_unaligned_supported:
             {
-              unsigned int misalign;
-              unsigned HOST_WIDE_INT align;
-
-              if (memory_access_type == VMAT_GATHER_SCATTER
-                  && gs_info.ifn != IFN_LAST)
-                {
-                  if (costing_p)
-                    {
-                      unsigned int cnunits = vect_nunits_for_cost (vectype);
-                      inside_cost
-                        = record_stmt_cost (cost_vec, cnunits, scalar_load,
-                                            stmt_info, 0, vect_body);
-                      break;
-                    }
-                  if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
-                    vec_offset = vec_offsets[vec_num * j + i];
-                  tree zero = build_zero_cst (vectype);
-                  tree scale = size_int (gs_info.scale);
-
-                  if (gs_info.ifn == IFN_MASK_LEN_GATHER_LOAD)
-                    {
-                      if (loop_lens)
-                        final_len
-                          = vect_get_loop_len (loop_vinfo, gsi, loop_lens,
-                                               vec_num * ncopies, vectype,
-                                               vec_num * j + i, 1);
-                      else
-                        final_len
-                          = build_int_cst (sizetype,
-                                           TYPE_VECTOR_SUBPARTS (vectype));
-                      signed char biasval
-                        = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo);
-                      bias = build_int_cst (intQI_type_node, biasval);
-                      if (!final_mask)
-                        {
-                          mask_vectype = truth_type_for (vectype);
-                          final_mask = build_minus_one_cst (mask_vectype);
-                        }
-                    }
-
-                  gcall *call;
-                  if (final_len && final_mask)
-                    call = gimple_build_call_internal (
-                      IFN_MASK_LEN_GATHER_LOAD, 7, dataref_ptr, vec_offset,
-                      scale, zero, final_mask, final_len, bias);
-                  else if (final_mask)
-                    call
-                      = gimple_build_call_internal (IFN_MASK_GATHER_LOAD, 5,
-                                                    dataref_ptr, vec_offset,
-                                                    scale, zero, final_mask);
-                  else
-                    call
-                      = gimple_build_call_internal (IFN_GATHER_LOAD, 4,
-                                                    dataref_ptr, vec_offset,
-                                                    scale, zero);
-                  gimple_call_set_nothrow (call, true);
-                  new_stmt = call;
-                  data_ref = NULL_TREE;
-                  break;
-                }
-              else if (memory_access_type == VMAT_GATHER_SCATTER)
-                {
-                  /* Emulated gather-scatter.  */
-                  gcc_assert (!final_mask);
-                  unsigned HOST_WIDE_INT const_nunits = nunits.to_constant ();
-                  if (costing_p)
-                    {
-                      /* For emulated gathers N offset vector element
-                         extracts (we assume the scalar scaling and ptr
-                         + offset add is consumed by the load).  */
-                      inside_cost
-                        = record_stmt_cost (cost_vec, const_nunits,
-                                            vec_to_scalar, stmt_info, 0,
-                                            vect_body);
-                      /* N scalar loads plus gathering them into a
-                         vector.  */
-                      inside_cost = record_stmt_cost (cost_vec, const_nunits,
-                                                      scalar_load, stmt_info,
-                                                      0, vect_body);
-                      inside_cost
-                        = record_stmt_cost (cost_vec, 1, vec_construct,
-                                            stmt_info, 0, vect_body);
-                      break;
-                    }
-                  unsigned HOST_WIDE_INT const_offset_nunits
-                    = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype)
-                        .to_constant ();
-                  vec<constructor_elt, va_gc> *ctor_elts;
-                  vec_alloc (ctor_elts, const_nunits);
-                  gimple_seq stmts = NULL;
-                  /* We support offset vectors with more elements
-                     than the data vector for now.  */
-                  unsigned HOST_WIDE_INT factor
-                    = const_offset_nunits / const_nunits;
-                  vec_offset = vec_offsets[j / factor];
-                  unsigned elt_offset = (j % factor) * const_nunits;
-                  tree idx_type = TREE_TYPE (TREE_TYPE (vec_offset));
-                  tree scale = size_int (gs_info.scale);
-                  align = get_object_alignment (DR_REF (first_dr_info->dr));
-                  tree ltype
-                    = build_aligned_type (TREE_TYPE (vectype), align);
-                  for (unsigned k = 0; k < const_nunits; ++k)
-                    {
-                      tree boff = size_binop (MULT_EXPR, TYPE_SIZE (idx_type),
-                                              bitsize_int (k + elt_offset));
-                      tree idx = gimple_build (&stmts, BIT_FIELD_REF,
-                                               idx_type, vec_offset,
-                                               TYPE_SIZE (idx_type), boff);
-                      idx = gimple_convert (&stmts, sizetype, idx);
-                      idx = gimple_build (&stmts, MULT_EXPR, sizetype, idx,
-                                          scale);
-                      tree ptr = gimple_build (&stmts, PLUS_EXPR,
-                                               TREE_TYPE (dataref_ptr),
-                                               dataref_ptr, idx);
-                      ptr = gimple_convert (&stmts, ptr_type_node, ptr);
-                      tree elt = make_ssa_name (TREE_TYPE (vectype));
-                      tree ref = build2 (MEM_REF, ltype, ptr,
-                                         build_int_cst (ref_type, 0));
-                      new_stmt = gimple_build_assign (elt, ref);
-                      gimple_set_vuse (new_stmt,
                                        gimple_vuse (gsi_stmt (*gsi)));
-                      gimple_seq_add_stmt (&stmts, new_stmt);
-                      CONSTRUCTOR_APPEND_ELT (ctor_elts, NULL_TREE, elt);
-                    }
-                  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
-                  new_stmt = gimple_build_assign (
-                    NULL_TREE, build_constructor (vectype, ctor_elts));
-                  data_ref = NULL_TREE;
-                  break;
-                }
-
               if (costing_p)
                 break;
 
+              unsigned int misalign;
+              unsigned HOST_WIDE_INT align;
               align = known_alignment (DR_TARGET_ALIGNMENT (first_dr_info));
               if (alignment_support_scheme == dr_aligned)
                 misalign = 0;
@@ -11156,10 +11234,9 @@ vectorizable_load (vec_info *vinfo,
 
   if (costing_p)
     {
-      gcc_assert (memory_access_type != VMAT_INVARIANT
-                  && memory_access_type != VMAT_ELEMENTWISE
-                  && memory_access_type != VMAT_STRIDED_SLP
-                  && memory_access_type != VMAT_LOAD_STORE_LANES);
+      gcc_assert (memory_access_type == VMAT_CONTIGUOUS
+                  || memory_access_type == VMAT_CONTIGUOUS_REVERSE
+                  || memory_access_type == VMAT_CONTIGUOUS_PERMUTE);
       if (dump_enabled_p ())
         dump_printf_loc (MSG_NOTE, vect_location,
                          "vect_model_load_cost: inside_cost = %u, "