From patchwork Wed Oct 18 05:09:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 154660
Date: Wed, 18 Oct 2023 13:09:15 +0800
From: "Kewen.Lin"
Subject: [PATCH] vect: Cost adjacent vector loads/stores together [PR111784]
To: GCC Patches
Cc: Richard Biener, Richard Sandiford
List-Id: Gcc-patches mailing list

Hi,

As suggested in the comments [1][2], this patch changes how some
adjacent vector loads/stores are costed: instead of costing them one
by one, it costs them together, once, with their total number.  This
helps to fix the exposed regression PR111784 on aarch64, since the
aarch64-specific costing can make different decisions depending on
how it is invoked (once with the total number vs. once per access).

Based on a reduced test case from PR111784, considering only vec_num
is already enough to fix the regression, but the vector loads/stores
across ncopies are also adjacent accesses, so they are handled as
well.

BTW, to keep things simple this patch leaves the costing of
dr_explicit_realign and dr_explicit_realign_optimized alone.
Changing the costing way could make a difference for them, since one
part of their cost depends on targetm.vectorize.builtin_mask_for_load
and is costed according to the number of calls.  IIUC, these two
dr_alignment_support schemes are mainly used for old Power (which
only has 16-byte-aligned vector load/store and no unaligned vector
load/store)?

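To make the difference between the two costing ways concrete, here is a
minimal standalone sketch.  It is not GCC code: toy_store_cost is a
hypothetical stand-in for a target cost hook, and the assumption that
two stores can issue per cycle is made up purely for illustration (it
is not the aarch64 cost model).

```cpp
// Hypothetical target hook (an assumption for illustration only):
// cost of `count` adjacent vector stores, assuming the target can
// issue `stores_per_cycle` stores each cycle.  Rounds up.
unsigned toy_store_cost(unsigned count, unsigned stores_per_cycle = 2) {
  return (count + stores_per_cycle - 1) / stores_per_cycle;
}

// Old costing way: call the hook once per store, always with count 1.
unsigned cost_one_by_one(unsigned n) {
  unsigned inside_cost = 0;
  for (unsigned i = 0; i < n; i++)
    inside_cost += toy_store_cost(1);
  return inside_cost;
}

// New costing way: only count the adjacent stores in the loop, then
// call the hook a single time with the total number.
unsigned cost_together(unsigned n) {
  unsigned n_adjacent_stores = 0;
  for (unsigned i = 0; i < n; i++)
    n_adjacent_stores++;

  unsigned inside_cost = 0;
  if (n_adjacent_stores > 0)
    inside_cost += toy_store_cost(n_adjacent_stores);
  return inside_cost;
}
```

With this toy hook, four adjacent stores cost 4 when costed one by one
and 2 when costed together, while a single store costs the same either
way.  A gap of that kind is what can flip a target's decision.
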
Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu,
powerpc64-linux-gnu P{7,8,9} and powerpc64le-linux-gnu P{8,9,10}.

Is it ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630742.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630744.html

BR,
Kewen
-----
gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_store): Adjust costing way for
	adjacent vector stores, by costing them with the total number
	rather than costing them one by one.
	(vectorizable_load): Adjust costing way for adjacent vector
	loads, by costing them with the total number rather than costing
	them one by one.
---
 gcc/tree-vect-stmts.cc | 137 ++++++++++++++++++++++++++++-------------
 1 file changed, 95 insertions(+), 42 deletions(-)

--
2.31.1

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b3a56498595..af134ff2bf7 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8681,6 +8681,9 @@ vectorizable_store (vec_info *vinfo,
       alias_off = build_int_cst (ref_type, 0);
       stmt_vec_info next_stmt_info = first_stmt_info;
       auto_vec vec_oprnds (ncopies);
+      /* For costing some adjacent vector stores, we'd like to cost with
+         the total number of them once instead of cost each one by one.  */
+      unsigned int n_adjacent_stores = 0;
       for (g = 0; g < group_size; g++)
         {
           running_off = offvar;
@@ -8738,10 +8741,7 @@ vectorizable_store (vec_info *vinfo,
                          store to avoid ICE like 110776.  */
                       if (VECTOR_TYPE_P (ltype)
                           && known_ne (TYPE_VECTOR_SUBPARTS (ltype), 1U))
-                        vect_get_store_cost (vinfo, stmt_info, 1,
-                                             alignment_support_scheme,
-                                             misalignment, &inside_cost,
-                                             cost_vec);
+                        n_adjacent_stores++;
                       else
                         inside_cost
                           += record_stmt_cost (cost_vec, 1, scalar_store,
@@ -8798,11 +8798,18 @@ vectorizable_store (vec_info *vinfo,
             break;
         }
 
-      if (costing_p && dump_enabled_p ())
-        dump_printf_loc (MSG_NOTE, vect_location,
-                         "vect_model_store_cost: inside_cost = %d, "
-                         "prologue_cost = %d .\n",
-                         inside_cost, prologue_cost);
+      if (costing_p)
+        {
+          if (n_adjacent_stores > 0)
+            vect_get_store_cost (vinfo, stmt_info, n_adjacent_stores,
+                                 alignment_support_scheme, misalignment,
+                                 &inside_cost, cost_vec);
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "vect_model_store_cost: inside_cost = %d, "
+                             "prologue_cost = %d .\n",
+                             inside_cost, prologue_cost);
+        }
 
       return true;
     }
@@ -8909,6 +8916,9 @@ vectorizable_store (vec_info *vinfo,
     {
       gcc_assert (!slp && grouped_store);
       unsigned inside_cost = 0, prologue_cost = 0;
+      /* For costing some adjacent vector stores, we'd like to cost with
+         the total number of them once instead of cost each one by one.  */
+      unsigned int n_adjacent_stores = 0;
       for (j = 0; j < ncopies; j++)
         {
           gimple *new_stmt;
@@ -8974,10 +8984,7 @@ vectorizable_store (vec_info *vinfo,
 
           if (costing_p)
             {
-              for (i = 0; i < vec_num; i++)
-                vect_get_store_cost (vinfo, stmt_info, 1,
-                                     alignment_support_scheme, misalignment,
-                                     &inside_cost, cost_vec);
+              n_adjacent_stores += vec_num;
               continue;
             }
 
@@ -9067,11 +9074,18 @@ vectorizable_store (vec_info *vinfo,
             STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
         }
 
-      if (costing_p && dump_enabled_p ())
-        dump_printf_loc (MSG_NOTE, vect_location,
-                         "vect_model_store_cost: inside_cost = %d, "
-                         "prologue_cost = %d .\n",
-                         inside_cost, prologue_cost);
+      if (costing_p)
+        {
+          if (n_adjacent_stores > 0)
+            vect_get_store_cost (vinfo, stmt_info, n_adjacent_stores,
+                                 alignment_support_scheme, misalignment,
+                                 &inside_cost, cost_vec);
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "vect_model_store_cost: inside_cost = %d, "
+                             "prologue_cost = %d .\n",
+                             inside_cost, prologue_cost);
+        }
 
       return true;
     }
@@ -9290,6 +9304,9 @@ vectorizable_store (vec_info *vinfo,
               || memory_access_type == VMAT_CONTIGUOUS_REVERSE);
 
   unsigned inside_cost = 0, prologue_cost = 0;
+  /* For costing some adjacent vector stores, we'd like to cost with
+     the total number of them once instead of cost each one by one.  */
+  unsigned int n_adjacent_stores = 0;
   auto_vec result_chain (group_size);
   auto_vec vec_oprnds;
   for (j = 0; j < ncopies; j++)
@@ -9451,9 +9468,7 @@ vectorizable_store (vec_info *vinfo,
 
           if (costing_p)
             {
-              vect_get_store_cost (vinfo, stmt_info, 1,
-                                   alignment_support_scheme, misalignment,
-                                   &inside_cost, cost_vec);
+              n_adjacent_stores++;
 
               if (!slp)
                 {
@@ -9623,6 +9638,11 @@ vectorizable_store (vec_info *vinfo,
 
   if (costing_p)
     {
+      if (n_adjacent_stores > 0)
+        vect_get_store_cost (vinfo, stmt_info, n_adjacent_stores,
+                             alignment_support_scheme, misalignment,
+                             &inside_cost, cost_vec);
+
       /* When vectorizing a store into the function result assign
          a penalty if the function returns in a multi-register location.
          In this case we assume we'll end up with having to spill the
@@ -10337,6 +10357,9 @@ vectorizable_load (vec_info *vinfo,
       unsigned HOST_WIDE_INT
         elsz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
       unsigned int n_groups = 0;
+      /* For costing some adjacent vector loads, we'd like to cost with
+         the total number of them once instead of cost each one by one.  */
+      unsigned int n_adjacent_loads = 0;
       for (j = 0; j < ncopies; j++)
         {
           if (nloads > 1 && !costing_p)
@@ -10350,10 +10373,7 @@ vectorizable_load (vec_info *vinfo,
                      avoid ICE, see PR110776.  */
                   if (VECTOR_TYPE_P (ltype)
                       && memory_access_type != VMAT_ELEMENTWISE)
-                    vect_get_load_cost (vinfo, stmt_info, 1,
-                                        alignment_support_scheme, misalignment,
-                                        false, &inside_cost, nullptr, cost_vec,
-                                        cost_vec, true);
+                    n_adjacent_loads++;
                   else
                     inside_cost += record_stmt_cost (cost_vec, 1, scalar_load,
                                                      stmt_info, 0, vect_body);
@@ -10447,11 +10467,19 @@ vectorizable_load (vec_info *vinfo,
                                           false, &n_perms);
         }
 
-      if (costing_p && dump_enabled_p ())
-        dump_printf_loc (MSG_NOTE, vect_location,
-                         "vect_model_load_cost: inside_cost = %u, "
-                         "prologue_cost = 0 .\n",
-                         inside_cost);
+      if (costing_p)
+        {
+          if (n_adjacent_loads > 0)
+            vect_get_load_cost (vinfo, stmt_info, n_adjacent_loads,
+                                alignment_support_scheme, misalignment, false,
+                                &inside_cost, nullptr, cost_vec, cost_vec,
+                                true);
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "vect_model_load_cost: inside_cost = %u, "
+                             "prologue_cost = 0 .\n",
+                             inside_cost);
+        }
 
       return true;
     }
@@ -10756,6 +10784,9 @@ vectorizable_load (vec_info *vinfo,
       gcc_assert (grouped_load && !slp);
 
       unsigned int inside_cost = 0, prologue_cost = 0;
+      /* For costing some adjacent vector loads, we'd like to cost with
+         the total number of them once instead of cost each one by one.  */
+      unsigned int n_adjacent_loads = 0;
       for (j = 0; j < ncopies; j++)
         {
           if (costing_p)
@@ -10787,9 +10818,7 @@ vectorizable_load (vec_info *vinfo,
                                         true);
                     }
                 }
-              vect_get_load_cost (vinfo, stmt_info, 1, alignment_support_scheme,
-                                  misalignment, false, &inside_cost,
-                                  &prologue_cost, cost_vec, cost_vec, true);
+              n_adjacent_loads++;
               continue;
             }
 
@@ -10891,11 +10920,19 @@ vectorizable_load (vec_info *vinfo,
             *vec_stmt = STMT_VINFO_VEC_STMTS (stmt_info)[0];
         }
 
-      if (costing_p && dump_enabled_p ())
-        dump_printf_loc (MSG_NOTE, vect_location,
-                         "vect_model_load_cost: inside_cost = %u, "
-                         "prologue_cost = %u .\n",
-                         inside_cost, prologue_cost);
+      if (costing_p)
+        {
+          if (n_adjacent_loads > 0)
+            vect_get_load_cost (vinfo, stmt_info, n_adjacent_loads,
+                                alignment_support_scheme, misalignment, false,
+                                &inside_cost, &prologue_cost, cost_vec,
+                                cost_vec, true);
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "vect_model_load_cost: inside_cost = %u, "
+                             "prologue_cost = %u .\n",
+                             inside_cost, prologue_cost);
+        }
 
       return true;
     }
@@ -11111,6 +11148,9 @@ vectorizable_load (vec_info *vinfo,
 
   poly_uint64 group_elt = 0;
   unsigned int inside_cost = 0, prologue_cost = 0;
+  /* For costing some adjacent vector loads, we'd like to cost with
+     the total number of them once instead of cost each one by one.  */
+  unsigned int n_adjacent_loads = 0;
   for (j = 0; j < ncopies; j++)
     {
       /* 1. Create the vector or array pointer update chain.  */
@@ -11505,10 +11545,18 @@ vectorizable_load (vec_info *vinfo,
                       || memory_access_type == VMAT_CONTIGUOUS_REVERSE
                       || (memory_access_type == VMAT_CONTIGUOUS_PERMUTE
                           && (!grouped_load || first_stmt_info_p)))
-                    vect_get_load_cost (vinfo, stmt_info, 1,
-                                        alignment_support_scheme, misalignment,
-                                        add_realign_cost, &inside_cost,
-                                        &prologue_cost, cost_vec, cost_vec, true);
+                    {
+                      /* Leave realign cases alone to keep them simple.  */
+                      if (alignment_support_scheme == dr_explicit_realign_optimized
+                          || alignment_support_scheme == dr_explicit_realign)
+                        vect_get_load_cost (vinfo, stmt_info, 1,
+                                            alignment_support_scheme, misalignment,
+                                            add_realign_cost, &inside_cost,
+                                            &prologue_cost, cost_vec, cost_vec,
+                                            true);
+                      else
+                        n_adjacent_loads++;
+                    }
                 }
               else
                 {
@@ -11679,6 +11727,11 @@ vectorizable_load (vec_info *vinfo,
       gcc_assert (memory_access_type == VMAT_CONTIGUOUS
                   || memory_access_type == VMAT_CONTIGUOUS_REVERSE
                   || memory_access_type == VMAT_CONTIGUOUS_PERMUTE);
+      if (n_adjacent_loads > 0)
+        vect_get_load_cost (vinfo, stmt_info, n_adjacent_loads,
+                            alignment_support_scheme, misalignment, false,
+                            &inside_cost, &prologue_cost, cost_vec, cost_vec,
+                            true);
       if (dump_enabled_p ())
         dump_printf_loc (MSG_NOTE, vect_location,
                          "vect_model_load_cost: inside_cost = %u, "