From patchwork Thu Sep 14 03:11:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 139271
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com
Subject: [PATCH 06/10] vect: Adjust vectorizable_store costing on VMAT_LOAD_STORE_LANES
Date: Wed, 13 Sep 2023 22:11:55 -0500
Message-Id: <048c90cf62145799aa31e3ca4edd6f7adc911a6c.1694657494.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Kewen Lin via Gcc-patches
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling on
VMAT_LOAD_STORE_LANES in function vectorizable_store.  We
don't call function vect_model_store_cost for it any more.

It's the case of interleaving stores, so it skips all stmts
except first_stmt_info and considers the whole group when
costing first_stmt_info.  This patch shouldn't have any
functional changes.

gcc/ChangeLog:

        * tree-vect-stmts.cc (vect_model_store_cost): Assert it will never
        get VMAT_LOAD_STORE_LANES.
        (vectorizable_store): Adjust the cost handling on
        VMAT_LOAD_STORE_LANES without calling vect_model_store_cost.
        Factor out new lambda function update_prologue_cost.
---
 gcc/tree-vect-stmts.cc | 110 ++++++++++++++++++++++++++++-------------
 1 file changed, 75 insertions(+), 35 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 3d01168080a..fbd16b8a487 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -966,7 +966,8 @@ vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies,
 {
   gcc_assert (memory_access_type != VMAT_GATHER_SCATTER
               && memory_access_type != VMAT_ELEMENTWISE
-              && memory_access_type != VMAT_STRIDED_SLP);
+              && memory_access_type != VMAT_STRIDED_SLP
+              && memory_access_type != VMAT_LOAD_STORE_LANES);
   unsigned int inside_cost = 0, prologue_cost = 0;
   stmt_vec_info first_stmt_info = stmt_info;
   bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info);
@@ -8408,7 +8409,8 @@ vectorizable_store (vec_info *vinfo,
       if (grouped_store
           && !slp
           && first_stmt_info != stmt_info
-          && memory_access_type == VMAT_ELEMENTWISE)
+          && (memory_access_type == VMAT_ELEMENTWISE
+              || memory_access_type == VMAT_LOAD_STORE_LANES))
         return true;
     }
   gcc_assert (memory_access_type == STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info));
@@ -8479,6 +8481,31 @@ vectorizable_store (vec_info *vinfo,
     dump_printf_loc (MSG_NOTE, vect_location, "transform store. ncopies = %d\n",
                      ncopies);

+  /* Check if we need to update prologue cost for invariant,
+     and update it accordingly if so.  If it's not for
+     interleaving store, we can just check vls_type; but if
+     it's for interleaving store, need to check the def_type
+     of the stored value since the current vls_type is just
+     for first_stmt_info.  */
+  auto update_prologue_cost = [&](unsigned *prologue_cost, tree store_rhs)
+  {
+    gcc_assert (costing_p);
+    if (slp)
+      return;
+    if (grouped_store)
+      {
+        gcc_assert (store_rhs);
+        enum vect_def_type cdt;
+        gcc_assert (vect_is_simple_use (store_rhs, vinfo, &cdt));
+        if (cdt != vect_constant_def && cdt != vect_external_def)
+          return;
+      }
+    else if (vls_type != VLS_STORE_INVARIANT)
+      return;
+    *prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec, stmt_info,
+                                        0, vect_prologue);
+  };
+
   if (memory_access_type == VMAT_ELEMENTWISE
       || memory_access_type == VMAT_STRIDED_SLP)
     {
@@ -8646,14 +8673,8 @@ vectorizable_store (vec_info *vinfo,
           if (!costing_p)
             vect_get_vec_defs (vinfo, next_stmt_info, slp_node, ncopies, op,
                                &vec_oprnds);
-          else if (!slp)
-            {
-              enum vect_def_type cdt;
-              gcc_assert (vect_is_simple_use (op, vinfo, &cdt));
-              if (cdt == vect_constant_def || cdt == vect_external_def)
-                prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec,
-                                                   stmt_info, 0, vect_prologue);
-            }
+          else
+            update_prologue_cost (&prologue_cost, op);
           unsigned int group_el = 0;
           unsigned HOST_WIDE_INT elsz
             = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
@@ -8857,13 +8878,7 @@ vectorizable_store (vec_info *vinfo,
   if (memory_access_type == VMAT_LOAD_STORE_LANES)
     {
       gcc_assert (!slp && grouped_store);
-      if (costing_p)
-        {
-          vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type,
-                                 alignment_support_scheme, misalignment,
-                                 vls_type, slp_node, cost_vec);
-          return true;
-        }
+      unsigned inside_cost = 0, prologue_cost = 0;
       for (j = 0; j < ncopies; j++)
         {
           gimple *new_stmt;
@@ -8879,29 +8894,39 @@ vectorizable_store (vec_info *vinfo,
                      DR_GROUP_SIZE is the exact number of stmts in the chain.
                      Therefore, NEXT_STMT_INFO can't be NULL_TREE.  */
                   op = vect_get_store_rhs (next_stmt_info);
-                  vect_get_vec_defs_for_operand (vinfo, next_stmt_info, ncopies,
-                                                 op, gvec_oprnds[i]);
-                  vec_oprnd = (*gvec_oprnds[i])[0];
-                  dr_chain.quick_push (vec_oprnd);
+                  if (costing_p)
+                    update_prologue_cost (&prologue_cost, op);
+                  else
+                    {
+                      vect_get_vec_defs_for_operand (vinfo, next_stmt_info,
+                                                     ncopies, op,
+                                                     gvec_oprnds[i]);
+                      vec_oprnd = (*gvec_oprnds[i])[0];
+                      dr_chain.quick_push (vec_oprnd);
+                    }
                   next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
                 }
-              if (mask)
+
+              if (!costing_p)
                 {
-                  vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies,
-                                                 mask, &vec_masks,
-                                                 mask_vectype);
-                  vec_mask = vec_masks[0];
-                }
+                  if (mask)
+                    {
+                      vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies,
+                                                     mask, &vec_masks,
+                                                     mask_vectype);
+                      vec_mask = vec_masks[0];
+                    }

-              /* We should have catched mismatched types earlier.  */
-              gcc_assert (
-                useless_type_conversion_p (vectype, TREE_TYPE (vec_oprnd)));
-              dataref_ptr
-                = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type,
-                                            NULL, offset, &dummy, gsi,
-                                            &ptr_incr, false, bump);
+                  /* We should have catched mismatched types earlier.  */
+                  gcc_assert (
+                    useless_type_conversion_p (vectype, TREE_TYPE (vec_oprnd)));
+                  dataref_ptr
+                    = vect_create_data_ref_ptr (vinfo, first_stmt_info,
+                                                aggr_type, NULL, offset, &dummy,
+                                                gsi, &ptr_incr, false, bump);
+                }
             }
-          else
+          else if (!costing_p)
             {
               gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo));
               /* DR_CHAIN is then used as an input to
@@ -8917,6 +8942,15 @@ vectorizable_store (vec_info *vinfo,
                                            stmt_info, bump);
             }

+          if (costing_p)
+            {
+              for (i = 0; i < vec_num; i++)
+                vect_get_store_cost (vinfo, stmt_info, 1,
+                                     alignment_support_scheme, misalignment,
+                                     &inside_cost, cost_vec);
+              continue;
+            }
+
           /* Get an array into which we can store the individual vectors.  */
           tree vec_array = create_vector_array (vectype, vec_num);

@@ -9003,6 +9037,12 @@ vectorizable_store (vec_info *vinfo,
           STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
         }

+      if (costing_p && dump_enabled_p ())
+        dump_printf_loc (MSG_NOTE, vect_location,
+                         "vect_model_store_cost: inside_cost = %d, "
+                         "prologue_cost = %d .\n",
+                         inside_cost, prologue_cost);
+
       return true;
     }
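
For reviewers who want to sanity-check the numbers this costing walk should produce, here is a rough standalone model of the VMAT_LOAD_STORE_LANES path. It is only a sketch and not GCC code: DefKind, LanesCost, model_lanes_store_cost and the two unit costs are hypothetical names made up for illustration, and it assumes the walk over the group runs only for the first copy (j == 0), which the hunk context suggests but does not show. Each invariant stored value (constant or external def) adds one scalar_to_vec style prologue cost, and each copy adds vec_num store costs.

```cpp
#include <vector>

// Hypothetical stand-ins for GCC's vect_def_type values; only the
// distinction "invariant" vs. "defined inside the loop" matters here.
enum class DefKind { Internal, Constant, External };

struct LanesCost {
  unsigned inside = 0;    // accumulated per-copy store cost
  unsigned prologue = 0;  // one-off cost of splatting invariant values
};

// Toy model of the costing walk for one interleaved store group.
// unit_store_cost and unit_splat_cost are made-up target costs, not
// values taken from any real GCC cost model.
inline LanesCost
model_lanes_store_cost (const std::vector<DefKind> &group_rhs,
                        unsigned ncopies, unsigned vec_num,
                        unsigned unit_store_cost = 1,
                        unsigned unit_splat_cost = 1)
{
  LanesCost cost;
  for (unsigned j = 0; j < ncopies; ++j)
    {
      // Assumption: the stored values of the whole group are examined
      // once, on the first copy, mirroring update_prologue_cost.
      if (j == 0)
        for (DefKind kind : group_rhs)
          if (kind == DefKind::Constant || kind == DefKind::External)
            cost.prologue += unit_splat_cost;

      // One store cost per vector in the group for every copy.
      for (unsigned i = 0; i < vec_num; ++i)
        cost.inside += unit_store_cost;
    }
  return cost;
}
```

With a group of three stored values where two are invariant, ncopies = 2 and vec_num = 3, this model accumulates an inside cost of six unit stores and a prologue cost of two unit splats, all charged while costing first_stmt_info.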