From patchwork Tue Jun 13 02:03:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 107018
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com,
 segher@kernel.crashing.org, bergner@linux.ibm.com
Subject: [PATCH 6/9] vect: Adjust vectorizable_load costing on
 VMAT_LOAD_STORE_LANES
Date: Mon, 12 Jun 2023 21:03:27 -0500
Message-Id: <1a263aa46335ad08c0cd198b4c2075560a3ed44d.1686573640.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling for VMAT_LOAD_STORE_LANES
in function vectorizable_load, so that vect_model_load_cost is
no longer called for it.

The new code follows what vect_model_load_cost does, so it
should not cause any functional change.

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_load): Adjust the cost handling on
        VMAT_LOAD_STORE_LANES without calling vect_model_load_cost.
        (vect_model_load_cost): Remove VMAT_LOAD_STORE_LANES related handling
        and assert it will never get VMAT_LOAD_STORE_LANES.
---
 gcc/tree-vect-stmts.cc | 73 ++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 31 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a3fd0bf879e..4c5ce2ab278 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1137,7 +1137,8 @@ vect_model_load_cost (vec_info *vinfo,
   gcc_assert (memory_access_type != VMAT_GATHER_SCATTER
               && memory_access_type != VMAT_INVARIANT
               && memory_access_type != VMAT_ELEMENTWISE
-              && memory_access_type != VMAT_STRIDED_SLP);
+              && memory_access_type != VMAT_STRIDED_SLP
+              && memory_access_type != VMAT_LOAD_STORE_LANES);
 
   unsigned int inside_cost = 0, prologue_cost = 0;
   bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info);
@@ -1176,31 +1177,6 @@ vect_model_load_cost (vec_info *vinfo,
      once per group anyhow.  */
   bool first_stmt_p = (first_stmt_info == stmt_info);
 
-  /* An IFN_LOAD_LANES will load all its vector results, regardless of which
-     ones we actually need.  Account for the cost of unused results.  */
-  if (first_stmt_p && !slp_node && memory_access_type == VMAT_LOAD_STORE_LANES)
-    {
-      unsigned int gaps = DR_GROUP_SIZE (first_stmt_info);
-      stmt_vec_info next_stmt_info = first_stmt_info;
-      do
-        {
-          gaps -= 1;
-          next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
-        }
-      while (next_stmt_info);
-      if (gaps)
-        {
-          if (dump_enabled_p ())
-            dump_printf_loc (MSG_NOTE, vect_location,
-                             "vect_model_load_cost: %d unused vectors.\n",
-                             gaps);
-          vect_get_load_cost (vinfo, stmt_info, ncopies * gaps,
-                              alignment_support_scheme, misalignment, false,
-                              &inside_cost, &prologue_cost,
-                              cost_vec, cost_vec, true);
-        }
-    }
-
   /* We assume that the cost of a single load-lanes instruction is
      equivalent to the cost of DR_GROUP_SIZE separate loads.  If a grouped
      access is instead being provided by a load-and-permute operation,
@@ -10110,7 +10086,7 @@ vectorizable_load (vec_info *vinfo,
     }
   tree vec_mask = NULL_TREE;
   poly_uint64 group_elt = 0;
-  unsigned int inside_cost = 0;
+  unsigned int inside_cost = 0, prologue_cost = 0;
   for (j = 0; j < ncopies; j++)
     {
       /* 1. Create the vector or array pointer update chain.  */
@@ -10190,8 +10166,42 @@ vectorizable_load (vec_info *vinfo,
         dr_chain.create (vec_num);
 
       gimple *new_stmt = NULL;
-      if (memory_access_type == VMAT_LOAD_STORE_LANES && !costing_p)
+      if (memory_access_type == VMAT_LOAD_STORE_LANES)
         {
+          if (costing_p)
+            {
+              /* An IFN_LOAD_LANES will load all its vector results,
+                 regardless of which ones we actually need.  Account
+                 for the cost of unused results.  */
+              if (grouped_load && first_stmt_info == stmt_info)
+                {
+                  unsigned int gaps = DR_GROUP_SIZE (first_stmt_info);
+                  stmt_vec_info next_stmt_info = first_stmt_info;
+                  do
+                    {
+                      gaps -= 1;
+                      next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
+                    }
+                  while (next_stmt_info);
+                  if (gaps)
+                    {
+                      if (dump_enabled_p ())
+                        dump_printf_loc (MSG_NOTE, vect_location,
+                                         "vect_model_load_cost: %d "
+                                         "unused vectors.\n",
+                                         gaps);
+                      vect_get_load_cost (vinfo, stmt_info, gaps,
+                                          alignment_support_scheme,
+                                          misalignment, false, &inside_cost,
+                                          &prologue_cost, cost_vec, cost_vec,
+                                          true);
+                    }
+                }
+              vect_get_load_cost (vinfo, stmt_info, 1, alignment_support_scheme,
+                                  misalignment, false, &inside_cost,
+                                  &prologue_cost, cost_vec, cost_vec, true);
+              continue;
+            }
           tree vec_array;
 
           vec_array = create_vector_array (vectype, vec_num);
@@ -10771,13 +10781,14 @@ vec_num_loop_costing_end:
 
   if (costing_p)
     {
-      if (memory_access_type == VMAT_GATHER_SCATTER)
+      if (memory_access_type == VMAT_GATHER_SCATTER
+          || memory_access_type == VMAT_LOAD_STORE_LANES)
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_NOTE, vect_location,
                              "vect_model_load_cost: inside_cost = %u, "
-                             "prologue_cost = 0 .\n",
-                             inside_cost);
+                             "prologue_cost = %u .\n",
+                             inside_cost, prologue_cost);
         }
       else
         vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type,