From patchwork Tue Jun 13 02:03:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 107016 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: from trout.aus.stglabs.ibm.com (unknown [9.40.194.100]) by 
smtpav04.fra02v.mail.ibm.com (Postfix) with ESMTP; Tue, 13 Jun 2023 02:03:42 +0000 (GMT) To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, bergner@linux.ibm.com Subject: [PATCH 1/9] vect: Move vect_model_load_cost next to the transform in vectorizable_load Date: Mon, 12 Jun 2023 21:03:22 -0500 Message-Id: X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 2J0PigIBd2vBtkufuH_x7eLqAvvnw3HZ X-Proofpoint-GUID: LLmldVoNeCc4-gqSznHy8e1UdHh10JHo X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.573,FMLib:17.11.176.26 definitions=2023-06-12_18,2023-06-12_02,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 impostorscore=0 suspectscore=0 spamscore=0 adultscore=0 lowpriorityscore=0 mlxscore=0 priorityscore=1501 malwarescore=0 clxscore=1015 mlxlogscore=999 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2305260000 definitions=main-2306130016 X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Kewen Lin via Gcc-patches From: "Kewen.Lin" Reply-To: Kewen Lin Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768551256242239098?= X-GMAIL-MSGID: =?utf-8?q?1768551256242239098?= This patch is an initial patch to move costing next to the 
transform, it still adopts vect_model_load_cost for costing but moves and duplicates it down according to the handlings of different vect_memory_access_types, hope it can make the subsequent patches easy to review. This patch should not have any functional changes. gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Move and duplicate the call to vect_model_load_cost down to some different transform paths according to the handlings of different vect_memory_access_types. --- gcc/tree-vect-stmts.cc | 86 ++++++++++++++++++++++++++++-------------- 1 file changed, 57 insertions(+), 29 deletions(-) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index a7acc032d47..44514658be3 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -9430,7 +9430,9 @@ vectorizable_load (vec_info *vinfo, } } - if (!vec_stmt) /* transformation not required. */ + bool costing_p = !vec_stmt; + + if (costing_p) /* transformation not required. */ { if (slp_node && mask @@ -9464,17 +9466,13 @@ vectorizable_load (vec_info *vinfo, vinfo->any_known_not_updated_vssa = true; STMT_VINFO_TYPE (stmt_info) = load_vec_info_type; - vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type, - alignment_support_scheme, misalignment, - &gs_info, slp_node, cost_vec); - return true; } if (!slp) gcc_assert (memory_access_type == STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)); - if (dump_enabled_p ()) + if (dump_enabled_p () && !costing_p) dump_printf_loc (MSG_NOTE, vect_location, "transform load. 
ncopies = %d\n", ncopies); @@ -9485,13 +9483,26 @@ vectorizable_load (vec_info *vinfo, if (memory_access_type == VMAT_GATHER_SCATTER && gs_info.decl) { - vect_build_gather_load_calls (vinfo, - stmt_info, gsi, vec_stmt, &gs_info, mask); + if (costing_p) + vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type, + alignment_support_scheme, misalignment, &gs_info, + slp_node, cost_vec); + else + vect_build_gather_load_calls (vinfo, stmt_info, gsi, vec_stmt, &gs_info, + mask); return true; } if (memory_access_type == VMAT_INVARIANT) { + if (costing_p) + { + vect_model_load_cost (vinfo, stmt_info, ncopies, vf, + memory_access_type, alignment_support_scheme, + misalignment, &gs_info, slp_node, cost_vec); + return true; + } + gcc_assert (!grouped_load && !mask && !bb_vinfo); /* If we have versioned for aliasing or the loop doesn't have any data dependencies that would preclude this, @@ -9548,6 +9559,14 @@ vectorizable_load (vec_info *vinfo, if (memory_access_type == VMAT_ELEMENTWISE || memory_access_type == VMAT_STRIDED_SLP) { + if (costing_p) + { + vect_model_load_cost (vinfo, stmt_info, ncopies, vf, + memory_access_type, alignment_support_scheme, + misalignment, &gs_info, slp_node, cost_vec); + return true; + } + gimple_stmt_iterator incr_gsi; bool insert_after; tree offvar; @@ -9989,17 +10008,20 @@ vectorizable_load (vec_info *vinfo, here, since we can't guarantee first_stmt_info DR has been initialized yet, use first_stmt_info_for_drptr DR by bumping the distance from first_stmt_info DR instead as below. 
*/ - if (!diff_first_stmt_info) - msq = vect_setup_realignment (vinfo, - first_stmt_info, gsi, &realignment_token, - alignment_support_scheme, NULL_TREE, - &at_loop); - if (alignment_support_scheme == dr_explicit_realign_optimized) - { - phi = as_a (SSA_NAME_DEF_STMT (msq)); - offset = size_binop (MINUS_EXPR, TYPE_SIZE_UNIT (vectype), - size_one_node); - gcc_assert (!first_stmt_info_for_drptr); + if (!costing_p) + { + if (!diff_first_stmt_info) + msq = vect_setup_realignment (vinfo, first_stmt_info, gsi, + &realignment_token, + alignment_support_scheme, NULL_TREE, + &at_loop); + if (alignment_support_scheme == dr_explicit_realign_optimized) + { + phi = as_a (SSA_NAME_DEF_STMT (msq)); + offset = size_binop (MINUS_EXPR, TYPE_SIZE_UNIT (vectype), + size_one_node); + gcc_assert (!first_stmt_info_for_drptr); + } } } else @@ -10020,8 +10042,9 @@ vectorizable_load (vec_info *vinfo, else if (memory_access_type == VMAT_GATHER_SCATTER) { aggr_type = elem_type; - vect_get_strided_load_store_ops (stmt_info, loop_vinfo, &gs_info, - &bump, &vec_offset); + if (!costing_p) + vect_get_strided_load_store_ops (stmt_info, loop_vinfo, &gs_info, &bump, + &vec_offset); } else { @@ -10035,7 +10058,7 @@ vectorizable_load (vec_info *vinfo, auto_vec vec_offsets; auto_vec vec_masks; - if (mask) + if (mask && !costing_p) { if (slp_node) vect_get_slp_defs (SLP_TREE_CHILDREN (slp_node)[mask_index], @@ -10049,7 +10072,7 @@ vectorizable_load (vec_info *vinfo, for (j = 0; j < ncopies; j++) { /* 1. Create the vector or array pointer update chain. 
*/ - if (j == 0) + if (j == 0 && !costing_p) { bool simd_lane_access_p = STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) != 0; @@ -10108,7 +10131,7 @@ vectorizable_load (vec_info *vinfo, if (mask) vec_mask = vec_masks[0]; } - else + else if (!costing_p) { gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo)); if (dataref_offset) @@ -10125,7 +10148,7 @@ vectorizable_load (vec_info *vinfo, dr_chain.create (vec_num); gimple *new_stmt = NULL; - if (memory_access_type == VMAT_LOAD_STORE_LANES) + if (memory_access_type == VMAT_LOAD_STORE_LANES && !costing_p) { tree vec_array; @@ -10177,7 +10200,7 @@ vectorizable_load (vec_info *vinfo, /* Record that VEC_ARRAY is now dead. */ vect_clobber_variable (vinfo, stmt_info, gsi, vec_array); } - else + else if (!costing_p) { for (i = 0; i < vec_num; i++) { @@ -10631,7 +10654,7 @@ vectorizable_load (vec_info *vinfo, if (slp && !slp_perm) continue; - if (slp_perm) + if (slp_perm && !costing_p) { unsigned n_perms; /* For SLP we know we've seen all possible uses of dr_chain so @@ -10643,7 +10666,7 @@ vectorizable_load (vec_info *vinfo, nullptr, true); gcc_assert (ok); } - else + else if (!costing_p) { if (grouped_load) { @@ -10659,9 +10682,14 @@ vectorizable_load (vec_info *vinfo, } dr_chain.release (); } - if (!slp) + if (!slp && !costing_p) *vec_stmt = STMT_VINFO_VEC_STMTS (stmt_info)[0]; + if (costing_p) + vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type, + alignment_support_scheme, misalignment, &gs_info, + slp_node, cost_vec); + return true; } From patchwork Tue Jun 13 02:03:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 107019 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp251417vqr; Mon, 12 Jun 2023 19:06:33 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ68+7hj1hmqvGyYn7HVACLALZoyyco08yAK3Z5RcxqmN05xCByhX+VISZh4yA2Tw2qpMHq7 X-Received: by 
2002:a17:906:730b:b0:973:bcf6:1d4 with SMTP id di11-20020a170906730b00b00973bcf601d4mr13215176ejc.76.1686621993753; Mon, 12 Jun 2023 19:06:33 -0700 (PDT) Received: from trout.aus.stglabs.ibm.com (unknown [9.40.194.100]) by smtpav04.fra02v.mail.ibm.com (Postfix) with ESMTP; Tue, 13 Jun 2023 02:03:43 +0000 (GMT) To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, bergner@linux.ibm.com Subject: [PATCH 2/9] vect: Adjust vectorizable_load costing on 
VMAT_GATHER_SCATTER && gs_info.decl Date: Mon, 12 Jun 2023 21:03:23 -0500 Message-Id: <9bad792a4bcef35fbd9906245bf3493672b340fe.1686573640.git.linkw@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: Z-tFxoFxGVOAJ2XOlvMjuOK3EcjvOhDb X-Proofpoint-GUID: c-yv9Djaz3OGVHKcHdarypMAH1gdqogk X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.573,FMLib:17.11.176.26 definitions=2023-06-12_18,2023-06-12_02,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 impostorscore=0 suspectscore=0 spamscore=0 adultscore=0 lowpriorityscore=0 mlxscore=0 priorityscore=1501 malwarescore=0 clxscore=1015 mlxlogscore=999 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2305260000 definitions=main-2306130016 X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Kewen Lin via Gcc-patches From: "Kewen.Lin" Reply-To: Kewen Lin Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768551343692288197?= X-GMAIL-MSGID: =?utf-8?q?1768551343692288197?= This patch adds one extra argument cost_vec to function vect_build_gather_load_calls, so that we can do costing next to the transform in vect_build_gather_load_calls. 
For now, the implementation just follows the handlings in vect_model_load_cost, it isn't so good, so placing one FIXME for any further improvement. This patch should not cause any functional changes. gcc/ChangeLog: * tree-vect-stmts.cc (vect_build_gather_load_calls): Add the handlings on costing with one extra argument cost_vec. (vectorizable_load): Adjust the call to vect_build_gather_load_calls. (vect_model_load_cost): Assert it won't get VMAT_GATHER_SCATTER with gs_info.decl set any more. --- gcc/tree-vect-stmts.cc | 31 +++++++++++++++++++++++-------- 1 file changed, 23 insertions(+), 8 deletions(-) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 44514658be3..744cdf40e26 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -1135,6 +1135,8 @@ vect_model_load_cost (vec_info *vinfo, slp_tree slp_node, stmt_vector_for_cost *cost_vec) { + gcc_assert (memory_access_type != VMAT_GATHER_SCATTER || !gs_info->decl); + unsigned int inside_cost = 0, prologue_cost = 0; bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info); @@ -2819,7 +2821,8 @@ vect_build_gather_load_calls (vec_info *vinfo, stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, gimple **vec_stmt, gather_scatter_info *gs_info, - tree mask) + tree mask, + stmt_vector_for_cost *cost_vec) { loop_vec_info loop_vinfo = dyn_cast (vinfo); class loop *loop = LOOP_VINFO_LOOP (loop_vinfo); @@ -2831,6 +2834,23 @@ vect_build_gather_load_calls (vec_info *vinfo, stmt_vec_info stmt_info, poly_uint64 gather_off_nunits = TYPE_VECTOR_SUBPARTS (gs_info->offset_vectype); + /* FIXME: Keep the previous costing way in vect_model_load_cost by costing + N scalar loads, but it should be tweaked to use target specific costs + on related gather load calls. 
*/ + if (!vec_stmt) + { + unsigned int assumed_nunits = vect_nunits_for_cost (vectype); + unsigned int inside_cost; + inside_cost = record_stmt_cost (cost_vec, ncopies * assumed_nunits, + scalar_load, stmt_info, 0, vect_body); + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_model_load_cost: inside_cost = %d, " + "prologue_cost = 0 .\n", + inside_cost); + return; + } + tree arglist = TYPE_ARG_TYPES (TREE_TYPE (gs_info->decl)); tree rettype = TREE_TYPE (TREE_TYPE (gs_info->decl)); tree srctype = TREE_VALUE (arglist); arglist = TREE_CHAIN (arglist); @@ -9483,13 +9503,8 @@ vectorizable_load (vec_info *vinfo, if (memory_access_type == VMAT_GATHER_SCATTER && gs_info.decl) { - if (costing_p) - vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type, - alignment_support_scheme, misalignment, &gs_info, - slp_node, cost_vec); - else - vect_build_gather_load_calls (vinfo, stmt_info, gsi, vec_stmt, &gs_info, - mask); + vect_build_gather_load_calls (vinfo, stmt_info, gsi, vec_stmt, &gs_info, + mask, cost_vec); return true; } From patchwork Tue Jun 13 02:03:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 107017 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp251049vqr; Mon, 12 Jun 2023 19:05:32 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5Ym5/5ET+RUUW7bUae6ycTh7GdH0Od8QrZg1iCO8jjp2RfqD3fTtAPfGn+bJ2G+cvTdrYC X-Received: by 2002:a17:907:2d86:b0:96a:ee54:9f19 with SMTP id gt6-20020a1709072d8600b0096aee549f19mr12346657ejc.48.1686621931831; Mon, 12 Jun 2023 19:05:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686621931; cv=none; d=google.com; s=arc-20160816; b=QeLw9M7Svh24K7UZ1FwfgOemwfTFrxxrY/1jGgD0uSg33b00KDTeCTlay+6WBUcwY+ eT163ydC7Y5CuTnM6PRPjmt4t33t5xuQVT6V+wJlLpziWPQygbcLP2zDXKAOx1+j3xQO 5iTIwTIGSpbtqSHBJUpnQasbgBv7H4QY1e7BBbX4CBwdW4Wd7UUSR40W9KszoUig+kZY 
H3ZceTt29ccxfkgl144gUiYhfBX8tUiqsipXglStgFXaVUIv/ZhPnKpNPrmfdCrgaYIg C8jHPd0+Hf6yMtQbSjKxm8tH0mKmSLW5NLEahkUge4cxnBTSmTKROEM5s3nKj/1Cy4C/ AK2Q== Received: from trout.aus.stglabs.ibm.com (unknown [9.40.194.100]) by smtpav04.fra02v.mail.ibm.com (Postfix) with ESMTP; Tue, 13 Jun 2023 02:03:44 +0000 (GMT) To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, bergner@linux.ibm.com Subject: [PATCH 3/9] vect: Adjust vectorizable_load costing on VMAT_INVARIANT Date: Mon, 12 Jun 2023 21:03:24 -0500 Message-Id: 
X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: SKYxW2H_i4FvV8m2LrYlsFHky2p7KqAY X-Proofpoint-GUID: 3Pp6oZ4jh4EjJbbvEbk4PZOxCmtFUejG X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.573,FMLib:17.11.176.26 definitions=2023-06-12_18,2023-06-12_02,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 impostorscore=0 suspectscore=0 spamscore=0 adultscore=0 lowpriorityscore=0 mlxscore=0 priorityscore=1501 malwarescore=0 clxscore=1015 mlxlogscore=999 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2305260000 definitions=main-2306130016 X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Kewen Lin via Gcc-patches From: "Kewen.Lin" Reply-To: Kewen Lin Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768551278629007679?= X-GMAIL-MSGID: =?utf-8?q?1768551278629007679?= This patch adjusts the cost handling on VMAT_INVARIANT in function vectorizable_load. We don't call function vect_model_load_cost for it any more. To make the costing on VMAT_INVARIANT better, this patch is to query hoist_defs_of_uses for hoisting decision, and add costs for different "where" based on it. 
Currently function hoist_defs_of_uses always hoists the defs of all SSA
uses, so this patch adds an argument HOIST_P to avoid the actual
hoisting during the costing phase.

gcc/ChangeLog:

        * tree-vect-stmts.cc (hoist_defs_of_uses): Add one argument HOIST_P.
        (vectorizable_load): Adjust the handling on VMAT_INVARIANT to respect
        hoisting decision and without calling vect_model_load_cost.
        (vect_model_load_cost): Assert it won't get VMAT_INVARIANT any more
        and remove VMAT_INVARIANT related handlings.
---
 gcc/tree-vect-stmts.cc | 61 +++++++++++++++++++++++++++---------------
 1 file changed, 39 insertions(+), 22 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 744cdf40e26..19c61d703c8 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1135,7 +1135,8 @@ vect_model_load_cost (vec_info *vinfo,
                       slp_tree slp_node,
                       stmt_vector_for_cost *cost_vec)
 {
-  gcc_assert (memory_access_type != VMAT_GATHER_SCATTER || !gs_info->decl);
+  gcc_assert ((memory_access_type != VMAT_GATHER_SCATTER || !gs_info->decl)
+              && memory_access_type != VMAT_INVARIANT);
 
   unsigned int inside_cost = 0, prologue_cost = 0;
   bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info);
@@ -1238,16 +1239,6 @@ vect_model_load_cost (vec_info *vinfo,
                                        ncopies * assumed_nunits,
                                        scalar_load, stmt_info, 0, vect_body);
     }
-  else if (memory_access_type == VMAT_INVARIANT)
-    {
-      /* Invariant loads will ideally be hoisted and splat to a vector.  */
-      prologue_cost += record_stmt_cost (cost_vec, 1,
-                                         scalar_load, stmt_info, 0,
-                                         vect_prologue);
-      prologue_cost += record_stmt_cost (cost_vec, 1,
-                                         scalar_to_vec, stmt_info, 0,
-                                         vect_prologue);
-    }
   else
     vect_get_load_cost (vinfo, stmt_info, ncopies,
                         alignment_support_scheme, misalignment, first_stmt_p,
@@ -9121,10 +9112,11 @@ permute_vec_elements (vec_info *vinfo,
 /* Hoist the definitions of all SSA uses on STMT_INFO out of the loop LOOP,
    inserting them on the loops preheader edge.  Returns true if we
    were successful in doing so (and thus STMT_INFO can be moved then),
-   otherwise returns false.  */
+   otherwise returns false.  HOIST_P indicates if we want to hoist the
+   definitions of all SSA uses, it would be false when we are costing.  */
 
 static bool
-hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop)
+hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop, bool hoist_p)
 {
   ssa_op_iter i;
   tree op;
@@ -9158,6 +9150,9 @@ hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop)
   if (!any)
     return true;
 
+  if (!hoist_p)
+    return true;
+
   FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
     {
       gimple *def_stmt = SSA_NAME_DEF_STMT (op);
@@ -9510,14 +9505,6 @@ vectorizable_load (vec_info *vinfo,
 
   if (memory_access_type == VMAT_INVARIANT)
     {
-      if (costing_p)
-        {
-          vect_model_load_cost (vinfo, stmt_info, ncopies, vf,
-                                memory_access_type, alignment_support_scheme,
-                                misalignment, &gs_info, slp_node, cost_vec);
-          return true;
-        }
-
       gcc_assert (!grouped_load && !mask && !bb_vinfo);
       /* If we have versioned for aliasing or the loop doesn't
          have any data dependencies that would preclude this,
@@ -9525,7 +9512,37 @@ vectorizable_load (vec_info *vinfo,
          thus we can insert it on the preheader edge.  */
       bool hoist_p = (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)
                       && !nested_in_vect_loop
-                      && hoist_defs_of_uses (stmt_info, loop));
+                      && hoist_defs_of_uses (stmt_info, loop, !costing_p));
+      if (costing_p)
+        {
+          if (hoist_p)
+            {
+              unsigned int prologue_cost;
+              prologue_cost = record_stmt_cost (cost_vec, 1, scalar_load,
+                                                stmt_info, 0, vect_prologue);
+              prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec,
+                                                 stmt_info, 0, vect_prologue);
+              if (dump_enabled_p ())
+                dump_printf_loc (MSG_NOTE, vect_location,
+                                 "vect_model_load_cost: inside_cost = 0, "
+                                 "prologue_cost = %d .\n",
+                                 prologue_cost);
+            }
+          else
+            {
+              unsigned int inside_cost;
+              inside_cost = record_stmt_cost (cost_vec, 1, scalar_load,
+                                              stmt_info, 0, vect_body);
+              inside_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec,
+                                               stmt_info, 0, vect_body);
+              if (dump_enabled_p ())
+                dump_printf_loc (MSG_NOTE, vect_location,
+                                 "vect_model_load_cost: inside_cost = %d, "
+                                 "prologue_cost = 0 .\n",
+                                 inside_cost);
+            }
+          return true;
+        }
       if (hoist_p)
         {
           gassign *stmt = as_a <gassign *> (stmt_info->stmt);

From patchwork Tue Jun 13 02:03:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 107022
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com,
 segher@kernel.crashing.org, bergner@linux.ibm.com
Subject: [PATCH 4/9] vect: Adjust vectorizable_load costing on
 VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
Date: Mon, 12 Jun 2023 21:03:25 -0500
Message-Id: <0281a2a022869efe379130aea6e0782e4827ef61.1686573640.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Kewen Lin via Gcc-patches
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling on VMAT_ELEMENTWISE and
VMAT_STRIDED_SLP in function vectorizable_load.  We don't call function
vect_model_load_cost for them any more.
As PR82255 shows, we don't always need a vector construction there;
moving the costing next to the transform lets us cost the vector
construction only when it's actually needed.  Besides, it counts the
number of loads consistently in some cases.

        PR tree-optimization/82255

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_load): Adjust the cost handling
        on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP without calling
        vect_model_load_cost.
        (vect_model_load_cost): Assert it won't get VMAT_ELEMENTWISE and
        VMAT_STRIDED_SLP any more, and remove their related handlings.

gcc/testsuite/ChangeLog:

        * gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c: New test.

2023-06-13  Bill Schmidt
            Kewen Lin
---
 .../vect/costmodel/ppc/costmodel-pr82255.c    |  31 ++++
 gcc/tree-vect-stmts.cc                        | 170 +++++++++++-------
 2 files changed, 134 insertions(+), 67 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c
new file mode 100644
index 00000000000..9317ee2e15b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+/* PR82255: Ensure we don't require a vec_construct cost when we aren't
+   going to generate a strided load.  */
+
+extern int abs (int __x) __attribute__ ((__nothrow__, __leaf__))
+__attribute__ ((__const__));
+
+static int
+foo (unsigned char *w, int i, unsigned char *x, int j)
+{
+  int tot = 0;
+  for (int a = 0; a < 16; a++)
+    {
+#pragma GCC unroll 16
+      for (int b = 0; b < 16; b++)
+        tot += abs (w[b] - x[b]);
+      w += i;
+      x += j;
+    }
+  return tot;
+}
+
+void
+bar (unsigned char *w, unsigned char *x, int i, int *result)
+{
+  *result = foo (w, 16, x, i);
+}
+
+/* { dg-final { scan-tree-dump-times "vec_construct" 0 "vect" } } */
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 19c61d703c8..651dc800380 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1136,7 +1136,9 @@ vect_model_load_cost (vec_info *vinfo,
                       stmt_vector_for_cost *cost_vec)
 {
   gcc_assert ((memory_access_type != VMAT_GATHER_SCATTER || !gs_info->decl)
-              && memory_access_type != VMAT_INVARIANT);
+              && memory_access_type != VMAT_INVARIANT
+              && memory_access_type != VMAT_ELEMENTWISE
+              && memory_access_type != VMAT_STRIDED_SLP);
 
   unsigned int inside_cost = 0, prologue_cost = 0;
   bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info);
@@ -1221,8 +1223,7 @@ vect_model_load_cost (vec_info *vinfo,
     }
 
   /* The loads themselves.  */
-  if (memory_access_type == VMAT_ELEMENTWISE
-      || memory_access_type == VMAT_GATHER_SCATTER)
+  if (memory_access_type == VMAT_GATHER_SCATTER)
     {
       tree vectype = STMT_VINFO_VECTYPE (stmt_info);
       unsigned int assumed_nunits = vect_nunits_for_cost (vectype);
@@ -1244,10 +1245,10 @@ vect_model_load_cost (vec_info *vinfo,
                         alignment_support_scheme, misalignment, first_stmt_p,
                         &inside_cost, &prologue_cost,
                         cost_vec, cost_vec, true);
-  if (memory_access_type == VMAT_ELEMENTWISE
-      || memory_access_type == VMAT_STRIDED_SLP
-      || (memory_access_type == VMAT_GATHER_SCATTER
-          && gs_info->ifn == IFN_LAST && !gs_info->decl))
+
+  if (memory_access_type == VMAT_GATHER_SCATTER
+      && gs_info->ifn == IFN_LAST
+      && !gs_info->decl)
     inside_cost += record_stmt_cost (cost_vec, ncopies, vec_construct,
                                      stmt_info, 0, vect_body);
 
@@ -9591,14 +9592,6 @@ vectorizable_load (vec_info *vinfo,
   if (memory_access_type == VMAT_ELEMENTWISE
       || memory_access_type == VMAT_STRIDED_SLP)
     {
-      if (costing_p)
-        {
-          vect_model_load_cost (vinfo, stmt_info, ncopies, vf,
-                                memory_access_type, alignment_support_scheme,
-                                misalignment, &gs_info, slp_node, cost_vec);
-          return true;
-        }
-
       gimple_stmt_iterator incr_gsi;
       bool insert_after;
       tree offvar;
@@ -9610,6 +9603,7 @@ vectorizable_load (vec_info *vinfo,
       unsigned int const_nunits = nunits.to_constant ();
       unsigned HOST_WIDE_INT cst_offset = 0;
       tree dr_offset;
+      unsigned int inside_cost = 0;
 
       gcc_assert (!LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo));
       gcc_assert (!nested_in_vect_loop);
@@ -9624,6 +9618,7 @@ vectorizable_load (vec_info *vinfo,
           first_stmt_info = stmt_info;
           first_dr_info = dr_info;
         }
+
       if (slp && grouped_load)
         {
           group_size = DR_GROUP_SIZE (first_stmt_info);
@@ -9640,43 +9635,44 @@ vectorizable_load (vec_info *vinfo,
           ref_type = reference_alias_ptr_type (DR_REF (dr_info->dr));
         }
 
-      dr_offset = get_dr_vinfo_offset (vinfo, first_dr_info);
-      stride_base
-        = fold_build_pointer_plus
-            (DR_BASE_ADDRESS (first_dr_info->dr),
-             size_binop (PLUS_EXPR,
-                         convert_to_ptrofftype (dr_offset),
-                         convert_to_ptrofftype (DR_INIT (first_dr_info->dr))));
-      stride_step = fold_convert (sizetype, DR_STEP (first_dr_info->dr));
+      if (!costing_p)
+        {
+          dr_offset = get_dr_vinfo_offset (vinfo, first_dr_info);
+          stride_base = fold_build_pointer_plus (
+            DR_BASE_ADDRESS (first_dr_info->dr),
+            size_binop (PLUS_EXPR, convert_to_ptrofftype (dr_offset),
+                        convert_to_ptrofftype (DR_INIT (first_dr_info->dr))));
+          stride_step = fold_convert (sizetype, DR_STEP (first_dr_info->dr));
 
-      /* For a load with loop-invariant (but other than power-of-2)
-         stride (i.e. not a grouped access) like so:
+          /* For a load with loop-invariant (but other than power-of-2)
+             stride (i.e. not a grouped access) like so:
 
-           for (i = 0; i < n; i += stride)
-             ... = array[i];
+               for (i = 0; i < n; i += stride)
+                 ... = array[i];
 
-         we generate a new induction variable and new accesses to
-         form a new vector (or vectors, depending on ncopies):
+             we generate a new induction variable and new accesses to
+             form a new vector (or vectors, depending on ncopies):
 
-           for (j = 0; ; j += VF*stride)
-             tmp1 = array[j];
-             tmp2 = array[j + stride];
-             ...
-             vectemp = {tmp1, tmp2, ...}
-         */
+               for (j = 0; ; j += VF*stride)
+                 tmp1 = array[j];
+                 tmp2 = array[j + stride];
+                 ...
+                 vectemp = {tmp1, tmp2, ...}
+             */
 
-      ivstep = fold_build2 (MULT_EXPR, TREE_TYPE (stride_step), stride_step,
-                            build_int_cst (TREE_TYPE (stride_step), vf));
+          ivstep = fold_build2 (MULT_EXPR, TREE_TYPE (stride_step), stride_step,
+                                build_int_cst (TREE_TYPE (stride_step), vf));
 
-      standard_iv_increment_position (loop, &incr_gsi, &insert_after);
+          standard_iv_increment_position (loop, &incr_gsi, &insert_after);
 
-      stride_base = cse_and_gimplify_to_preheader (loop_vinfo, stride_base);
-      ivstep = cse_and_gimplify_to_preheader (loop_vinfo, ivstep);
-      create_iv (stride_base, PLUS_EXPR, ivstep, NULL,
-                 loop, &incr_gsi, insert_after,
-                 &offvar, NULL);
+          stride_base = cse_and_gimplify_to_preheader (loop_vinfo, stride_base);
+          ivstep = cse_and_gimplify_to_preheader (loop_vinfo, ivstep);
+          create_iv (stride_base, PLUS_EXPR, ivstep, NULL,
+                     loop, &incr_gsi, insert_after,
+                     &offvar, NULL);
 
-      stride_step = cse_and_gimplify_to_preheader (loop_vinfo, stride_step);
+          stride_step = cse_and_gimplify_to_preheader (loop_vinfo, stride_step);
+        }
 
       running_off = offvar;
       alias_off = build_int_cst (ref_type, 0);
@@ -9743,11 +9739,23 @@ vectorizable_load (vec_info *vinfo,
       unsigned int n_groups = 0;
       for (j = 0; j < ncopies; j++)
         {
-          if (nloads > 1)
+          if (nloads > 1 && !costing_p)
             vec_alloc (v, nloads);
           gimple *new_stmt = NULL;
           for (i = 0; i < nloads; i++)
             {
+              if (costing_p)
+                {
+                  if (VECTOR_TYPE_P (ltype))
+                    vect_get_load_cost (vinfo, stmt_info, 1,
+                                        alignment_support_scheme, misalignment,
+                                        false, &inside_cost, nullptr, cost_vec,
+                                        cost_vec, true);
+                  else
+                    inside_cost += record_stmt_cost (cost_vec, 1, scalar_load,
+                                                     stmt_info, 0, vect_body);
+                  continue;
+                }
               tree this_off = build_int_cst (TREE_TYPE (alias_off),
                                              group_el * elsz + cst_offset);
               tree data_ref = build2 (MEM_REF, ltype, running_off, this_off);
@@ -9778,42 +9786,70 @@ vectorizable_load (vec_info *vinfo,
                   group_el = 0;
                 }
             }
+
           if (nloads > 1)
             {
-              tree vec_inv = build_constructor (lvectype, v);
-              new_temp = vect_init_vector (vinfo, stmt_info,
-                                           vec_inv, lvectype, gsi);
-              new_stmt = SSA_NAME_DEF_STMT (new_temp);
-              if (lvectype != vectype)
+              if (costing_p)
+                inside_cost += record_stmt_cost (cost_vec, 1, vec_construct,
+                                                 stmt_info, 0, vect_body);
+              else
                 {
-                  new_stmt = gimple_build_assign (make_ssa_name (vectype),
-                                                  VIEW_CONVERT_EXPR,
-                                                  build1 (VIEW_CONVERT_EXPR,
-                                                          vectype, new_temp));
-                  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+                  tree vec_inv = build_constructor (lvectype, v);
+                  new_temp = vect_init_vector (vinfo, stmt_info, vec_inv,
+                                               lvectype, gsi);
+                  new_stmt = SSA_NAME_DEF_STMT (new_temp);
+                  if (lvectype != vectype)
+                    {
+                      new_stmt
+                        = gimple_build_assign (make_ssa_name (vectype),
+                                               VIEW_CONVERT_EXPR,
+                                               build1 (VIEW_CONVERT_EXPR,
+                                                       vectype, new_temp));
+                      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt,
+                                                   gsi);
+                    }
                 }
             }
 
-          if (slp)
+          if (!costing_p)
             {
-              if (slp_perm)
-                dr_chain.quick_push (gimple_assign_lhs (new_stmt));
+              if (slp)
+                {
+                  if (slp_perm)
+                    dr_chain.quick_push (gimple_assign_lhs (new_stmt));
+                  else
+                    SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
+                }
               else
-                SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
-            }
-          else
-            {
-              if (j == 0)
-                *vec_stmt = new_stmt;
-              STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
+                {
+                  if (j == 0)
+                    *vec_stmt = new_stmt;
+                  STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
+                }
             }
         }
       if (slp_perm)
         {
           unsigned n_perms;
-          vect_transform_slp_perm_load (vinfo, slp_node, dr_chain, gsi, vf,
-                                        false, &n_perms);
+          if (costing_p)
+            {
+              unsigned n_loads;
+              vect_transform_slp_perm_load (vinfo, slp_node, vNULL, NULL, vf,
+                                            true, &n_perms, &n_loads);
+              inside_cost += record_stmt_cost (cost_vec, n_perms, vec_perm,
+                                               first_stmt_info, 0, vect_body);
+            }
+          else
+            vect_transform_slp_perm_load (vinfo, slp_node, dr_chain, gsi, vf,
+                                          false, &n_perms);
         }
+
+      if (costing_p && dump_enabled_p ())
+        dump_printf_loc (MSG_NOTE, vect_location,
+                         "vect_model_load_cost: inside_cost = %u, "
+                         "prologue_cost = 0 .\n",
+                         inside_cost);
+
       return true;
     }
 

From patchwork Tue Jun 13 02:03:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 107023
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com,
 segher@kernel.crashing.org, bergner@linux.ibm.com
Subject: [PATCH 5/9] vect: Adjust vectorizable_load costing on
 VMAT_GATHER_SCATTER
Date: Mon, 12 Jun 2023 21:03:26 -0500
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Kewen Lin via Gcc-patches
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling on VMAT_GATHER_SCATTER in function
vectorizable_load.  We don't call function vect_model_load_cost for it
any more.  It's mainly for gather loads with IFN or emulated gather
loads, and it follows the handling in function vect_model_load_cost.
This patch shouldn't cause any functional changes.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_load): Adjust the cost handling on
	VMAT_GATHER_SCATTER without calling vect_model_load_cost.
	(vect_model_load_cost): Adjust the assertion on VMAT_GATHER_SCATTER,
	remove VMAT_GATHER_SCATTER related handlings and the related parameter
	gs_info.
---
 gcc/tree-vect-stmts.cc | 123 +++++++++++++++++++++++++----------------
 1 file changed, 75 insertions(+), 48 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 651dc800380..a3fd0bf879e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1131,11 +1131,10 @@ vect_model_load_cost (vec_info *vinfo,
                       vect_memory_access_type memory_access_type,
                       dr_alignment_support alignment_support_scheme,
                       int misalignment,
-                      gather_scatter_info *gs_info,
                       slp_tree slp_node,
                       stmt_vector_for_cost *cost_vec)
 {
-  gcc_assert ((memory_access_type != VMAT_GATHER_SCATTER || !gs_info->decl)
+  gcc_assert (memory_access_type != VMAT_GATHER_SCATTER
               && memory_access_type != VMAT_INVARIANT
               && memory_access_type != VMAT_ELEMENTWISE
               && memory_access_type != VMAT_STRIDED_SLP);
@@ -1222,35 +1221,9 @@ vect_model_load_cost (vec_info *vinfo,
                                        group_size);
     }
 
-  /* The loads themselves.  */
-  if (memory_access_type == VMAT_GATHER_SCATTER)
-    {
-      tree vectype = STMT_VINFO_VECTYPE (stmt_info);
-      unsigned int assumed_nunits = vect_nunits_for_cost (vectype);
-      if (memory_access_type == VMAT_GATHER_SCATTER
-          && gs_info->ifn == IFN_LAST && !gs_info->decl)
-        /* For emulated gathers N offset vector element extracts
-           (we assume the scalar scaling and ptr + offset add is consumed by
-           the load).  */
-        inside_cost += record_stmt_cost (cost_vec, ncopies * assumed_nunits,
-                                         vec_to_scalar, stmt_info, 0,
-                                         vect_body);
-      /* N scalar loads plus gathering them into a vector.  */
-      inside_cost += record_stmt_cost (cost_vec,
-                                       ncopies * assumed_nunits,
-                                       scalar_load, stmt_info, 0, vect_body);
-    }
-  else
-    vect_get_load_cost (vinfo, stmt_info, ncopies,
-                        alignment_support_scheme, misalignment, first_stmt_p,
-                        &inside_cost, &prologue_cost,
-                        cost_vec, cost_vec, true);
-
-  if (memory_access_type == VMAT_GATHER_SCATTER
-      && gs_info->ifn == IFN_LAST
-      && !gs_info->decl)
-    inside_cost += record_stmt_cost (cost_vec, ncopies, vec_construct,
-                                     stmt_info, 0, vect_body);
+  vect_get_load_cost (vinfo, stmt_info, ncopies, alignment_support_scheme,
+                      misalignment, first_stmt_p, &inside_cost, &prologue_cost,
+                      cost_vec, cost_vec, true);
 
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
@@ -10137,6 +10110,7 @@ vectorizable_load (vec_info *vinfo,
     }
   tree vec_mask = NULL_TREE;
   poly_uint64 group_elt = 0;
+  unsigned int inside_cost = 0;
   for (j = 0; j < ncopies; j++)
     {
       /* 1. Create the vector or array pointer update chain.  */
@@ -10268,23 +10242,25 @@ vectorizable_load (vec_info *vinfo,
           /* Record that VEC_ARRAY is now dead.  */
           vect_clobber_variable (vinfo, stmt_info, gsi, vec_array);
         }
-      else if (!costing_p)
+      else
         {
           for (i = 0; i < vec_num; i++)
             {
               tree final_mask = NULL_TREE;
-              if (loop_masks
-                  && memory_access_type != VMAT_INVARIANT)
-                final_mask = vect_get_loop_mask (gsi, loop_masks,
-                                                 vec_num * ncopies,
-                                                 vectype, vec_num * j + i);
-              if (vec_mask)
-                final_mask = prepare_vec_mask (loop_vinfo, mask_vectype,
-                                               final_mask, vec_mask, gsi);
-
-              if (i > 0 && !STMT_VINFO_GATHER_SCATTER_P (stmt_info))
-                dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr,
-                                               gsi, stmt_info, bump);
+              if (!costing_p)
+                {
+                  if (loop_masks && memory_access_type != VMAT_INVARIANT)
+                    final_mask
+                      = vect_get_loop_mask (gsi, loop_masks, vec_num * ncopies,
+                                            vectype, vec_num * j + i);
+                  if (vec_mask)
+                    final_mask = prepare_vec_mask (loop_vinfo, mask_vectype,
+                                                   final_mask, vec_mask, gsi);
+
+                  if (i > 0 && !STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+                    dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr,
+                                                   gsi, stmt_info, bump);
+                }
 
               /* 2. Create the vector-load in the loop.  */
               switch (alignment_support_scheme)
@@ -10298,6 +10274,16 @@ vectorizable_load (vec_info *vinfo,
                     if (memory_access_type == VMAT_GATHER_SCATTER
                         && gs_info.ifn != IFN_LAST)
                       {
+                        if (costing_p)
+                          {
+                            unsigned int cnunits
+                              = vect_nunits_for_cost (vectype);
+                            inside_cost
+                              = record_stmt_cost (cost_vec, cnunits,
+                                                  scalar_load, stmt_info, 0,
+                                                  vect_body);
+                            goto vec_num_loop_costing_end;
+                          }
                         if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
                           vec_offset = vec_offsets[vec_num * j + i];
                         tree zero = build_zero_cst (vectype);
@@ -10322,6 +10308,25 @@ vectorizable_load (vec_info *vinfo,
                         gcc_assert (!final_mask);
                         unsigned HOST_WIDE_INT const_nunits
                           = nunits.to_constant ();
+                        if (costing_p)
+                          {
+                            /* For emulated gathers N offset vector element
+                               offset add is consumed by the load).  */
+                            inside_cost
+                              = record_stmt_cost (cost_vec, const_nunits,
+                                                  vec_to_scalar, stmt_info, 0,
+                                                  vect_body);
+                            /* N scalar loads plus gathering them into a
+                               vector.  */
+                            inside_cost
+                              = record_stmt_cost (cost_vec, const_nunits,
+                                                  scalar_load, stmt_info, 0,
+                                                  vect_body);
+                            inside_cost
+                              = record_stmt_cost (cost_vec, 1, vec_construct,
+                                                  stmt_info, 0, vect_body);
+                            goto vec_num_loop_costing_end;
+                          }
                         unsigned HOST_WIDE_INT const_offset_nunits
                           = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype)
                               .to_constant ();
@@ -10374,6 +10379,9 @@ vectorizable_load (vec_info *vinfo,
                         break;
                       }
 
+                    if (costing_p)
+                      goto vec_num_loop_costing_end;
+
                     align =
                       known_alignment (DR_TARGET_ALIGNMENT (first_dr_info));
                     if (alignment_support_scheme == dr_aligned)
@@ -10544,6 +10552,8 @@ vectorizable_load (vec_info *vinfo,
                   }
                 case dr_explicit_realign:
                   {
+                    if (costing_p)
+                      goto vec_num_loop_costing_end;
                     tree ptr, bump;
 
                     tree vs = size_int (TYPE_VECTOR_SUBPARTS (vectype));
@@ -10606,6 +10616,8 @@ vectorizable_load (vec_info *vinfo,
                   }
                 case dr_explicit_realign_optimized:
                   {
+                    if (costing_p)
+                      goto vec_num_loop_costing_end;
                     if (TREE_CODE (dataref_ptr) == SSA_NAME)
                       new_temp = copy_ssa_name (dataref_ptr);
                     else
@@ -10702,10 +10714,14 @@ vectorizable_load (vec_info *vinfo,
                                                  gsi, stmt_info, bump);
                   group_elt = 0;
                 }
+vec_num_loop_costing_end:
+              ;
             }
           /* Bump the vector pointer to account for a gap or for excess
              elements loaded for a permuted SLP load.  */
-          if (maybe_ne (group_gap_adj, 0U) && slp_perm)
+          if (!costing_p
+              && maybe_ne (group_gap_adj, 0U)
+              && slp_perm)
             {
               poly_wide_int bump_val
                 = (wi::to_wide (TYPE_SIZE_UNIT (elem_type))
@@ -10754,9 +10770,20 @@ vectorizable_load (vec_info *vinfo,
     *vec_stmt = STMT_VINFO_VEC_STMTS (stmt_info)[0];
 
   if (costing_p)
-    vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type,
-                          alignment_support_scheme, misalignment, &gs_info,
-                          slp_node, cost_vec);
+    {
+      if (memory_access_type == VMAT_GATHER_SCATTER)
+        {
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "vect_model_load_cost: inside_cost = %u, "
+                             "prologue_cost = 0 .\n",
+                             inside_cost);
+        }
+      else
+        vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type,
+                              alignment_support_scheme, misalignment, slp_node,
+                              cost_vec);
+    }
 
   return true;
 }

From patchwork Tue Jun 13 02:03:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 107018
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, bergner@linux.ibm.com
Subject: [PATCH 6/9] vect: Adjust vectorizable_load costing on VMAT_LOAD_STORE_LANES
Date: Mon, 12 Jun 2023 21:03:27 -0500
Message-Id: <1a263aa46335ad08c0cd198b4c2075560a3ed44d.1686573640.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling on VMAT_LOAD_STORE_LANES in function
vectorizable_load.  We don't call function vect_model_load_cost for it
any more.  It follows what we do in function vect_model_load_cost, and
shouldn't cause any functional changes.
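
For readers following the counting, here is a minimal standalone sketch, not GCC code, of how many vector loads get costed for a load-lanes access under the scheme described above.  GroupMember, unused_lane_vectors and load_lanes_loads_costed are hypothetical names that stand in for the group chain walked via DR_GROUP_NEXT_ELEMENT and for the per-copy calls to vect_get_load_cost; the sketch counts loads only and ignores the per-load cost kind.

```cpp
#include <cassert>

// Hypothetical stand-in for one statement of an interleaved load group;
// `next` plays the role of DR_GROUP_NEXT_ELEMENT.
struct GroupMember
{
  const GroupMember *next;
};

// Vectors that a load-lanes instruction produces but that no statement in
// the group consumes: the group size minus the number of chained members.
unsigned
unused_lane_vectors (unsigned group_size, const GroupMember *first)
{
  unsigned gaps = group_size;
  for (const GroupMember *m = first; m; m = m->next)
    {
      assert (gaps > 0); // a chain cannot be longer than its group
      gaps -= 1;
    }
  return gaps;
}

// Number of vector loads costed for one statement over all copies.  Only
// the first statement of a group also pays for the unused vectors.
unsigned
load_lanes_loads_costed (unsigned ncopies, unsigned group_size,
                         const GroupMember *first, bool is_first_stmt)
{
  unsigned per_copy = 1;
  if (is_first_stmt)
    per_copy += unused_lane_vectors (group_size, first);
  return ncopies * per_copy;
}
```

With a group of size 4 that has three chained members and two copies, the first statement is charged 2 * (1 + 1) loads and every other statement 2 * 1, which matches charging ncopies * gaps once plus ncopies loads per statement as the removed code did.
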

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_load): Adjust the cost handling on
	VMAT_LOAD_STORE_LANES without calling vect_model_load_cost.
	(vect_model_load_cost): Remove VMAT_LOAD_STORE_LANES related handling
	and assert it will never get VMAT_LOAD_STORE_LANES.
---
 gcc/tree-vect-stmts.cc | 73 ++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 31 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a3fd0bf879e..4c5ce2ab278 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1137,7 +1137,8 @@ vect_model_load_cost (vec_info *vinfo,
   gcc_assert (memory_access_type != VMAT_GATHER_SCATTER
               && memory_access_type != VMAT_INVARIANT
               && memory_access_type != VMAT_ELEMENTWISE
-              && memory_access_type != VMAT_STRIDED_SLP);
+              && memory_access_type != VMAT_STRIDED_SLP
+              && memory_access_type != VMAT_LOAD_STORE_LANES);
 
   unsigned int inside_cost = 0, prologue_cost = 0;
   bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info);
@@ -1176,31 +1177,6 @@ vect_model_load_cost (vec_info *vinfo,
      once per group anyhow.  */
   bool first_stmt_p = (first_stmt_info == stmt_info);
 
-  /* An IFN_LOAD_LANES will load all its vector results, regardless of which
-     ones we actually need.  Account for the cost of unused results.  */
-  if (first_stmt_p && !slp_node && memory_access_type == VMAT_LOAD_STORE_LANES)
-    {
-      unsigned int gaps = DR_GROUP_SIZE (first_stmt_info);
-      stmt_vec_info next_stmt_info = first_stmt_info;
-      do
-        {
-          gaps -= 1;
-          next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
-        }
-      while (next_stmt_info);
-      if (gaps)
-        {
-          if (dump_enabled_p ())
-            dump_printf_loc (MSG_NOTE, vect_location,
-                             "vect_model_load_cost: %d unused vectors.\n",
-                             gaps);
-          vect_get_load_cost (vinfo, stmt_info, ncopies * gaps,
-                              alignment_support_scheme, misalignment, false,
-                              &inside_cost, &prologue_cost,
-                              cost_vec, cost_vec, true);
-        }
-    }
-
   /* We assume that the cost of a single load-lanes instruction is
      equivalent to the cost of DR_GROUP_SIZE separate loads.  If a grouped
      access is instead being provided by a load-and-permute operation,
@@ -10110,7 +10086,7 @@ vectorizable_load (vec_info *vinfo,
     }
   tree vec_mask = NULL_TREE;
   poly_uint64 group_elt = 0;
-  unsigned int inside_cost = 0;
+  unsigned int inside_cost = 0, prologue_cost = 0;
   for (j = 0; j < ncopies; j++)
     {
       /* 1. Create the vector or array pointer update chain.  */
@@ -10190,8 +10166,42 @@ vectorizable_load (vec_info *vinfo,
         dr_chain.create (vec_num);
 
       gimple *new_stmt = NULL;
-      if (memory_access_type == VMAT_LOAD_STORE_LANES && !costing_p)
+      if (memory_access_type == VMAT_LOAD_STORE_LANES)
         {
+          if (costing_p)
+            {
+              /* An IFN_LOAD_LANES will load all its vector results,
+                 regardless of which ones we actually need.  Account
+                 for the cost of unused results.  */
+              if (grouped_load && first_stmt_info == stmt_info)
+                {
+                  unsigned int gaps = DR_GROUP_SIZE (first_stmt_info);
+                  stmt_vec_info next_stmt_info = first_stmt_info;
+                  do
+                    {
+                      gaps -= 1;
+                      next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
+                    }
+                  while (next_stmt_info);
+                  if (gaps)
+                    {
+                      if (dump_enabled_p ())
+                        dump_printf_loc (MSG_NOTE, vect_location,
+                                         "vect_model_load_cost: %d "
+                                         "unused vectors.\n",
+                                         gaps);
+                      vect_get_load_cost (vinfo, stmt_info, gaps,
+                                          alignment_support_scheme,
+                                          misalignment, false, &inside_cost,
+                                          &prologue_cost, cost_vec, cost_vec,
+                                          true);
+                    }
+                }
+              vect_get_load_cost (vinfo, stmt_info, 1, alignment_support_scheme,
+                                  misalignment, false, &inside_cost,
+                                  &prologue_cost, cost_vec, cost_vec, true);
+              continue;
+            }
           tree vec_array;
 
           vec_array = create_vector_array (vectype, vec_num);
@@ -10771,13 +10781,14 @@ vec_num_loop_costing_end:
 
   if (costing_p)
     {
-      if (memory_access_type == VMAT_GATHER_SCATTER)
+      if (memory_access_type == VMAT_GATHER_SCATTER
+          || memory_access_type == VMAT_LOAD_STORE_LANES)
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_NOTE, vect_location,
                              "vect_model_load_cost: inside_cost = %u, "
-                             "prologue_cost = 0 .\n",
-                             inside_cost);
+                             "prologue_cost = %u .\n",
+                             inside_cost, prologue_cost);
         }
       else
         vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type,

From patchwork Tue Jun 13 02:03:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 107024
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, bergner@linux.ibm.com
Subject: [PATCH 7/9] vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_REVERSE
Date: Mon, 12 Jun 2023 21:03:28 -0500
X-Mailer: git-send-email 2.31.1
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling on VMAT_CONTIGUOUS_REVERSE in function
vectorizable_load.  We don't call function vect_model_load_cost for it
any more.

With this change we no longer miscount a required vector permutation, as
the associated test case shows.
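
As a rough illustration of the counting involved, here is a minimal standalone sketch, not GCC code, of what gets recorded for a reversed contiguous load: one vector load plus one reversing permute per generated vector.  CostTally and cost_reverse_load are hypothetical names; in the patch the two records are made through vect_get_load_cost and record_stmt_cost with vec_perm.

```cpp
// Hypothetical tally of the statement kinds recorded while costing.
struct CostTally
{
  unsigned vector_loads = 0;
  unsigned vec_perms = 0;
};

// Cost a VMAT_CONTIGUOUS_REVERSE load: each of the ncopies * vec_num
// vector loads is followed by one permute that reverses its elements.
CostTally
cost_reverse_load (unsigned ncopies, unsigned vec_num)
{
  CostTally tally;
  for (unsigned j = 0; j < ncopies; j++)
    for (unsigned i = 0; i < vec_num; i++)
      {
        tally.vector_loads += 1; // the load itself
        tally.vec_perms += 1;    // the reversing permute
      }
  return tally;
}
```

Walking the same (j, i) iteration space as the transform keeps the permute count tied to the number of loads actually generated, which is the property the new test checks by scanning the dump for vec_perm.
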

gcc/ChangeLog:

	* tree-vect-stmts.cc (vect_model_load_cost): Assert it won't get
	VMAT_CONTIGUOUS_REVERSE any more.
	(vectorizable_load): Adjust the costing handling on
	VMAT_CONTIGUOUS_REVERSE without calling vect_model_load_cost.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/ppc/costmodel-vect-reversed.c: New test.
---
 .../costmodel/ppc/costmodel-vect-reversed.c   |  22 ++++
 gcc/tree-vect-stmts.cc                        | 109 ++++++++++++------
 2 files changed, 93 insertions(+), 38 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-reversed.c

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-reversed.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-reversed.c
new file mode 100644
index 00000000000..651274be038
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-reversed.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-additional-options "-mvsx" } */
+
+/* Verify we do cost the required vec_perm.  */
+
+int x[1024], y[1024];
+
+void
+foo ()
+{
+  for (int i = 0; i < 512; ++i)
+    {
+      x[2 * i] = y[1023 - (2 * i)];
+      x[2 * i + 1] = y[1023 - (2 * i + 1)];
+    }
+}
+/* The reason why it doesn't check the exact count is that
+   retrying for the epilogue with partial vector capability
+   like Power10 can result in more than 1 vec_perm.  */
+/* { dg-final { scan-tree-dump {\mvec_perm\M} "vect" } } */
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 4c5ce2ab278..7f8d9db5363 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1134,11 +1134,8 @@ vect_model_load_cost (vec_info *vinfo,
                       slp_tree slp_node,
                       stmt_vector_for_cost *cost_vec)
 {
-  gcc_assert (memory_access_type != VMAT_GATHER_SCATTER
-              && memory_access_type != VMAT_INVARIANT
-              && memory_access_type != VMAT_ELEMENTWISE
-              && memory_access_type != VMAT_STRIDED_SLP
-              && memory_access_type != VMAT_LOAD_STORE_LANES);
+  gcc_assert (memory_access_type == VMAT_CONTIGUOUS
+              || memory_access_type == VMAT_CONTIGUOUS_PERMUTE);
 
   unsigned int inside_cost = 0, prologue_cost = 0;
   bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info);
@@ -10292,7 +10289,7 @@ vectorizable_load (vec_info *vinfo,
                               = record_stmt_cost (cost_vec, cnunits,
                                                   scalar_load, stmt_info, 0,
                                                   vect_body);
-                            goto vec_num_loop_costing_end;
+                            break;
                           }
                         if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
                           vec_offset = vec_offsets[vec_num * j + i];
@@ -10335,7 +10332,7 @@ vectorizable_load (vec_info *vinfo,
                             inside_cost
                               = record_stmt_cost (cost_vec, 1, vec_construct,
                                                   stmt_info, 0, vect_body);
-                            goto vec_num_loop_costing_end;
+                            break;
                           }
                         unsigned HOST_WIDE_INT const_offset_nunits
                           = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype)
@@ -10390,7 +10387,7 @@ vectorizable_load (vec_info *vinfo,
                       }
 
                     if (costing_p)
-                      goto vec_num_loop_costing_end;
+                      break;
 
                     align =
                       known_alignment (DR_TARGET_ALIGNMENT (first_dr_info));
@@ -10563,7 +10560,7 @@ vectorizable_load (vec_info *vinfo,
                 case dr_explicit_realign:
                   {
                     if (costing_p)
-                      goto vec_num_loop_costing_end;
+                      break;
                     tree ptr, bump;
 
                     tree vs = size_int (TYPE_VECTOR_SUBPARTS (vectype));
@@ -10627,7 +10624,7 @@ vectorizable_load (vec_info *vinfo,
                 case dr_explicit_realign_optimized:
                   {
                     if (costing_p)
-                      goto vec_num_loop_costing_end;
+                      break;
                     if (TREE_CODE (dataref_ptr) == SSA_NAME)
                       new_temp = copy_ssa_name (dataref_ptr);
                     else
@@ -10650,22 +10647,37 @@ vectorizable_load (vec_info *vinfo,
                 default:
                   gcc_unreachable ();
                 }
-              vec_dest = vect_create_destination_var (scalar_dest, vectype);
-              /* DATA_REF is null if we've already built the statement.  */
-              if (data_ref)
+
+              /* One common place to cost the above vect load for different
+                 alignment support schemes.  */
+              if (costing_p)
                 {
-                  vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr));
-                  new_stmt = gimple_build_assign (vec_dest, data_ref);
+                  if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)
+                    vect_get_load_cost (vinfo, stmt_info, 1,
+                                        alignment_support_scheme, misalignment,
+                                        false, &inside_cost, &prologue_cost,
+                                        cost_vec, cost_vec, true);
+                }
+              else
+                {
+                  vec_dest = vect_create_destination_var (scalar_dest, vectype);
+                  /* DATA_REF is null if we've already built the statement.  */
+                  if (data_ref)
+                    {
+                      vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr));
+                      new_stmt = gimple_build_assign (vec_dest, data_ref);
+                    }
+                  new_temp = make_ssa_name (vec_dest, new_stmt);
+                  gimple_set_lhs (new_stmt, new_temp);
+                  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
                 }
-              new_temp = make_ssa_name (vec_dest, new_stmt);
-              gimple_set_lhs (new_stmt, new_temp);
-              vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 
               /* 3. Handle explicit realignment if necessary/supported.
                  Create in loop:
                    vec_dest = realign_load (msq, lsq, realignment_token)  */
-              if (alignment_support_scheme == dr_explicit_realign_optimized
-                  || alignment_support_scheme == dr_explicit_realign)
+              if (!costing_p
+                  && (alignment_support_scheme == dr_explicit_realign_optimized
+                      || alignment_support_scheme == dr_explicit_realign))
                 {
                   lsq = gimple_assign_lhs (new_stmt);
                   if (!realignment_token)
@@ -10690,26 +10702,34 @@ vectorizable_load (vec_info *vinfo,
 
               if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)
                 {
-                  tree perm_mask = perm_mask_for_reverse (vectype);
-                  new_temp = permute_vec_elements (vinfo, new_temp, new_temp,
-                                                   perm_mask, stmt_info, gsi);
-                  new_stmt = SSA_NAME_DEF_STMT (new_temp);
+                  if (costing_p)
+                    inside_cost = record_stmt_cost (cost_vec, 1, vec_perm,
+                                                    stmt_info, 0, vect_body);
+                  else
+                    {
+                      tree perm_mask = perm_mask_for_reverse (vectype);
+                      new_temp
+                        = permute_vec_elements (vinfo, new_temp, new_temp,
+                                                perm_mask, stmt_info, gsi);
+                      new_stmt = SSA_NAME_DEF_STMT (new_temp);
+                    }
                 }
 
               /* Collect vector loads and later create their permutation in
                  vect_transform_grouped_load ().  */
-              if (grouped_load || slp_perm)
+              if (!costing_p && (grouped_load || slp_perm))
                 dr_chain.quick_push (new_temp);
 
               /* Store vector loads in the corresponding SLP_NODE.  */
-              if (slp && !slp_perm)
+              if (!costing_p && slp && !slp_perm)
                 SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
 
               /* With SLP permutation we load the gaps as well, without
                  we need to skip the gaps after we manage to fully load
                  all elements.  group_gap_adj is DR_GROUP_SIZE here.  */
               group_elt += nunits;
-              if (maybe_ne (group_gap_adj, 0U)
+              if (!costing_p
+                  && maybe_ne (group_gap_adj, 0U)
                   && !slp_perm
                   && known_eq (group_elt, group_size - group_gap_adj))
                 {
@@ -10724,8 +10744,6 @@ vectorizable_load (vec_info *vinfo,
                                                  gsi, stmt_info, bump);
                   group_elt = 0;
                 }
-vec_num_loop_costing_end:
-              ;
             }
           /* Bump the vector pointer to account for a gap or for excess
              elements loaded for a permuted SLP load.  */
@@ -10748,18 +10766,30 @@ vec_num_loop_costing_end:
       if (slp && !slp_perm)
         continue;
 
-      if (slp_perm && !costing_p)
-        {
+      if (slp_perm)
+        {
           unsigned n_perms;
           /* For SLP we know we've seen all possible uses of dr_chain so
              direct vect_transform_slp_perm_load to DCE the unused parts.
              ???  This is a hack to prevent compile-time issues as seen
              in PR101120 and friends.  */
-          bool ok = vect_transform_slp_perm_load (vinfo, slp_node, dr_chain,
-                                                  gsi, vf, false, &n_perms,
-                                                  nullptr, true);
-          gcc_assert (ok);
-        }
+          if (costing_p
+              && memory_access_type != VMAT_CONTIGUOUS
+              && memory_access_type != VMAT_CONTIGUOUS_PERMUTE)
+            {
+              vect_transform_slp_perm_load (vinfo, slp_node, vNULL, nullptr, vf,
+                                            true, &n_perms, nullptr);
+              inside_cost = record_stmt_cost (cost_vec, n_perms, vec_perm,
+                                              stmt_info, 0, vect_body);
+            }
+          else if (!costing_p)
+            {
+              bool ok = vect_transform_slp_perm_load (vinfo, slp_node, dr_chain,
+                                                      gsi, vf, false, &n_perms,
+                                                      nullptr, true);
+              gcc_assert (ok);
+            }
+        }
       else if (!costing_p)
         {
           if (grouped_load)
@@ -10781,8 +10811,11 @@ vec_num_loop_costing_end:
 
   if (costing_p)
     {
-      if (memory_access_type == VMAT_GATHER_SCATTER
-          || memory_access_type == VMAT_LOAD_STORE_LANES)
+      gcc_assert (memory_access_type != VMAT_INVARIANT
+                  && memory_access_type != VMAT_ELEMENTWISE
+                  && memory_access_type != VMAT_STRIDED_SLP);
+      if (memory_access_type != VMAT_CONTIGUOUS
+          && memory_access_type != VMAT_CONTIGUOUS_PERMUTE)
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_NOTE, vect_location,

From patchwork Tue Jun 13 02:03:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 107021
X-Received: by 2002:a17:907:36c8:b0:973:d84a:33a4 with
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, bergner@linux.ibm.com, ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 8/9] vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_PERMUTE
Date: Mon, 12 Jun 2023 21:03:29 -0500
Message-Id: <216bf6e61d4fe2caa6b87ae1e5c8e15b6d31c409.1686573640.git.linkw@linux.ibm.com>
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling for VMAT_CONTIGUOUS_PERMUTE in
vectorizable_load, which no longer calls vect_model_load_cost for it.

As the affected test case gcc.target/i386/pr70021.c shows, the previous
costing can under-cost the generated vector loads: for
VMAT_CONTIGUOUS_PERMUTE, vect_model_load_cost does not consider the group
size, which the transformation uses as vec_num. This patch makes the number
of vector loads counted during costing consistent with what we generate
during the transformation. More specifically, for the memory access b[i_20]
in that test case, the old code costed 2 vector loads; with this patch it
costs 8, which matches the final count of vector loads generated from b.
With this costing change the cost model no longer considers vectorizing the
first loop profitable, so this patch adjusts the test case to run with
-fno-vect-cost-model.

Note that this test case also exposes a possible further improvement:
although the number of vector permutations we cost is consistent with the
number we generate, DCE can later remove some unused permutations. It would
be good to predict that and generate only the necessary permutations.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vect_model_load_cost): Assert this function only
	handles memory_access_type VMAT_CONTIGUOUS, remove some
	VMAT_CONTIGUOUS_PERMUTE related handling.
	(vectorizable_load): Adjust the cost handling on VMAT_CONTIGUOUS_PERMUTE
	without calling vect_model_load_cost.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr70021.c: Adjust with -fno-vect-cost-model.
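
For readers following the numbers in the message above, the arithmetic can be
sketched in a few lines of C++. This is only an illustration, not code from
the patch: ceil_log2_sketch, permutes_per_copy and loads_costed are
hypothetical helpers, and ncopies = 2 with a group size of 4 are assumed
values picked so that the result matches the 8 loads quoted above (the
message does not state them). The permute formula mirrors
nstmts = ceil_log2 (group_size) * group_size from the diff.

```cpp
#include <cassert>

// Smallest k such that 2^k >= x, for x >= 1.
// Hypothetical stand-in for GCC's ceil_log2; not the real implementation.
static unsigned ceil_log2_sketch (unsigned x)
{
  assert (x >= 1);
  unsigned k = 0;
  while ((1u << k) < x)
    ++k;
  return k;
}

// Permutes recorded for one copy of a load-and-permute group,
// following nstmts = ceil_log2 (group_size) * group_size.
static unsigned permutes_per_copy (unsigned group_size)
{
  return ceil_log2_sketch (group_size) * group_size;
}

// Vector loads costed when each (copy, vec_num) iteration costs one load
// and vec_num is assumed to equal the group size.
static unsigned loads_costed (unsigned ncopies, unsigned group_size)
{
  return ncopies * group_size;
}
```

Under these assumed values, loads_costed (2, 4) gives 8 loads and
permutes_per_copy (4) gives 8 permutes per copy.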
--- gcc/testsuite/gcc.target/i386/pr70021.c | 2 +- gcc/tree-vect-stmts.cc | 88 ++++++++++++++----------- 2 files changed, 51 insertions(+), 39 deletions(-) diff --git a/gcc/testsuite/gcc.target/i386/pr70021.c b/gcc/testsuite/gcc.target/i386/pr70021.c index 6562c0f2bd0..d509583601e 100644 --- a/gcc/testsuite/gcc.target/i386/pr70021.c +++ b/gcc/testsuite/gcc.target/i386/pr70021.c @@ -1,7 +1,7 @@ /* PR target/70021 */ /* { dg-do run } */ /* { dg-require-effective-target avx2 } */ -/* { dg-options "-O2 -ftree-vectorize -mavx2 -fdump-tree-vect-details -mtune=skylake" } */ +/* { dg-options "-O2 -ftree-vectorize -mavx2 -fdump-tree-vect-details -mtune=skylake -fno-vect-cost-model" } */ #include "avx2-check.h" diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 7f8d9db5363..e7a97dbe05d 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -1134,8 +1134,7 @@ vect_model_load_cost (vec_info *vinfo, slp_tree slp_node, stmt_vector_for_cost *cost_vec) { - gcc_assert (memory_access_type == VMAT_CONTIGUOUS - || memory_access_type == VMAT_CONTIGUOUS_PERMUTE); + gcc_assert (memory_access_type == VMAT_CONTIGUOUS); unsigned int inside_cost = 0, prologue_cost = 0; bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info); @@ -1174,26 +1173,6 @@ vect_model_load_cost (vec_info *vinfo, once per group anyhow. */ bool first_stmt_p = (first_stmt_info == stmt_info); - /* We assume that the cost of a single load-lanes instruction is - equivalent to the cost of DR_GROUP_SIZE separate loads. If a grouped - access is instead being provided by a load-and-permute operation, - include the cost of the permutes. */ - if (first_stmt_p - && memory_access_type == VMAT_CONTIGUOUS_PERMUTE) - { - /* Uses an even and odd extract operations or shuffle operations - for each needed permute. 
*/ - int group_size = DR_GROUP_SIZE (first_stmt_info); - int nstmts = ncopies * ceil_log2 (group_size) * group_size; - inside_cost += record_stmt_cost (cost_vec, nstmts, vec_perm, - stmt_info, 0, vect_body); - - if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, - "vect_model_load_cost: strided group_size = %d .\n", - group_size); - } - vect_get_load_cost (vinfo, stmt_info, ncopies, alignment_support_scheme, misalignment, first_stmt_p, &inside_cost, &prologue_cost, cost_vec, cost_vec, true); @@ -10652,11 +10631,22 @@ vectorizable_load (vec_info *vinfo, alignment support schemes. */ if (costing_p) { - if (memory_access_type == VMAT_CONTIGUOUS_REVERSE) + /* For VMAT_CONTIGUOUS_PERMUTE if it's grouped load, we + only need to take care of the first stmt, whose + stmt_info is first_stmt_info, vec_num iterating on it + will cover the cost for the remaining, it's consistent + with transforming. For the prologue cost for realign, + we only need to count it once for the whole group. */ + bool first_stmt_info_p = first_stmt_info == stmt_info; + bool add_realign_cost = first_stmt_info_p && i == 0; + if (memory_access_type == VMAT_CONTIGUOUS_REVERSE + || (memory_access_type == VMAT_CONTIGUOUS_PERMUTE + && (!grouped_load || first_stmt_info_p))) vect_get_load_cost (vinfo, stmt_info, 1, alignment_support_scheme, misalignment, - false, &inside_cost, &prologue_cost, - cost_vec, cost_vec, true); + add_realign_cost, &inside_cost, + &prologue_cost, cost_vec, cost_vec, + true); } else { @@ -10774,8 +10764,7 @@ vectorizable_load (vec_info *vinfo, ??? This is a hack to prevent compile-time issues as seen in PR101120 and friends. 
*/ if (costing_p - && memory_access_type != VMAT_CONTIGUOUS - && memory_access_type != VMAT_CONTIGUOUS_PERMUTE) + && memory_access_type != VMAT_CONTIGUOUS) { vect_transform_slp_perm_load (vinfo, slp_node, vNULL, nullptr, vf, true, &n_perms, nullptr); @@ -10790,20 +10779,44 @@ vectorizable_load (vec_info *vinfo, gcc_assert (ok); } } - else if (!costing_p) + else { if (grouped_load) { if (memory_access_type != VMAT_LOAD_STORE_LANES) - vect_transform_grouped_load (vinfo, stmt_info, dr_chain, - group_size, gsi); - *vec_stmt = STMT_VINFO_VEC_STMTS (stmt_info)[0]; - } - else - { - STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); + { + gcc_assert (memory_access_type == VMAT_CONTIGUOUS_PERMUTE); + /* We assume that the cost of a single load-lanes instruction + is equivalent to the cost of DR_GROUP_SIZE separate loads. + If a grouped access is instead being provided by a + load-and-permute operation, include the cost of the + permutes. */ + if (costing_p && first_stmt_info == stmt_info) + { + /* Uses an even and odd extract operations or shuffle + operations for each needed permute. 
*/ + int group_size = DR_GROUP_SIZE (first_stmt_info); + int nstmts = ceil_log2 (group_size) * group_size; + inside_cost + += record_stmt_cost (cost_vec, nstmts, vec_perm, + stmt_info, 0, vect_body); + + if (dump_enabled_p ()) + dump_printf_loc ( + MSG_NOTE, vect_location, + "vect_model_load_cost: strided group_size = %d .\n", + group_size); + } + else if (!costing_p) + vect_transform_grouped_load (vinfo, stmt_info, dr_chain, + group_size, gsi); + } + if (!costing_p) + *vec_stmt = STMT_VINFO_VEC_STMTS (stmt_info)[0]; } - } + else if (!costing_p) + STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); + } dr_chain.release (); } if (!slp && !costing_p) @@ -10814,8 +10827,7 @@ vectorizable_load (vec_info *vinfo, gcc_assert (memory_access_type != VMAT_INVARIANT && memory_access_type != VMAT_ELEMENTWISE && memory_access_type != VMAT_STRIDED_SLP); - if (memory_access_type != VMAT_CONTIGUOUS - && memory_access_type != VMAT_CONTIGUOUS_PERMUTE) + if (memory_access_type != VMAT_CONTIGUOUS) { if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, From patchwork Tue Jun 13 02:03:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 107020 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp251511vqr; Mon, 12 Jun 2023 19:06:48 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4kXf5Q045AN6B2oNlknynXUgz67bpVLL82ojE8rEMO98uUcK8aJ5eU4ajHh+BBfjjqrqVW X-Received: by 2002:a17:907:742:b0:971:5a79:29ff with SMTP id xc2-20020a170907074200b009715a7929ffmr9235370ejb.48.1686622008306; Mon, 12 Jun 2023 19:06:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686622008; cv=none; d=google.com; s=arc-20160816; b=JwLo/g19OJpiLZNYPQn09O/wUdfZQvt3hEGUkbxYL3yReS0q7G/j3H7iMlzW85M+Py wdBR45AABtbgWfSzjcEpP1VFgNbcwa6aDn3wPvKa9sMZ4GMTd2iT6x8ojSj756UkVHfY N81kwyzx9D0yVhUc0Q7XIT5YSqFw8CBCXG7fZzf/cPVuk+64mAjfA65waAJ0WWQiLfmV 
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, bergner@linux.ibm.com
Subject: [PATCH 9/9] vect: Adjust vectorizable_load costing on
 VMAT_CONTIGUOUS
Date: Mon, 12 Jun 2023 21:03:30 -0500
Message-Id: <625eccff9102ffe35497ad03ebd8242d6d6b06a4.1686573640.git.linkw@linux.ibm.com>
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling for VMAT_CONTIGUOUS in
vectorizable_load, which no longer calls vect_model_load_cost for it.
It also removes vect_model_load_cost, which is now unused and unreachable.
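
The idea behind dropping the separate cost function, as far as it can be read
from the diff, is that costing and transformation walk the same loops and
differ only in what each iteration does. A minimal sketch of that pattern
follows. All names here (LoadPlan, walk_loads, the "vload_j_i" strings) are
hypothetical and exist only for illustration; the real code records costs
through vect_get_load_cost and builds gimple statements instead.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for what the walk produces in each mode.
struct LoadPlan
{
  unsigned costed_loads = 0;          // filled only when costing_p is true
  std::vector<std::string> emitted;   // filled only when transforming
};

// Walk the same (copy, vector) iteration space for both modes, so the
// number of loads costed cannot drift from the number of loads emitted.
static LoadPlan walk_loads (unsigned ncopies, unsigned vec_num, bool costing_p)
{
  LoadPlan plan;
  for (unsigned j = 0; j < ncopies; ++j)
    for (unsigned i = 0; i < vec_num; ++i)
      {
        if (costing_p)
          plan.costed_loads += 1;     // cost one vector load per iteration
        else
          plan.emitted.push_back ("vload_" + std::to_string (j) + "_"
                                  + std::to_string (i));
      }
  return plan;
}
```

Because both modes share one loop nest, the costed count for a given ncopies
and vec_num equals the number of statements the transform mode would emit.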
gcc/ChangeLog: * tree-vect-stmts.cc (vect_model_load_cost): Remove. (vectorizable_load): Adjust the cost handling on VMAT_CONTIGUOUS without calling vect_model_load_cost. --- gcc/tree-vect-stmts.cc | 92 +++++------------------------------------- 1 file changed, 9 insertions(+), 83 deletions(-) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index e7a97dbe05d..be3b277e8e1 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -1117,73 +1117,6 @@ vect_get_store_cost (vec_info *, stmt_vec_info stmt_info, int ncopies, } } - -/* Function vect_model_load_cost - - Models cost for loads. In the case of grouped accesses, one access has - the overhead of the grouped access attributed to it. Since unaligned - accesses are supported for loads, we also account for the costs of the - access scheme chosen. */ - -static void -vect_model_load_cost (vec_info *vinfo, - stmt_vec_info stmt_info, unsigned ncopies, poly_uint64 vf, - vect_memory_access_type memory_access_type, - dr_alignment_support alignment_support_scheme, - int misalignment, - slp_tree slp_node, - stmt_vector_for_cost *cost_vec) -{ - gcc_assert (memory_access_type == VMAT_CONTIGUOUS); - - unsigned int inside_cost = 0, prologue_cost = 0; - bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info); - - gcc_assert (cost_vec); - - /* ??? Somehow we need to fix this at the callers. */ - if (slp_node) - ncopies = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node); - - if (slp_node && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ()) - { - /* If the load is permuted then the alignment is determined by - the first group element not by the first scalar stmt DR. */ - stmt_vec_info first_stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info); - /* Record the cost for the permutation. 
*/ - unsigned n_perms, n_loads; - vect_transform_slp_perm_load (vinfo, slp_node, vNULL, NULL, - vf, true, &n_perms, &n_loads); - inside_cost += record_stmt_cost (cost_vec, n_perms, vec_perm, - first_stmt_info, 0, vect_body); - - /* And adjust the number of loads performed. This handles - redundancies as well as loads that are later dead. */ - ncopies = n_loads; - } - - /* Grouped loads read all elements in the group at once, - so we want the DR for the first statement. */ - stmt_vec_info first_stmt_info = stmt_info; - if (!slp_node && grouped_access_p) - first_stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info); - - /* True if we should include any once-per-group costs as well as - the cost of the statement itself. For SLP we only get called - once per group anyhow. */ - bool first_stmt_p = (first_stmt_info == stmt_info); - - vect_get_load_cost (vinfo, stmt_info, ncopies, alignment_support_scheme, - misalignment, first_stmt_p, &inside_cost, &prologue_cost, - cost_vec, cost_vec, true); - - if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, - "vect_model_load_cost: inside_cost = %d, " - "prologue_cost = %d .\n", inside_cost, prologue_cost); -} - - /* Calculate cost of DR's memory access. */ void vect_get_load_cost (vec_info *, stmt_vec_info stmt_info, int ncopies, @@ -10639,7 +10572,8 @@ vectorizable_load (vec_info *vinfo, we only need to count it once for the whole group. */ bool first_stmt_info_p = first_stmt_info == stmt_info; bool add_realign_cost = first_stmt_info_p && i == 0; - if (memory_access_type == VMAT_CONTIGUOUS_REVERSE + if (memory_access_type == VMAT_CONTIGUOUS + || memory_access_type == VMAT_CONTIGUOUS_REVERSE || (memory_access_type == VMAT_CONTIGUOUS_PERMUTE && (!grouped_load || first_stmt_info_p))) vect_get_load_cost (vinfo, stmt_info, 1, @@ -10763,15 +10697,14 @@ vectorizable_load (vec_info *vinfo, direct vect_transform_slp_perm_load to DCE the unused parts. ??? 
This is a hack to prevent compile-time issues as seen in PR101120 and friends. */ - if (costing_p - && memory_access_type != VMAT_CONTIGUOUS) + if (costing_p) { vect_transform_slp_perm_load (vinfo, slp_node, vNULL, nullptr, vf, true, &n_perms, nullptr); inside_cost = record_stmt_cost (cost_vec, n_perms, vec_perm, stmt_info, 0, vect_body); } - else if (!costing_p) + else { bool ok = vect_transform_slp_perm_load (vinfo, slp_node, dr_chain, gsi, vf, false, &n_perms, @@ -10827,18 +10760,11 @@ vectorizable_load (vec_info *vinfo, gcc_assert (memory_access_type != VMAT_INVARIANT && memory_access_type != VMAT_ELEMENTWISE && memory_access_type != VMAT_STRIDED_SLP); - if (memory_access_type != VMAT_CONTIGUOUS) - { - if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, - "vect_model_load_cost: inside_cost = %u, " - "prologue_cost = %u .\n", - inside_cost, prologue_cost); - } - else - vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type, - alignment_support_scheme, misalignment, slp_node, - cost_vec); + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_model_load_cost: inside_cost = %u, " + "prologue_cost = %u .\n", + inside_cost, prologue_cost); } return true;