From patchwork Thu Sep 14 03:11:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 139274
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com
Subject: [PATCH 01/10] vect: Ensure vect store is supported for some VMAT_ELEMENTWISE case
Date: Wed, 13 Sep 2023 22:11:50 -0500
X-Mailer: git-send-email 2.31.1
From: "Kewen.Lin"

When making/testing patches to move costing next to the transform code
for vectorizable_store, some ICEs got exposed when I further refined the
costing handlings on VMAT_ELEMENTWISE.
The apparent cause is that an assertion is triggered in the rs6000
specific costing function rs6000_builtin_vectorization_cost:

  if (TARGET_ALTIVEC)
    /* Misaligned stores are not supported.  */
    gcc_unreachable ();

For costing I used vect_get_store_cost instead of the original
record_stmt_cost with scalar_store, that is, one unaligned_store.  This
matches what we use when transforming, which is a vector store as below:

  else if (group_size >= const_nunits
           && group_size % const_nunits == 0)
    {
      nstores = 1;
      lnel = const_nunits;
      ltype = vectype;
      lvectype = vectype;
    }

So IMHO costing a vector store is more consistent than costing a scalar
store.  With the given compilation option -mno-allow-movmisalign, a
misaligned vector store is not expected to be used by the vectorizer, so
why is it still adopted?  In the current implementation of function
get_group_load_store_type, we always set the alignment support scheme to
dr_unaligned_supported for VMAT_ELEMENTWISE.  That is true if we always
adopt scalar stores, but as the above code shows, we could use vector
stores for some cases, so we should use the correct alignment support
scheme for them.

This patch ensures the vector store is supported by further checking
with vect_supportable_dr_alignment.  The ICEs got exposed with patches
moving costing next to the transform, but those haven't landed yet; the
test coverage will be there once they land.  The affected test cases
are:
  - gcc.dg/vect/slp-45.c
  - gcc.dg/vect/vect-alias-check-{10,11,12}.c

btw, I tried to make a correctness test case, but I realized that
-mno-allow-movmisalign is mainly for noting the movmisalign optab and it
doesn't guard the actual hw vector memory access insns, so I failed to
make one unless I also altered some conditions for them.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_store): Ensure the generated
	vector store for some case of VMAT_ELEMENTWISE is supported.
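
The guard described in this message can be sketched outside GCC as
follows.  This is only an illustration under simplifying assumptions:
the enum reuses the names of GCC's dr_alignment_support values, while
StoreChoice, choose_store and the scalar-store default are hypothetical
stand-ins and not vectorizer code.

```cpp
// Reuses the names of GCC's dr_alignment_support values; everything
// else in this sketch is a hypothetical stand-in.
enum dr_alignment_support {
  dr_unaligned_unsupported,
  dr_unaligned_supported,
  dr_explicit_realign,
  dr_explicit_realign_optimized,
  dr_aligned
};

// Hypothetical summary of what the elementwise path would emit.
struct StoreChoice {
  unsigned nstores;  // stores emitted per vector
  unsigned lnel;     // elements written by each store
  bool use_vector;   // true if a whole-vector store is chosen
};

// Choose one vector store only if the group fills whole vectors AND the
// access is aligned or a supported unaligned access.  Otherwise keep the
// assumed default of one scalar store per element.
StoreChoice choose_store(unsigned group_size, unsigned const_nunits,
                         dr_alignment_support dr_align) {
  StoreChoice choice{const_nunits, 1, false};  // scalar-store default
  if (group_size >= const_nunits && group_size % const_nunits == 0
      && (dr_align == dr_aligned || dr_align == dr_unaligned_supported)) {
    choice = StoreChoice{1, const_nunits, true};
  }
  return choice;
}
```

With this shape, a target that reports dr_unaligned_unsupported never
reaches the vector-store choice, so costing and transform agree on the
scalar fallback.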
---
 gcc/tree-vect-stmts.cc | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index cd7c1090d88..a5caaf0bca2 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8558,10 +8558,18 @@ vectorizable_store (vec_info *vinfo,
           else if (group_size >= const_nunits
                    && group_size % const_nunits == 0)
             {
-              nstores = 1;
-              lnel = const_nunits;
-              ltype = vectype;
-              lvectype = vectype;
+              int mis_align = dr_misalignment (first_dr_info, vectype);
+              dr_alignment_support dr_align
+                = vect_supportable_dr_alignment (vinfo, dr_info, vectype,
+                                                 mis_align);
+              if (dr_align == dr_aligned
+                  || dr_align == dr_unaligned_supported)
+                {
+                  nstores = 1;
+                  lnel = const_nunits;
+                  ltype = vectype;
+                  lvectype = vectype;
+                }
             }
           ltype = build_aligned_type (ltype, TYPE_ALIGN (elem_type));
           ncopies = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);

From patchwork Thu Sep 14 03:11:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 139269
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com
Subject: [PATCH 02/10] vect: Move vect_model_store_cost next to the transform in vectorizable_store
Date: Wed, 13 Sep 2023 22:11:51 -0500
Message-Id: <1539ec7d34af4e38467420b3aed342d708a64a48.1694657494.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
From: "Kewen.Lin"

This is an initial patch to move costing next to the transform.  It
still adopts vect_model_store_cost for costing, but moves and duplicates
the call down according to the handling of the different
vect_memory_access_types or some special handling needs.  Hopefully this
makes the subsequent patches easier to review.

This patch should not have any functional changes.
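
The restructuring described above can be sketched as follows, as a
minimal illustration only.  AccessKind, Result and process_store are
hypothetical names, not GCC code; the one idea taken from the patch is
a costing_p flag derived from a null output pointer, so that each
access kind records its cost where its transform code would otherwise
run.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are GCC types.
enum AccessKind { Elementwise, Contiguous };

struct Result {
  std::vector<std::string> costs;  // entries recorded while costing
  std::vector<std::string> stmts;  // statements emitted while transforming
};

// One routine serves both phases.  A null `out_stmt` means analysis only,
// mirroring `bool costing_p = !vec_stmt;` in the patch.
bool process_store(AccessKind kind, std::string *out_stmt, Result &res) {
  const bool costing_p = (out_stmt == nullptr);

  if (kind == Elementwise) {
    if (costing_p) {
      // Cost is recorded right where this kind's transform would start.
      res.costs.push_back("elementwise");
      return true;
    }
    res.stmts.push_back("scalar stores");
    *out_stmt = res.stmts.back();
    return true;
  }

  // Shared path: the transform work is skipped while costing.
  if (!costing_p) {
    res.stmts.push_back("vector store");
    *out_stmt = res.stmts.back();
  }
  if (costing_p)
    res.costs.push_back("contiguous");
  return true;
}
```

Keeping both phases in one routine means a later change to a transform
path sits next to the cost it must stay consistent with.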

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_store): Move and duplicate the call
	to vect_model_store_cost down to some different transform paths
	according to the handlings of different vect_memory_access_types
	or some special handling need.
---
 gcc/tree-vect-stmts.cc | 79 ++++++++++++++++++++++++++++++++----------
 1 file changed, 60 insertions(+), 19 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a5caaf0bca2..36f7c5b9f4b 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8372,7 +8372,8 @@ vectorizable_store (vec_info *vinfo,
       return false;
     }
 
-  if (!vec_stmt) /* transformation not required.  */
+  bool costing_p = !vec_stmt;
+  if (costing_p) /* transformation not required.  */
     {
       STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) = memory_access_type;
 
@@ -8401,11 +8402,6 @@ vectorizable_store (vec_info *vinfo,
                          "Vectorizing an unaligned access.\n");
 
       STMT_VINFO_TYPE (stmt_info) = store_vec_info_type;
-      vect_model_store_cost (vinfo, stmt_info, ncopies,
-                             memory_access_type, &gs_info,
-                             alignment_support_scheme,
-                             misalignment, vls_type, slp_node, cost_vec);
-      return true;
     }
   gcc_assert (memory_access_type == STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info));
 
@@ -8415,12 +8411,27 @@ vectorizable_store (vec_info *vinfo,
 
   if (memory_access_type == VMAT_GATHER_SCATTER && gs_info.decl)
     {
-      vect_build_scatter_store_calls (vinfo, stmt_info, gsi, vec_stmt,
-                                      &gs_info, mask);
+      if (costing_p)
+        vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type,
+                               &gs_info, alignment_support_scheme, misalignment,
+                               vls_type, slp_node, cost_vec);
+      else
+        vect_build_scatter_store_calls (vinfo, stmt_info, gsi, vec_stmt,
+                                        &gs_info, mask);
       return true;
     }
   else if (STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) >= 3)
-    return vectorizable_scan_store (vinfo, stmt_info, gsi, vec_stmt, ncopies);
+    {
+      gcc_assert (memory_access_type == VMAT_CONTIGUOUS);
+      if (costing_p)
+        {
+          vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type,
+                                 &gs_info, alignment_support_scheme,
+                                 misalignment, vls_type, slp_node, cost_vec);
+          return true;
+        }
+      return vectorizable_scan_store (vinfo, stmt_info, gsi, vec_stmt, ncopies);
+    }
 
   if (grouped_store)
     {
@@ -8449,13 +8460,21 @@ vectorizable_store (vec_info *vinfo,
   else
     ref_type = reference_alias_ptr_type (DR_REF (first_dr_info->dr));
 
-  if (dump_enabled_p ())
-    dump_printf_loc (MSG_NOTE, vect_location,
-                     "transform store. ncopies = %d\n", ncopies);
+  if (!costing_p && dump_enabled_p ())
+    dump_printf_loc (MSG_NOTE, vect_location, "transform store. ncopies = %d\n",
+                     ncopies);
 
   if (memory_access_type == VMAT_ELEMENTWISE
       || memory_access_type == VMAT_STRIDED_SLP)
     {
+      if (costing_p)
+        {
+          vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type,
+                                 &gs_info, alignment_support_scheme,
+                                 misalignment, vls_type, slp_node, cost_vec);
+          return true;
+        }
+
       gimple_stmt_iterator incr_gsi;
       bool insert_after;
       gimple *incr;
@@ -8718,8 +8737,9 @@ vectorizable_store (vec_info *vinfo,
   else if (memory_access_type == VMAT_GATHER_SCATTER)
     {
       aggr_type = elem_type;
-      vect_get_strided_load_store_ops (stmt_info, loop_vinfo, gsi, &gs_info,
-                                       &bump, &vec_offset, loop_lens);
+      if (!costing_p)
+        vect_get_strided_load_store_ops (stmt_info, loop_vinfo, gsi, &gs_info,
+                                         &bump, &vec_offset, loop_lens);
     }
   else
     {
@@ -8731,7 +8751,7 @@ vectorizable_store (vec_info *vinfo,
                                           memory_access_type, loop_lens);
     }
 
-  if (mask)
+  if (mask && !costing_p)
     LOOP_VINFO_HAS_MASK_STORE (loop_vinfo) = true;
 
   /* In case the vectorization factor (VF) is bigger than the number
@@ -8782,6 +8802,13 @@ vectorizable_store (vec_info *vinfo,
   if (memory_access_type == VMAT_LOAD_STORE_LANES)
     {
       gcc_assert (!slp && grouped_store);
+      if (costing_p)
+        {
+          vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type,
+                                 &gs_info, alignment_support_scheme,
+                                 misalignment, vls_type, slp_node, cost_vec);
+          return true;
+        }
       for (j = 0; j < ncopies; j++)
         {
           gimple *new_stmt;
@@ -8927,6 +8954,13 @@ vectorizable_store (vec_info *vinfo,
   if (memory_access_type == VMAT_GATHER_SCATTER)
     {
       gcc_assert (!slp && !grouped_store);
+      if (costing_p)
+        {
+          vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type,
+                                 &gs_info, alignment_support_scheme,
+                                 misalignment, vls_type, slp_node, cost_vec);
+          return true;
+        }
       auto_vec<tree> vec_offsets;
       for (j = 0; j < ncopies; j++)
         {
@@ -9091,7 +9125,7 @@ vectorizable_store (vec_info *vinfo,
   for (j = 0; j < ncopies; j++)
     {
       gimple *new_stmt;
-      if (j == 0)
+      if (j == 0 && !costing_p)
         {
           if (slp)
             {
@@ -9158,7 +9192,7 @@ vectorizable_store (vec_info *vinfo,
                                           offset, &dummy, gsi, &ptr_incr,
                                           simd_lane_access_p, bump);
         }
-      else
+      else if (!costing_p)
         {
           gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo));
           /* DR_CHAIN is then used as an input to vect_permute_store_chain().
@@ -9179,7 +9213,7 @@ vectorizable_store (vec_info *vinfo,
         }
 
       new_stmt = NULL;
-      if (grouped_store)
+      if (!costing_p && grouped_store)
         /* Permute.  */
         vect_permute_store_chain (vinfo, dr_chain, group_size, stmt_info,
                                   gsi, &result_chain);
@@ -9187,6 +9221,8 @@ vectorizable_store (vec_info *vinfo,
       stmt_vec_info next_stmt_info = first_stmt_info;
       for (i = 0; i < vec_num; i++)
         {
+          if (costing_p)
+            continue;
           unsigned misalign;
           unsigned HOST_WIDE_INT align;
 
@@ -9361,7 +9397,7 @@ vectorizable_store (vec_info *vinfo,
           if (!next_stmt_info)
             break;
         }
-      if (!slp)
+      if (!slp && !costing_p)
         {
           if (j == 0)
             *vec_stmt = new_stmt;
@@ -9369,6 +9405,11 @@ vectorizable_store (vec_info *vinfo,
         }
     }
 
+  if (costing_p)
+    vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type,
+                           &gs_info, alignment_support_scheme, misalignment,
+                           vls_type, slp_node, cost_vec);
+
   return true;
 }
 

From patchwork Thu Sep 14 03:11:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 139278
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com
Subject: [PATCH 03/10] vect: Adjust vectorizable_store costing on VMAT_GATHER_SCATTER
Date: Wed, 13 Sep 2023 22:11:52 -0500
Message-Id: <8abc6ddb4683d9058ffb48eb54f3a717e655efb4.1694657494.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
From: "Kewen.Lin"

This patch adjusts the cost handling on VMAT_GATHER_SCATTER in function
vectorizable_store (all three cases), so that we no longer depend on
vect_model_store_cost for its costing.  This patch shouldn't have any
functional changes.
gcc/ChangeLog: * tree-vect-stmts.cc (vect_model_store_cost): Assert it won't get VMAT_GATHER_SCATTER any more, remove VMAT_GATHER_SCATTER related handlings and the related parameter gs_info. (vect_build_scatter_store_calls): Add the handlings on costing with one more argument cost_vec. (vectorizable_store): Adjust the cost handling on VMAT_GATHER_SCATTER without calling vect_model_store_cost any more. --- gcc/tree-vect-stmts.cc | 188 ++++++++++++++++++++++++++--------------- 1 file changed, 118 insertions(+), 70 deletions(-) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 36f7c5b9f4b..3f908242fee 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -959,12 +959,12 @@ cfun_returns (tree decl) static void vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies, vect_memory_access_type memory_access_type, - gather_scatter_info *gs_info, dr_alignment_support alignment_support_scheme, int misalignment, vec_load_store_type vls_type, slp_tree slp_node, stmt_vector_for_cost *cost_vec) { + gcc_assert (memory_access_type != VMAT_GATHER_SCATTER); unsigned int inside_cost = 0, prologue_cost = 0; stmt_vec_info first_stmt_info = stmt_info; bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info); @@ -1012,18 +1012,9 @@ vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies, tree vectype = STMT_VINFO_VECTYPE (stmt_info); /* Costs of the stores. */ - if (memory_access_type == VMAT_ELEMENTWISE - || memory_access_type == VMAT_GATHER_SCATTER) + if (memory_access_type == VMAT_ELEMENTWISE) { unsigned int assumed_nunits = vect_nunits_for_cost (vectype); - if (memory_access_type == VMAT_GATHER_SCATTER - && gs_info->ifn == IFN_LAST && !gs_info->decl) - /* For emulated scatter N offset vector element extracts - (we assume the scalar scaling and ptr + offset add is consumed by - the load). 
*/ - inside_cost += record_stmt_cost (cost_vec, ncopies * assumed_nunits, - vec_to_scalar, stmt_info, 0, - vect_body); /* N scalar stores plus extracting the elements. */ inside_cost += record_stmt_cost (cost_vec, ncopies * assumed_nunits, @@ -1034,9 +1025,7 @@ vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies, misalignment, &inside_cost, cost_vec); if (memory_access_type == VMAT_ELEMENTWISE - || memory_access_type == VMAT_STRIDED_SLP - || (memory_access_type == VMAT_GATHER_SCATTER - && gs_info->ifn == IFN_LAST && !gs_info->decl)) + || memory_access_type == VMAT_STRIDED_SLP) { /* N scalar stores plus extracting the elements. */ unsigned int assumed_nunits = vect_nunits_for_cost (vectype); @@ -2999,7 +2988,8 @@ vect_build_gather_load_calls (vec_info *vinfo, stmt_vec_info stmt_info, static void vect_build_scatter_store_calls (vec_info *vinfo, stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, gimple **vec_stmt, - gather_scatter_info *gs_info, tree mask) + gather_scatter_info *gs_info, tree mask, + stmt_vector_for_cost *cost_vec) { loop_vec_info loop_vinfo = dyn_cast (vinfo); tree vectype = STMT_VINFO_VECTYPE (stmt_info); @@ -3009,6 +2999,30 @@ vect_build_scatter_store_calls (vec_info *vinfo, stmt_vec_info stmt_info, poly_uint64 scatter_off_nunits = TYPE_VECTOR_SUBPARTS (gs_info->offset_vectype); + /* FIXME: Keep the previous costing way in vect_model_store_cost by + costing N scalar stores, but it should be tweaked to use target + specific costs on related scatter store calls. 
*/ + if (cost_vec) + { + tree op = vect_get_store_rhs (stmt_info); + enum vect_def_type dt; + gcc_assert (vect_is_simple_use (op, vinfo, &dt)); + unsigned int inside_cost, prologue_cost = 0; + if (dt == vect_constant_def || dt == vect_external_def) + prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec, + stmt_info, 0, vect_prologue); + unsigned int assumed_nunits = vect_nunits_for_cost (vectype); + inside_cost = record_stmt_cost (cost_vec, ncopies * assumed_nunits, + scalar_store, stmt_info, 0, vect_body); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_model_store_cost: inside_cost = %d, " + "prologue_cost = %d .\n", + inside_cost, prologue_cost); + return; + } + tree perm_mask = NULL_TREE, mask_halfvectype = NULL_TREE; if (known_eq (nunits, scatter_off_nunits)) modifier = NONE; @@ -8411,13 +8425,8 @@ vectorizable_store (vec_info *vinfo, if (memory_access_type == VMAT_GATHER_SCATTER && gs_info.decl) { - if (costing_p) - vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - &gs_info, alignment_support_scheme, misalignment, - vls_type, slp_node, cost_vec); - else - vect_build_scatter_store_calls (vinfo, stmt_info, gsi, vec_stmt, - &gs_info, mask); + vect_build_scatter_store_calls (vinfo, stmt_info, gsi, vec_stmt, &gs_info, + mask, cost_vec); return true; } else if (STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) >= 3) @@ -8426,8 +8435,8 @@ vectorizable_store (vec_info *vinfo, if (costing_p) { vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - &gs_info, alignment_support_scheme, - misalignment, vls_type, slp_node, cost_vec); + alignment_support_scheme, misalignment, + vls_type, slp_node, cost_vec); return true; } return vectorizable_scan_store (vinfo, stmt_info, gsi, vec_stmt, ncopies); @@ -8470,8 +8479,8 @@ vectorizable_store (vec_info *vinfo, if (costing_p) { vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - &gs_info, alignment_support_scheme, - misalignment, vls_type, 
slp_node, cost_vec); + alignment_support_scheme, misalignment, + vls_type, slp_node, cost_vec); return true; } @@ -8805,8 +8814,8 @@ vectorizable_store (vec_info *vinfo, if (costing_p) { vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - &gs_info, alignment_support_scheme, - misalignment, vls_type, slp_node, cost_vec); + alignment_support_scheme, misalignment, + vls_type, slp_node, cost_vec); return true; } for (j = 0; j < ncopies; j++) @@ -8954,49 +8963,50 @@ vectorizable_store (vec_info *vinfo, if (memory_access_type == VMAT_GATHER_SCATTER) { gcc_assert (!slp && !grouped_store); - if (costing_p) - { - vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - &gs_info, alignment_support_scheme, - misalignment, vls_type, slp_node, cost_vec); - return true; - } auto_vec vec_offsets; + unsigned int inside_cost = 0, prologue_cost = 0; for (j = 0; j < ncopies; j++) { gimple *new_stmt; if (j == 0) { - /* Since the store is not grouped, DR_GROUP_SIZE is 1, and - DR_CHAIN is of size 1. */ - gcc_assert (group_size == 1); - op = vect_get_store_rhs (first_stmt_info); - vect_get_vec_defs_for_operand (vinfo, first_stmt_info, ncopies, - op, gvec_oprnds[0]); - vec_oprnd = (*gvec_oprnds[0])[0]; - dr_chain.quick_push (vec_oprnd); - if (mask) + if (costing_p && vls_type == VLS_STORE_INVARIANT) + prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec, + stmt_info, 0, vect_prologue); + else if (!costing_p) { - vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies, - mask, &vec_masks, - mask_vectype); - vec_mask = vec_masks[0]; - } + /* Since the store is not grouped, DR_GROUP_SIZE is 1, and + DR_CHAIN is of size 1. 
*/ + gcc_assert (group_size == 1); + op = vect_get_store_rhs (first_stmt_info); + vect_get_vec_defs_for_operand (vinfo, first_stmt_info, + ncopies, op, gvec_oprnds[0]); + vec_oprnd = (*gvec_oprnds[0])[0]; + dr_chain.quick_push (vec_oprnd); + if (mask) + { + vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies, + mask, &vec_masks, + mask_vectype); + vec_mask = vec_masks[0]; + } - /* We should have catched mismatched types earlier. */ - gcc_assert (useless_type_conversion_p (vectype, - TREE_TYPE (vec_oprnd))); - if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)) - vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info, - slp_node, &gs_info, &dataref_ptr, - &vec_offsets); - else - dataref_ptr - = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type, - NULL, offset, &dummy, gsi, - &ptr_incr, false, bump); + /* We should have catched mismatched types earlier. */ + gcc_assert ( + useless_type_conversion_p (vectype, TREE_TYPE (vec_oprnd))); + if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)) + vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info, + slp_node, &gs_info, + &dataref_ptr, &vec_offsets); + else + dataref_ptr + = vect_create_data_ref_ptr (vinfo, first_stmt_info, + aggr_type, NULL, offset, + &dummy, gsi, &ptr_incr, false, + bump); + } } - else + else if (!costing_p) { gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo)); vec_oprnd = (*gvec_oprnds[0])[j]; @@ -9013,15 +9023,27 @@ vectorizable_store (vec_info *vinfo, tree final_mask = NULL_TREE; tree final_len = NULL_TREE; tree bias = NULL_TREE; - if (loop_masks) - final_mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks, - ncopies, vectype, j); - if (vec_mask) - final_mask = prepare_vec_mask (loop_vinfo, mask_vectype, final_mask, - vec_mask, gsi); + if (!costing_p) + { + if (loop_masks) + final_mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks, + ncopies, vectype, j); + if (vec_mask) + final_mask = prepare_vec_mask (loop_vinfo, mask_vectype, + final_mask, vec_mask, gsi); + } if (gs_info.ifn != 
IFN_LAST) { + if (costing_p) + { + unsigned int cnunits = vect_nunits_for_cost (vectype); + inside_cost + += record_stmt_cost (cost_vec, cnunits, scalar_store, + stmt_info, 0, vect_body); + continue; + } + if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)) vec_offset = vec_offsets[j]; tree scale = size_int (gs_info.scale); @@ -9067,6 +9089,25 @@ vectorizable_store (vec_info *vinfo, { /* Emulated scatter. */ gcc_assert (!final_mask); + if (costing_p) + { + unsigned int cnunits = vect_nunits_for_cost (vectype); + /* For emulated scatter N offset vector element extracts + (we assume the scalar scaling and ptr + offset add is + consumed by the load). */ + inside_cost + += record_stmt_cost (cost_vec, cnunits, vec_to_scalar, + stmt_info, 0, vect_body); + /* N scalar stores plus extracting the elements. */ + inside_cost + += record_stmt_cost (cost_vec, cnunits, vec_to_scalar, + stmt_info, 0, vect_body); + inside_cost + += record_stmt_cost (cost_vec, cnunits, scalar_store, + stmt_info, 0, vect_body); + continue; + } + unsigned HOST_WIDE_INT const_nunits = nunits.to_constant (); unsigned HOST_WIDE_INT const_offset_nunits = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype).to_constant (); @@ -9117,6 +9158,13 @@ vectorizable_store (vec_info *vinfo, *vec_stmt = new_stmt; STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); } + + if (costing_p && dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_model_store_cost: inside_cost = %d, " + "prologue_cost = %d .\n", + inside_cost, prologue_cost); + return true; } @@ -9407,8 +9455,8 @@ vectorizable_store (vec_info *vinfo, if (costing_p) vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - &gs_info, alignment_support_scheme, misalignment, - vls_type, slp_node, cost_vec); + alignment_support_scheme, misalignment, vls_type, + slp_node, cost_vec); return true; } From patchwork Thu Sep 14 03:11:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 139270
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com
Subject: [PATCH 04/10] vect: Simplify costing on vectorizable_scan_store
Date: Wed, 13 Sep 2023 22:11:53 -0500
Message-Id: <308240b9aff98d1edc15bcba7a2f015e42cdc371.1694657494.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Kewen Lin via Gcc-patches
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch simplifies the costing for the vectorizable_scan_store case so that it no longer calls function vect_model_store_cost.

I considered moving the costing into function vectorizable_scan_store, but that would require passing down several variables that are only used for costing. For now we just want to keep the costing as it was and haven't tried to make it consistent with what the transform does, so I think we can leave it as is.

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_store): Adjust costing on
        vectorizable_scan_store without calling vect_model_store_cost
        any more.
---
 gcc/tree-vect-stmts.cc | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 3f908242fee..048c14d291c 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8432,11 +8432,23 @@ vectorizable_store (vec_info *vinfo,
   else if (STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) >= 3)
     {
       gcc_assert (memory_access_type == VMAT_CONTIGUOUS);
+      gcc_assert (!slp);
       if (costing_p)
         {
-          vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type,
-                                 alignment_support_scheme, misalignment,
-                                 vls_type, slp_node, cost_vec);
+          unsigned int inside_cost = 0, prologue_cost = 0;
+          if (vls_type == VLS_STORE_INVARIANT)
+            prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec,
+                                               stmt_info, 0, vect_prologue);
+          vect_get_store_cost (vinfo, stmt_info, ncopies,
+                               alignment_support_scheme, misalignment,
+                               &inside_cost, cost_vec);
+
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "vect_model_store_cost: inside_cost = %d, "
+                             "prologue_cost = %d .\n",
+                             inside_cost, prologue_cost);
+
           return true;
         }
       return vectorizable_scan_store (vinfo, stmt_info, gsi, vec_stmt, ncopies);

From patchwork Thu Sep 14 03:11:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 139272
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id
h50csp80071vqi; Wed, 13 Sep 2023 20:15:05 -0700 (PDT)
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com
Subject: [PATCH 05/10] vect: Adjust vectorizable_store costing on
VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
Date: Wed, 13 Sep 2023 22:11:54 -0500
Message-Id: <2adef8b10433859b6642282b03a11df33c732d11.1694657494.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Kewen Lin via Gcc-patches
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP in function vectorizable_store. We don't call function vect_model_store_cost for them any more.

As with the improvement made for PR82255 on the load side, this change gets rid of unnecessary vec_to_scalar costing for some cases with VMAT_STRIDED_SLP. A typical test case, gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c, is added for it. The change also helps some cases that previously had inconsistent costing.

Besides, this patch special-cases interleaving stores for these two affected memory access types. For interleaving stores the whole chain is vectorized when the last store in the chain is reached, and the other stores in the group are skipped. To stay consistent with this, and to follow the transform handling, which iterates over the whole group, we only cost the first store in the group. Ideally we would cost only the last one, but finding it is not trivial, and using the first one is actually equivalent.

gcc/ChangeLog:

        * tree-vect-stmts.cc (vect_model_store_cost): Assert it won't get
        VMAT_ELEMENTWISE and VMAT_STRIDED_SLP any more, and remove their
        related handlings.
        (vectorizable_store): Adjust the cost handling on VMAT_ELEMENTWISE
        and VMAT_STRIDED_SLP without calling vect_model_store_cost.

gcc/testsuite/ChangeLog:

        * gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c: New test.
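The claim that costing only the first store of an interleaving group is equivalent can be illustrated with a small, self-contained sketch. This is a toy model under stated assumptions, not the GCC implementation: `ToyGroupedStore`, `toy_cost_for_stmt` and `toy_cost_for_group` are hypothetical names, and `per_store_cost` is an assumed, already computed cost for a single store.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one store in an interleaving group; not a GCC type.
struct ToyGroupedStore
{
  bool is_first_in_group;  // plays the role of first_stmt_info == stmt_info
  unsigned group_size;     // number of stores in the whole group
};

// Cost recorded when analyzing one statement.  PER_STORE_COST is an assumed,
// already computed cost of a single store in the group.
unsigned
toy_cost_for_stmt (const ToyGroupedStore &store, unsigned per_store_cost)
{
  assert (store.group_size > 0);
  if (!store.is_first_in_group)
    return 0;  // skipped: the first store already accounted for the group
  unsigned cost = 0;
  for (unsigned g = 0; g < store.group_size; g++)
    cost += per_store_cost;  // iterate the whole group, as the transform does
  return cost;
}

// Sum over every statement in the group, in the order analysis visits them.
unsigned
toy_cost_for_group (const std::vector<ToyGroupedStore> &group,
                    unsigned per_store_cost)
{
  unsigned total = 0;
  for (const ToyGroupedStore &store : group)
    total += toy_cost_for_stmt (store, per_store_cost);
  return total;
}
```

For a group of three stores with an assumed per-store cost of 4, the first statement records 12 and the other two record nothing, so the group total is the same as if the cost had been attached to the last store instead.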
---
 .../costmodel/ppc/costmodel-vect-store-1.c    |  23 +++
 gcc/tree-vect-stmts.cc                        | 160 +++++++++++-------
 2 files changed, 120 insertions(+), 63 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c
new file mode 100644
index 00000000000..ab5f3301492
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+/* This test case is partially extracted from case
+   gcc.dg/vect/vect-avg-16.c, it's to verify we don't
+   cost a store with vec_to_scalar when we shouldn't. */
+
+void
+test (signed char *restrict a, signed char *restrict b, signed char *restrict c,
+      int n)
+{
+  for (int j = 0; j < n; ++j)
+    {
+      for (int i = 0; i < 16; ++i)
+        a[i] = (b[i] + c[i]) >> 1;
+      a += 20;
+      b += 20;
+      c += 20;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "vec_to_scalar" 0 "vect" } } */
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 048c14d291c..3d01168080a 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -964,7 +964,9 @@ vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies,
                        vec_load_store_type vls_type, slp_tree slp_node,
                        stmt_vector_for_cost *cost_vec)
 {
-  gcc_assert (memory_access_type != VMAT_GATHER_SCATTER);
+  gcc_assert (memory_access_type != VMAT_GATHER_SCATTER
+              && memory_access_type != VMAT_ELEMENTWISE
+              && memory_access_type != VMAT_STRIDED_SLP);
   unsigned int inside_cost = 0, prologue_cost = 0;
   stmt_vec_info first_stmt_info = stmt_info;
   bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info);
@@ -1010,29 +1012,9 @@ vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies,
                              group_size);
     }
 
-  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
   /* Costs of the
stores. */ - if (memory_access_type == VMAT_ELEMENTWISE) - { - unsigned int assumed_nunits = vect_nunits_for_cost (vectype); - /* N scalar stores plus extracting the elements. */ - inside_cost += record_stmt_cost (cost_vec, - ncopies * assumed_nunits, - scalar_store, stmt_info, 0, vect_body); - } - else - vect_get_store_cost (vinfo, stmt_info, ncopies, alignment_support_scheme, - misalignment, &inside_cost, cost_vec); - - if (memory_access_type == VMAT_ELEMENTWISE - || memory_access_type == VMAT_STRIDED_SLP) - { - /* N scalar stores plus extracting the elements. */ - unsigned int assumed_nunits = vect_nunits_for_cost (vectype); - inside_cost += record_stmt_cost (cost_vec, - ncopies * assumed_nunits, - vec_to_scalar, stmt_info, 0, vect_body); - } + vect_get_store_cost (vinfo, stmt_info, ncopies, alignment_support_scheme, + misalignment, &inside_cost, cost_vec); /* When vectorizing a store into the function result assign a penalty if the function returns in a multi-register location. @@ -8416,6 +8398,18 @@ vectorizable_store (vec_info *vinfo, "Vectorizing an unaligned access.\n"); STMT_VINFO_TYPE (stmt_info) = store_vec_info_type; + + /* As function vect_transform_stmt shows, for interleaving stores + the whole chain is vectorized when the last store in the chain + is reached, the other stores in the group are skipped. So we + want to only cost the last one here, but it's not trivial to + get the last, as it's equivalent to use the first one for + costing, use the first one instead. 
*/ + if (grouped_store + && !slp + && first_stmt_info != stmt_info + && memory_access_type == VMAT_ELEMENTWISE) + return true; } gcc_assert (memory_access_type == STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)); @@ -8488,14 +8482,7 @@ vectorizable_store (vec_info *vinfo, if (memory_access_type == VMAT_ELEMENTWISE || memory_access_type == VMAT_STRIDED_SLP) { - if (costing_p) - { - vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - alignment_support_scheme, misalignment, - vls_type, slp_node, cost_vec); - return true; - } - + unsigned inside_cost = 0, prologue_cost = 0; gimple_stmt_iterator incr_gsi; bool insert_after; gimple *incr; @@ -8503,7 +8490,7 @@ vectorizable_store (vec_info *vinfo, tree ivstep; tree running_off; tree stride_base, stride_step, alias_off; - tree vec_oprnd; + tree vec_oprnd = NULL_TREE; tree dr_offset; unsigned int g; /* Checked by get_load_store_type. */ @@ -8609,26 +8596,30 @@ vectorizable_store (vec_info *vinfo, lnel = const_nunits; ltype = vectype; lvectype = vectype; + alignment_support_scheme = dr_align; + misalignment = mis_align; } } ltype = build_aligned_type (ltype, TYPE_ALIGN (elem_type)); ncopies = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node); } - ivstep = stride_step; - ivstep = fold_build2 (MULT_EXPR, TREE_TYPE (ivstep), ivstep, - build_int_cst (TREE_TYPE (ivstep), vf)); + if (!costing_p) + { + ivstep = stride_step; + ivstep = fold_build2 (MULT_EXPR, TREE_TYPE (ivstep), ivstep, + build_int_cst (TREE_TYPE (ivstep), vf)); - standard_iv_increment_position (loop, &incr_gsi, &insert_after); + standard_iv_increment_position (loop, &incr_gsi, &insert_after); - stride_base = cse_and_gimplify_to_preheader (loop_vinfo, stride_base); - ivstep = cse_and_gimplify_to_preheader (loop_vinfo, ivstep); - create_iv (stride_base, PLUS_EXPR, ivstep, NULL, - loop, &incr_gsi, insert_after, - &offvar, NULL); - incr = gsi_stmt (incr_gsi); + stride_base = cse_and_gimplify_to_preheader (loop_vinfo, stride_base); + ivstep = 
cse_and_gimplify_to_preheader (loop_vinfo, ivstep); + create_iv (stride_base, PLUS_EXPR, ivstep, NULL, loop, &incr_gsi, + insert_after, &offvar, NULL); + incr = gsi_stmt (incr_gsi); - stride_step = cse_and_gimplify_to_preheader (loop_vinfo, stride_step); + stride_step = cse_and_gimplify_to_preheader (loop_vinfo, stride_step); + } alias_off = build_int_cst (ref_type, 0); stmt_vec_info next_stmt_info = first_stmt_info; @@ -8636,39 +8627,76 @@ vectorizable_store (vec_info *vinfo, for (g = 0; g < group_size; g++) { running_off = offvar; - if (g) + if (!costing_p) { - tree size = TYPE_SIZE_UNIT (ltype); - tree pos = fold_build2 (MULT_EXPR, sizetype, size_int (g), - size); - tree newoff = copy_ssa_name (running_off, NULL); - incr = gimple_build_assign (newoff, POINTER_PLUS_EXPR, - running_off, pos); - vect_finish_stmt_generation (vinfo, stmt_info, incr, gsi); - running_off = newoff; + if (g) + { + tree size = TYPE_SIZE_UNIT (ltype); + tree pos + = fold_build2 (MULT_EXPR, sizetype, size_int (g), size); + tree newoff = copy_ssa_name (running_off, NULL); + incr = gimple_build_assign (newoff, POINTER_PLUS_EXPR, + running_off, pos); + vect_finish_stmt_generation (vinfo, stmt_info, incr, gsi); + running_off = newoff; + } } if (!slp) op = vect_get_store_rhs (next_stmt_info); - vect_get_vec_defs (vinfo, next_stmt_info, slp_node, ncopies, - op, &vec_oprnds); + if (!costing_p) + vect_get_vec_defs (vinfo, next_stmt_info, slp_node, ncopies, op, + &vec_oprnds); + else if (!slp) + { + enum vect_def_type cdt; + gcc_assert (vect_is_simple_use (op, vinfo, &cdt)); + if (cdt == vect_constant_def || cdt == vect_external_def) + prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec, + stmt_info, 0, vect_prologue); + } unsigned int group_el = 0; unsigned HOST_WIDE_INT elsz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (vectype))); for (j = 0; j < ncopies; j++) { - vec_oprnd = vec_oprnds[j]; - /* Pun the vector to extract from if necessary. 
*/ - if (lvectype != vectype) + if (!costing_p) { - tree tem = make_ssa_name (lvectype); - gimple *pun - = gimple_build_assign (tem, build1 (VIEW_CONVERT_EXPR, - lvectype, vec_oprnd)); - vect_finish_stmt_generation (vinfo, stmt_info, pun, gsi); - vec_oprnd = tem; + vec_oprnd = vec_oprnds[j]; + /* Pun the vector to extract from if necessary. */ + if (lvectype != vectype) + { + tree tem = make_ssa_name (lvectype); + tree cvt + = build1 (VIEW_CONVERT_EXPR, lvectype, vec_oprnd); + gimple *pun = gimple_build_assign (tem, cvt); + vect_finish_stmt_generation (vinfo, stmt_info, pun, gsi); + vec_oprnd = tem; + } } for (i = 0; i < nstores; i++) { + if (costing_p) + { + /* Only need vector extracting when there are more + than one stores. */ + if (nstores > 1) + inside_cost + += record_stmt_cost (cost_vec, 1, vec_to_scalar, + stmt_info, 0, vect_body); + /* Take a single lane vector type store as scalar + store to avoid ICE like 110776. */ + if (VECTOR_TYPE_P (ltype) + && known_ne (TYPE_VECTOR_SUBPARTS (ltype), 1U)) + vect_get_store_cost (vinfo, stmt_info, 1, + alignment_support_scheme, + misalignment, &inside_cost, + cost_vec); + else + inside_cost + += record_stmt_cost (cost_vec, 1, scalar_store, + stmt_info, 0, vect_body); + continue; + } tree newref, newoff; gimple *incr, *assign; tree size = TYPE_SIZE (ltype); @@ -8719,6 +8747,12 @@ vectorizable_store (vec_info *vinfo, break; } + if (costing_p && dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_model_store_cost: inside_cost = %d, " + "prologue_cost = %d .\n", + inside_cost, prologue_cost); + return true; } From patchwork Thu Sep 14 03:11:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 139271 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp80025vqi; Wed, 13 Sep 2023 20:15:00 -0700 (PDT) X-Google-Smtp-Source: 
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com
Subject: [PATCH 06/10] vect: Adjust vectorizable_store costing on VMAT_LOAD_STORE_LANES
Date: Wed, 13 Sep 2023 22:11:55 -0500
Message-Id: <048c90cf62145799aa31e3ca4edd6f7adc911a6c.1694657494.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
MIME-Version: 1.0
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Kewen Lin via Gcc-patches
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling for VMAT_LOAD_STORE_LANES in
function vectorizable_store.  We no longer call function
vect_model_store_cost for it.  Since this is the interleaving-store
case, costing skips all stmts except first_stmt_info and considers
the whole group when costing first_stmt_info.  This patch shouldn't
introduce any functional changes.
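For readers following the series, the group-level costing described above can be sketched roughly as follows. This is a standalone toy model, not GCC code: DefKind, GroupCost, cost_store_lanes_group and the unit costs of 1 are hypothetical names and values chosen for illustration, and the real costs come from record_stmt_cost and vect_get_store_cost with target-specific values.

```cpp
#include <vector>

// Hypothetical stand-in for the def kinds checked through vect_is_simple_use.
enum class DefKind { Internal, Constant, External };

// Toy accumulator mirroring the inside_cost / prologue_cost pair in the patch.
struct GroupCost {
  unsigned inside = 0;
  unsigned prologue = 0;
};

// Unit costs are assumed to be 1 here; the real values come from the target.
constexpr unsigned kVectorStoreCost = 1;
constexpr unsigned kScalarToVecCost = 1;

// Cost a whole interleaving group once, as if called on first_stmt_info only.
// group_defs holds the def kind of each group member's stored value.
GroupCost cost_store_lanes_group(const std::vector<DefKind>& group_defs,
                                 unsigned ncopies, unsigned vec_num) {
  GroupCost cost;
  for (unsigned j = 0; j < ncopies; ++j) {
    if (j == 0) {
      // Invariant (constant or external) operands need one scalar_to_vec
      // in the prologue, counted once per group member.
      for (DefKind kind : group_defs)
        if (kind == DefKind::Constant || kind == DefKind::External)
          cost.prologue += kScalarToVecCost;
    }
    // Each copy is costed as vec_num separate vector stores.
    for (unsigned i = 0; i < vec_num; ++i)
      cost.inside += kVectorStoreCost;
  }
  return cost;
}
```

Under these assumptions, a group of four stores with two invariant operands, ncopies = 2 and vec_num = 4 yields an inside cost of 8 and a prologue cost of 2, all attributed to the first stmt of the group.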
gcc/ChangeLog: * tree-vect-stmts.cc (vect_model_store_cost): Assert it will never get VMAT_LOAD_STORE_LANES. (vectorizable_store): Adjust the cost handling on VMAT_LOAD_STORE_LANES without calling vect_model_store_cost. Factor out new lambda function update_prologue_cost. --- gcc/tree-vect-stmts.cc | 110 ++++++++++++++++++++++++++++------------- 1 file changed, 75 insertions(+), 35 deletions(-) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 3d01168080a..fbd16b8a487 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -966,7 +966,8 @@ vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies, { gcc_assert (memory_access_type != VMAT_GATHER_SCATTER && memory_access_type != VMAT_ELEMENTWISE - && memory_access_type != VMAT_STRIDED_SLP); + && memory_access_type != VMAT_STRIDED_SLP + && memory_access_type != VMAT_LOAD_STORE_LANES); unsigned int inside_cost = 0, prologue_cost = 0; stmt_vec_info first_stmt_info = stmt_info; bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info); @@ -8408,7 +8409,8 @@ vectorizable_store (vec_info *vinfo, if (grouped_store && !slp && first_stmt_info != stmt_info - && memory_access_type == VMAT_ELEMENTWISE) + && (memory_access_type == VMAT_ELEMENTWISE + || memory_access_type == VMAT_LOAD_STORE_LANES)) return true; } gcc_assert (memory_access_type == STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)); @@ -8479,6 +8481,31 @@ vectorizable_store (vec_info *vinfo, dump_printf_loc (MSG_NOTE, vect_location, "transform store. ncopies = %d\n", ncopies); + /* Check if we need to update prologue cost for invariant, + and update it accordingly if so. If it's not for + interleaving store, we can just check vls_type; but if + it's for interleaving store, need to check the def_type + of the stored value since the current vls_type is just + for first_stmt_info. 
*/ + auto update_prologue_cost = [&](unsigned *prologue_cost, tree store_rhs) + { + gcc_assert (costing_p); + if (slp) + return; + if (grouped_store) + { + gcc_assert (store_rhs); + enum vect_def_type cdt; + gcc_assert (vect_is_simple_use (store_rhs, vinfo, &cdt)); + if (cdt != vect_constant_def && cdt != vect_external_def) + return; + } + else if (vls_type != VLS_STORE_INVARIANT) + return; + *prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec, stmt_info, + 0, vect_prologue); + }; + if (memory_access_type == VMAT_ELEMENTWISE || memory_access_type == VMAT_STRIDED_SLP) { @@ -8646,14 +8673,8 @@ vectorizable_store (vec_info *vinfo, if (!costing_p) vect_get_vec_defs (vinfo, next_stmt_info, slp_node, ncopies, op, &vec_oprnds); - else if (!slp) - { - enum vect_def_type cdt; - gcc_assert (vect_is_simple_use (op, vinfo, &cdt)); - if (cdt == vect_constant_def || cdt == vect_external_def) - prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec, - stmt_info, 0, vect_prologue); - } + else + update_prologue_cost (&prologue_cost, op); unsigned int group_el = 0; unsigned HOST_WIDE_INT elsz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (vectype))); @@ -8857,13 +8878,7 @@ vectorizable_store (vec_info *vinfo, if (memory_access_type == VMAT_LOAD_STORE_LANES) { gcc_assert (!slp && grouped_store); - if (costing_p) - { - vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - alignment_support_scheme, misalignment, - vls_type, slp_node, cost_vec); - return true; - } + unsigned inside_cost = 0, prologue_cost = 0; for (j = 0; j < ncopies; j++) { gimple *new_stmt; @@ -8879,29 +8894,39 @@ vectorizable_store (vec_info *vinfo, DR_GROUP_SIZE is the exact number of stmts in the chain. Therefore, NEXT_STMT_INFO can't be NULL_TREE. 
*/ op = vect_get_store_rhs (next_stmt_info); - vect_get_vec_defs_for_operand (vinfo, next_stmt_info, ncopies, - op, gvec_oprnds[i]); - vec_oprnd = (*gvec_oprnds[i])[0]; - dr_chain.quick_push (vec_oprnd); + if (costing_p) + update_prologue_cost (&prologue_cost, op); + else + { + vect_get_vec_defs_for_operand (vinfo, next_stmt_info, + ncopies, op, + gvec_oprnds[i]); + vec_oprnd = (*gvec_oprnds[i])[0]; + dr_chain.quick_push (vec_oprnd); + } next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info); } - if (mask) + + if (!costing_p) { - vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies, - mask, &vec_masks, - mask_vectype); - vec_mask = vec_masks[0]; - } + if (mask) + { + vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies, + mask, &vec_masks, + mask_vectype); + vec_mask = vec_masks[0]; + } - /* We should have catched mismatched types earlier. */ - gcc_assert ( - useless_type_conversion_p (vectype, TREE_TYPE (vec_oprnd))); - dataref_ptr - = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type, - NULL, offset, &dummy, gsi, - &ptr_incr, false, bump); + /* We should have catched mismatched types earlier. */ + gcc_assert ( + useless_type_conversion_p (vectype, TREE_TYPE (vec_oprnd))); + dataref_ptr + = vect_create_data_ref_ptr (vinfo, first_stmt_info, + aggr_type, NULL, offset, &dummy, + gsi, &ptr_incr, false, bump); + } } - else + else if (!costing_p) { gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo)); /* DR_CHAIN is then used as an input to @@ -8917,6 +8942,15 @@ vectorizable_store (vec_info *vinfo, stmt_info, bump); } + if (costing_p) + { + for (i = 0; i < vec_num; i++) + vect_get_store_cost (vinfo, stmt_info, 1, + alignment_support_scheme, misalignment, + &inside_cost, cost_vec); + continue; + } + /* Get an array into which we can store the individual vectors. 
*/ tree vec_array = create_vector_array (vectype, vec_num); @@ -9003,6 +9037,12 @@ vectorizable_store (vec_info *vinfo, STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); } + if (costing_p && dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_model_store_cost: inside_cost = %d, " + "prologue_cost = %d .\n", + inside_cost, prologue_cost); + return true; }

From patchwork Thu Sep 14 03:11:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 139277
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com
Subject: [PATCH 07/10] vect: Adjust vectorizable_store costing on VMAT_CONTIGUOUS_PERMUTE
Date: Wed, 13 Sep 2023 22:11:56 -0500
Message-Id: <03074b183ea6c016691e6174a331de1443bdf326.1694657494.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Kewen Lin via Gcc-patches
From: "Kewen.Lin"
Reply-To: Kewen Lin

This patch adjusts the cost handling for VMAT_CONTIGUOUS_PERMUTE in
function vectorizable_store.  We no longer call function
vect_model_store_cost for it.  Since this is the interleaving-store
case, costing skips all stmts except first_stmt_info and considers
the whole group when costing first_stmt_info.  This patch shouldn't
introduce any functional changes.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vect_model_store_cost): Assert it will never
	get VMAT_CONTIGUOUS_PERMUTE and remove VMAT_CONTIGUOUS_PERMUTE related
	handlings.
	(vectorizable_store): Adjust the cost handling on
	VMAT_CONTIGUOUS_PERMUTE without calling vect_model_store_cost.
---
 gcc/tree-vect-stmts.cc | 128 ++++++++++++++++++++++++-----------------
 1 file changed, 74 insertions(+), 54 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index fbd16b8a487..e3ba8077091 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -967,10 +967,10 @@ vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies, gcc_assert (memory_access_type != VMAT_GATHER_SCATTER && memory_access_type != VMAT_ELEMENTWISE && memory_access_type != VMAT_STRIDED_SLP - && memory_access_type != VMAT_LOAD_STORE_LANES); + && memory_access_type != VMAT_LOAD_STORE_LANES + && memory_access_type != VMAT_CONTIGUOUS_PERMUTE); + unsigned int inside_cost = 0, prologue_cost = 0; - stmt_vec_info first_stmt_info = stmt_info; - bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info); /* ??? Somehow we need to fix this at the callers.
*/ if (slp_node) @@ -983,35 +983,6 @@ vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies, stmt_info, 0, vect_prologue); } - /* Grouped stores update all elements in the group at once, - so we want the DR for the first statement. */ - if (!slp_node && grouped_access_p) - first_stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info); - - /* True if we should include any once-per-group costs as well as - the cost of the statement itself. For SLP we only get called - once per group anyhow. */ - bool first_stmt_p = (first_stmt_info == stmt_info); - - /* We assume that the cost of a single store-lanes instruction is - equivalent to the cost of DR_GROUP_SIZE separate stores. If a grouped - access is instead being provided by a permute-and-store operation, - include the cost of the permutes. */ - if (first_stmt_p - && memory_access_type == VMAT_CONTIGUOUS_PERMUTE) - { - /* Uses a high and low interleave or shuffle operations for each - needed permute. */ - int group_size = DR_GROUP_SIZE (first_stmt_info); - int nstmts = ncopies * ceil_log2 (group_size) * group_size; - inside_cost = record_stmt_cost (cost_vec, nstmts, vec_perm, - stmt_info, 0, vect_body); - - if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, - "vect_model_store_cost: strided group_size = %d .\n", - group_size); - } /* Costs of the stores. */ vect_get_store_cost (vinfo, stmt_info, ncopies, alignment_support_scheme, @@ -8408,9 +8379,7 @@ vectorizable_store (vec_info *vinfo, costing, use the first one instead. 
*/ if (grouped_store && !slp - && first_stmt_info != stmt_info - && (memory_access_type == VMAT_ELEMENTWISE - || memory_access_type == VMAT_LOAD_STORE_LANES)) + && first_stmt_info != stmt_info) return true; } gcc_assert (memory_access_type == STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)); @@ -9254,14 +9223,15 @@ vectorizable_store (vec_info *vinfo, return true; } + unsigned inside_cost = 0, prologue_cost = 0; auto_vec result_chain (group_size); auto_vec vec_oprnds; for (j = 0; j < ncopies; j++) { gimple *new_stmt; - if (j == 0 && !costing_p) + if (j == 0) { - if (slp) + if (slp && !costing_p) { /* Get vectorized arguments for SLP_NODE. */ vect_get_vec_defs (vinfo, stmt_info, slp_node, 1, op, @@ -9287,13 +9257,20 @@ vectorizable_store (vec_info *vinfo, that there is no interleaving, DR_GROUP_SIZE is 1, and only one iteration of the loop will be executed. */ op = vect_get_store_rhs (next_stmt_info); - vect_get_vec_defs_for_operand (vinfo, next_stmt_info, ncopies, - op, gvec_oprnds[i]); - vec_oprnd = (*gvec_oprnds[i])[0]; - dr_chain.quick_push (vec_oprnd); + if (costing_p + && memory_access_type == VMAT_CONTIGUOUS_PERMUTE) + update_prologue_cost (&prologue_cost, op); + else if (!costing_p) + { + vect_get_vec_defs_for_operand (vinfo, next_stmt_info, + ncopies, op, + gvec_oprnds[i]); + vec_oprnd = (*gvec_oprnds[i])[0]; + dr_chain.quick_push (vec_oprnd); + } next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info); } - if (mask) + if (mask && !costing_p) { vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies, mask, &vec_masks, @@ -9303,11 +9280,13 @@ vectorizable_store (vec_info *vinfo, } /* We should have catched mismatched types earlier. 
*/ - gcc_assert (useless_type_conversion_p (vectype, - TREE_TYPE (vec_oprnd))); + gcc_assert (costing_p + || useless_type_conversion_p (vectype, + TREE_TYPE (vec_oprnd))); bool simd_lane_access_p = STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) != 0; - if (simd_lane_access_p + if (!costing_p + && simd_lane_access_p && !loop_masks && TREE_CODE (DR_BASE_ADDRESS (first_dr_info->dr)) == ADDR_EXPR && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr_info->dr), 0)) @@ -9319,7 +9298,7 @@ vectorizable_store (vec_info *vinfo, dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr)); dataref_offset = build_int_cst (ref_type, 0); } - else + else if (!costing_p) dataref_ptr = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type, simd_lane_access_p ? loop : NULL, @@ -9347,16 +9326,46 @@ vectorizable_store (vec_info *vinfo, } new_stmt = NULL; - if (!costing_p && grouped_store) - /* Permute. */ - vect_permute_store_chain (vinfo, dr_chain, group_size, stmt_info, gsi, - &result_chain); + if (grouped_store) + { + /* Permute. 
*/ + gcc_assert (memory_access_type == VMAT_CONTIGUOUS_PERMUTE); + if (costing_p) + { + int group_size = DR_GROUP_SIZE (first_stmt_info); + int nstmts = ceil_log2 (group_size) * group_size; + inside_cost += record_stmt_cost (cost_vec, nstmts, vec_perm, + stmt_info, 0, vect_body); + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_model_store_cost: " + "strided group_size = %d .\n", + group_size); + } + else + vect_permute_store_chain (vinfo, dr_chain, group_size, stmt_info, + gsi, &result_chain); + } stmt_vec_info next_stmt_info = first_stmt_info; for (i = 0; i < vec_num; i++) { if (costing_p) - continue; + { + if (memory_access_type == VMAT_CONTIGUOUS_PERMUTE) + vect_get_store_cost (vinfo, stmt_info, 1, + alignment_support_scheme, misalignment, + &inside_cost, cost_vec); + + if (!slp) + { + next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info); + if (!next_stmt_info) + break; + } + + continue; + } unsigned misalign; unsigned HOST_WIDE_INT align; @@ -9540,9 +9549,20 @@ vectorizable_store (vec_info *vinfo, } if (costing_p) - vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - alignment_support_scheme, misalignment, vls_type, - slp_node, cost_vec); + { + if (memory_access_type == VMAT_CONTIGUOUS_PERMUTE) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_model_store_cost: inside_cost = %d, " + "prologue_cost = %d .\n", + inside_cost, prologue_cost); + } + else + vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, + alignment_support_scheme, misalignment, vls_type, + slp_node, cost_vec); + } return true; } From patchwork Thu Sep 14 03:11:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 139275 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp80590vqi; Wed, 13 Sep 2023 20:16:30 -0700 (PDT) X-Google-Smtp-Source: 
AGHT+IElfx0Brl73xu6FLGyrOL2aWbN0pCSGEVqsuZ/hVPW3WmrLehC2oVQ9syacdeHEPy/PfBEA X-Received: by 2002:a05:6512:6c9:b0:4fd:d213:dfd0 with SMTP id u9-20020a05651206c900b004fdd213dfd0mr4965905lff.11.1694661390279; Wed, 13 Sep 2023 20:16:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694661390; cv=none; d=google.com; s=arc-20160816; b=lZyh4Oz814/uRNHX+HgdhETIiUXaDTJ+GZyKClfMEYUqsjeuicCwqN8mK00KrF3HqM UftPhS/kcd2rL2fZrNT/8/Rv3jgcreg86vO6p516tEsNii81vnT2ElHKvjmd4cJNEclx AZv2QM+u38jxZGg3PBw2gets6h9b0tIUsqbBEFWi/uo4vJ9ZfZm2QSf5kKyl4h8PpwbM he45EJwe0kKC+OICucgCe7y6sBeRZy7L/AChobFs+Nxy41uMjy4ptphIOgovTZROWfNh pQRdQusGl4ZLnOGVQRznSB6CIA9EYNpVNZ9J3MRhHyfUTv15lnXfDb8B/741YShlHpD2 KX4Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=aM6exNuyL4wstzLrtxZ3GzqDd9LIaDzDSWBtBTG+skU=; fh=se0EChbRuDzWeQLxh4ma2F8om/5/For7fEVL6Npc7+A=; b=grVSr7DDn1CPnBrCUDfWycTi2+dvtYDAfUbTgtHfKFbSOGMyrId+iKczlt3yHoAzhn NGkMWVC6psnV86CzR02itYcV00ZkEGw7leGIp69ma40GemjMe0QLnK53DU5bOMH9wC/G aMuZw4mBvXHzbO9LGQ5OFQCntk3PhI92JXbZtgKupehYksLB9JUIyf4SLOcXwBFojtd7 NBHUUsDRs/gEmdC1L18v35RGGU+mGwLIRNGD5m/cRIUGhbqWDl5ZOETjttMzlYdbupFI tLqqVTQEEcVqjyKqakGaG3TrOsUX+eZM0EUFIgMMVJtiV5Ed6x/7Moe5moBZHh71A9NX KnRQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=qkcbRGws; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id j21-20020a1709064b5500b00988afd7e58asi437221ejv.350.2023.09.13.20.16.30 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Sep 2023 20:16:30 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=qkcbRGws; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 51163382E692 for ; Thu, 14 Sep 2023 03:14:07 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 51163382E692 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1694661247; bh=aM6exNuyL4wstzLrtxZ3GzqDd9LIaDzDSWBtBTG+skU=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=qkcbRGwsE2QAD2n+OOp0fzaOJN4omRGJVDnBDx1lsLQy/YAGtol/H0+NMzpJfTjxP UTA9OHigYvDxuHpmz3tHdipYD9EJVwDx2FRI4SQBPKxYx3KCp3+M3BamY+yqisjiA5 IQ+GtoAG/+FId4NxiY2FYyLO4Vx3AEO2TmFv3fpk= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 1377B3857342 for ; Thu, 14 Sep 2023 03:12:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1377B3857342 Received: from pps.filterd (m0353725.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 38E38UG1009801; Thu, 14 Sep 2023 03:12:13 GMT Received: from pps.reinject (localhost [127.0.0.1]) by 
mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3t3smp8ff6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Sep 2023 03:12:13 +0000 Received: from m0353725.ppops.net (m0353725.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 38E3AKYx017675; Thu, 14 Sep 2023 03:12:12 GMT Received: from ppma23.wdc07v.mail.ibm.com (5d.69.3da9.ip4.static.sl-reverse.com [169.61.105.93]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3t3smp8fey-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Sep 2023 03:12:12 +0000 Received: from pps.filterd (ppma23.wdc07v.mail.ibm.com [127.0.0.1]) by ppma23.wdc07v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 38E37MhL002755; Thu, 14 Sep 2023 03:12:12 GMT Received: from smtprelay05.fra02v.mail.ibm.com ([9.218.2.225]) by ppma23.wdc07v.mail.ibm.com (PPS) with ESMTPS id 3t14hm7q3x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Sep 2023 03:12:12 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay05.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 38E3CA8B24445606 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 14 Sep 2023 03:12:10 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 27DB42004B; Thu, 14 Sep 2023 03:12:10 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5DAC320040; Thu, 14 Sep 2023 03:12:09 +0000 (GMT) Received: from trout.aus.stglabs.ibm.com (unknown [9.40.194.100]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Thu, 14 Sep 2023 03:12:09 +0000 (GMT) To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, richard.sandiford@arm.com Subject: [PATCH/RFC 08/10] aarch64: Don't use CEIL for vector_store in aarch64_stp_sequence_cost Date: Wed, 13 Sep 2023 22:11:57 -0500 Message-Id: X-Mailer: 
git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: ghYjvqLm2oR4ipJNVh3E3aSbxyYV8aSp X-Proofpoint-GUID: 91DDTUS805zRxtpW6HV-A0AIV2XXbB_Y X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.601,FMLib:17.11.176.26 definitions=2023-09-13_19,2023-09-13_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 bulkscore=0 clxscore=1015 spamscore=0 priorityscore=1501 impostorscore=0 mlxlogscore=613 adultscore=0 malwarescore=0 mlxscore=0 phishscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2308100000 definitions=main-2309140025 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Kewen Lin via Gcc-patches From: "Kewen.Lin" Reply-To: Kewen Lin Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1776981261948631962 X-GMAIL-MSGID: 1776981261948631962 This costing adjustment patch series exposes one issue in aarch64 specific costing adjustment for STP sequence. 
It causes the test cases below to fail:

  - gcc/testsuite/gcc.target/aarch64/ldp_stp_15.c
  - gcc/testsuite/gcc.target/aarch64/ldp_stp_16.c
  - gcc/testsuite/gcc.target/aarch64/ldp_stp_17.c
  - gcc/testsuite/gcc.target/aarch64/ldp_stp_18.c

Take the function below, extracted from ldp_stp_15.c, as an example:

void
dup_8_int32_t (int32_t *x, int32_t val)
{
  for (int i = 0; i < 8; ++i)
    x[i] = val;
}

Without my patch series, during slp1 it gets:

  val_8(D) 2 times unaligned_store (misalign -1) costs 2 in body
  node 0x10008c85e38 1 times scalar_to_vec costs 1 in prologue

and the final vector cost is 3.

With my patch series, during slp1 it gets:

  val_8(D) 1 times unaligned_store (misalign -1) costs 1 in body
  val_8(D) 1 times unaligned_store (misalign -1) costs 1 in body
  node 0x10004cc5d88 1 times scalar_to_vec costs 1 in prologue

but the final vector cost is 17.  The total unaligned_store count is
unchanged, yet the final vector costs differ because of the aarch64
special handling below:

  /* Apply the heuristic described above m_stp_sequence_cost.  */
  if (m_stp_sequence_cost != ~0U)
    {
      uint64_t cost = aarch64_stp_sequence_cost (count, kind,
                                                 stmt_info, vectype);
      m_stp_sequence_cost = MIN (m_stp_sequence_cost + cost, ~0U);
    }

For the former, since the count is 2, aarch64_stp_sequence_cost
returns 2, computed as "CEIL (count, 2) * 2".  For the latter, the
costing is split into two calls with count 1, and
aarch64_stp_sequence_cost returns 2 each time, so 4 in total.  In
this case the stmt with scalar_to_vec also contributes 4 to
m_stp_sequence_cost, so the final m_stp_sequence_cost is 6 (2+4)
vs. 8 (4+4).
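The effect of splitting one call into two can be checked in isolation.
The sketch below is a minimal stand-alone model, not the GCC code:
ceil_div and store_cost_with_ceil are hypothetical names that only
mirror the quoted expression "CEIL (count, 2) * 2", and the
contribution of 4 from the scalar_to_vec stmt is taken from the numbers
above as a constant.

```cpp
#include <cstdint>

// Hypothetical helper mirroring a round-up division such as GCC's CEIL.
constexpr std::uint64_t
ceil_div (std::uint64_t x, std::uint64_t y)
{
  return (x + y - 1) / y;
}

// Hypothetical model of the value returned for the store being costed,
// i.e. "CEIL (count, 2) * 2".  This is not the full
// aarch64_stp_sequence_cost function.
constexpr std::uint64_t
store_cost_with_ceil (std::uint64_t count)
{
  return ceil_div (count, 2) * 2;
}

// Contribution of the scalar_to_vec stmt, taken from the example above.
constexpr std::uint64_t scalar_to_vec_contribution = 4;

// Without the series: one call with count 2.
constexpr std::uint64_t stp_cost_one_call
  = store_cost_with_ceil (2) + scalar_to_vec_contribution;

// With the series: two calls with count 1 each.
constexpr std::uint64_t stp_cost_two_calls
  = store_cost_with_ceil (1) + store_cost_with_ceil (1)
    + scalar_to_vec_contribution;

static_assert (stp_cost_one_call == 6, "one call: 2 + 4");
static_assert (stp_cost_two_calls == 8, "two calls: 4 + 4");
```

Rounding each call up to an even number is not additive, so the
accumulated value depends on how the same stores are split across
calls.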
Now consider that scalar_costs->m_stp_sequence_cost is 8, together
with the check and re-assignment below:

  else if (m_stp_sequence_cost >= scalar_costs->m_stp_sequence_cost)
    m_costs[vect_body] = 2 * scalar_costs->total_cost ();

For the former, the vector body cost isn't changed; for the latter,
the vector body cost is set to double the scalar cost, which is 8 in
this case, so it becomes 16, which is bigger than what we expect.

I'm not sure why CEIL is used for the return value of the
unaligned_store case in aarch64_stp_sequence_cost, but I tried
replacing it with "return count;" (which gets back to the previous
cost), and regression testing exposed no failures.  I expect that if
the previous unaligned_store count is even, this adjustment doesn't
change anything; if it's odd, the adjustment may reduce the returned
cost by one, but I'd guess such cases are few.  Besides, as the
comments for m_stp_sequence_cost say, the current handling seems
temporary, so maybe a tweak like this can be accepted.  I'm posting
this RFC/PATCH to request comments; only this one-line change is
considered here.

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_stp_sequence_cost): Return
	count directly instead of the adjusted value computed with CEIL.
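
The final costs described in the message can be reproduced with a small
stand-alone model.  This is a sketch under stated assumptions, not the
aarch64 implementation: vector_body_cost is a hypothetical function
that models only the quoted "else if" branch and ignores the rest of
that chain, and all numbers (scalar cost 8, vector body cost 2,
prologue cost 1, contribution of 4 from scalar_to_vec) come from the
dup_8_int32_t example.

```cpp
#include <cstdint>

// Hypothetical model of the quoted check: when the vector STP sequence
// cost is not lower than the scalar one, the vector body cost is
// replaced by twice the scalar total cost.
constexpr std::uint64_t
vector_body_cost (std::uint64_t vec_stp, std::uint64_t scalar_stp,
                  std::uint64_t body, std::uint64_t scalar_total)
{
  return vec_stp >= scalar_stp ? 2 * scalar_total : body;
}

// Numbers taken from the example in the message.
constexpr std::uint64_t scalar_stp_cost = 8;
constexpr std::uint64_t scalar_total_cost = 8;
constexpr std::uint64_t vec_body_cost = 2;
constexpr std::uint64_t vec_prologue_cost = 1;

// One call with count 2 and CEIL: STP sequence cost 2 + 4 = 6.
constexpr std::uint64_t total_one_call
  = vector_body_cost (2 + 4, scalar_stp_cost, vec_body_cost,
                      scalar_total_cost)
    + vec_prologue_cost;

// Two calls with count 1 and CEIL: STP sequence cost 2 + 2 + 4 = 8.
constexpr std::uint64_t total_two_calls_ceil
  = vector_body_cost (2 + 2 + 4, scalar_stp_cost, vec_body_cost,
                      scalar_total_cost)
    + vec_prologue_cost;

// Two calls with count 1 and "return count;": 1 + 1 + 4 = 6.
constexpr std::uint64_t total_two_calls_plain
  = vector_body_cost (1 + 1 + 4, scalar_stp_cost, vec_body_cost,
                      scalar_total_cost)
    + vec_prologue_cost;

static_assert (total_one_call == 3, "cost before the series");
static_assert (total_two_calls_ceil == 17, "cost with the series and CEIL");
static_assert (total_two_calls_plain == 3, "cost with return count");
```

With "return count;" the split calls accumulate to the same STP
sequence cost as the single call did, so the check no longer fires for
this example.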
--- gcc/config/aarch64/aarch64.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 37d414021ca..9fb4fbd883d 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -17051,7 +17051,7 @@ aarch64_stp_sequence_cost (unsigned int count, vect_cost_for_stmt kind, if (!aarch64_aligned_constant_offset_p (stmt_info, size)) return count * 2; } - return CEIL (count, 2) * 2; + return count; case scalar_store: if (stmt_info && STMT_VINFO_DATA_REF (stmt_info)) From patchwork Thu Sep 14 03:11:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 139279 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp81218vqi; Wed, 13 Sep 2023 20:18:03 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE61akwXdTsaVKZvzqex5acUBucdr+k0cuV3gS95v3Xz2C7TvCx8INdZlPg/lU1EZb8HBeo X-Received: by 2002:a17:906:ef8b:b0:9a5:cab0:b061 with SMTP id ze11-20020a170906ef8b00b009a5cab0b061mr3549157ejb.51.1694661483050; Wed, 13 Sep 2023 20:18:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694661483; cv=none; d=google.com; s=arc-20160816; b=Zg6gphRCripOebQCyGu4ocs848crl6iIVOTdkvQ+JEoWfdR32T4mxncZRxKsmAI401 R6tNNZeOWB7X+/EHZpkQuMjZ4w0y16Z5+cgBjwQ+wcH8Dl7kENbxB7Mg4kPHelt7esAo BXMYFRxeu4uh8BnLXpE7d4inRzXyLQ61aL2+i7X0NJ/3cdHDc96Cw/IpgTWFdm2309TB fPx1zNS8ax3EkB25SHQBw8wIaCbNYwavwq7Tq3R9vxuqTUkhYzQlcoXkBtU3P9BqlLfN HHE0nwM1SeVGLlYYd4I8+yODb/NYPHAN9S/B/E8V6Qi6NGDU1qFDGNwMPagHA/RRsJJL PJsA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; 
bh=6l68TugUHT9n0p00kuhrHFYdDmKgcJC1VWtgmGj7kuc=; fh=se0EChbRuDzWeQLxh4ma2F8om/5/For7fEVL6Npc7+A=; b=XjRVWUsvoQZ0YddPP/gVWG2LswGGYFzMOCWZcwLPjQdlrKsmZZaSPdg3UTfvXqh/ew 8+jPnQZkQlKxQNmt5todnrwsLD5e4uETkpj5SCjsih8kaFvPfT0x7u6BHgwWPNNccvgU HeEtxPO7kdh3Ho1p6ruZZOubc+vx+UPJI27FH4n3oNV8i+qoJVUaZy50AcE1ENIPyZXr iJVT0kWv3XiqRawYAWre/YnWhiiY7VbxHLByl8qxCse2/OMjArr23GOvwgt73ankXvx8 Evl/38MkWVo85CWLazymZ+4HC8qleB8BLh0rW2q2TRrfBfRqpnKczwap/go9M+/6JB71 Eu0A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=R1ctldg1; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id dk6-20020a170906f0c600b0099d61f05e05si414535ejb.1023.2023.09.13.20.18.02 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Sep 2023 20:18:03 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=R1ctldg1; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 179DC3882107 for ; Thu, 14 Sep 2023 03:15:19 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 179DC3882107 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1694661319; 
bh=6l68TugUHT9n0p00kuhrHFYdDmKgcJC1VWtgmGj7kuc=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=R1ctldg1YhaJEs9vjGhhsxXEC6260gHqaHZn7crQADeZVjy0TSmoUsUxch3kEEphB ySU8gLA/Nrw1pK6eNAZxp8KiYL+RIqpnxJIV/IWCfZJL+ws10KtBx+DBdjoCbwRPKB frMQe9Hq2t+a+hJeEySmIFxsc6JnpO15QYy68wj4= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 1E111385842A for ; Thu, 14 Sep 2023 03:12:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1E111385842A Received: from pps.filterd (m0360083.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 38E39boq021124; Thu, 14 Sep 2023 03:12:14 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3t3s628wmc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Sep 2023 03:12:14 +0000 Received: from m0360083.ppops.net (m0360083.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 38E3AIjV023165; Thu, 14 Sep 2023 03:12:14 GMT Received: from ppma13.dal12v.mail.ibm.com (dd.9e.1632.ip4.static.sl-reverse.com [50.22.158.221]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3t3s628wm4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Sep 2023 03:12:13 +0000 Received: from pps.filterd (ppma13.dal12v.mail.ibm.com [127.0.0.1]) by ppma13.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 38E0S9kT002352; Thu, 14 Sep 2023 03:12:13 GMT Received: from smtprelay06.fra02v.mail.ibm.com ([9.218.2.230]) by ppma13.dal12v.mail.ibm.com (PPS) with ESMTPS id 3t158kffj1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Sep 2023 03:12:12 +0000 Received: from smtpav06.fra02v.mail.ibm.com 
(smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay06.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 38E3CBx044827092 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 14 Sep 2023 03:12:11 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 392CA20040; Thu, 14 Sep 2023 03:12:11 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 6EFCB2004B; Thu, 14 Sep 2023 03:12:10 +0000 (GMT) Received: from trout.aus.stglabs.ibm.com (unknown [9.40.194.100]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Thu, 14 Sep 2023 03:12:10 +0000 (GMT) To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, richard.sandiford@arm.com Subject: [PATCH 09/10] vect: Get rid of vect_model_store_cost Date: Wed, 13 Sep 2023 22:11:58 -0500 Message-Id: X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: _ljovXMqTxoxOOpH2eryTEMea9vxCyYr X-Proofpoint-ORIG-GUID: W3DzvZd6zhSwjPuWNtOU-mxbGdJGKxcq X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.601,FMLib:17.11.176.26 definitions=2023-09-13_19,2023-09-13_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 spamscore=0 mlxscore=0 bulkscore=0 suspectscore=0 lowpriorityscore=0 malwarescore=0 priorityscore=1501 impostorscore=0 clxscore=1015 phishscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2308100000 definitions=main-2309140025 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 
Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Kewen Lin via Gcc-patches From: "Kewen.Lin" Reply-To: Kewen Lin Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1776981359273686726 X-GMAIL-MSGID: 1776981359273686726

This patch finally gets rid of vect_model_store_cost.  It adjusts the
costing for the remaining memory access types
VMAT_CONTIGUOUS{, _DOWN, _REVERSE} by moving the costing close to the
transform code.  Note that vect_model_store_cost has one special
handling for vectorizing a store into the function result.  Since it's
an extra penalty and the transform part doesn't have it, this patch
keeps it as is.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vect_model_store_cost): Remove.
	(vectorizable_store): Adjust the costing for the remaining memory
	access types VMAT_CONTIGUOUS{, _DOWN, _REVERSE}.

--- gcc/tree-vect-stmts.cc | 137 +++++++++++++---------------------------- 1 file changed, 44 insertions(+), 93 deletions(-) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index e3ba8077091..3d451c80bca 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -951,81 +951,6 @@ cfun_returns (tree decl) return false; } -/* Function vect_model_store_cost - - Models cost for stores. In the case of grouped accesses, one access - has the overhead of the grouped access attributed to it.
*/ - -static void -vect_model_store_cost (vec_info *vinfo, stmt_vec_info stmt_info, int ncopies, - vect_memory_access_type memory_access_type, - dr_alignment_support alignment_support_scheme, - int misalignment, - vec_load_store_type vls_type, slp_tree slp_node, - stmt_vector_for_cost *cost_vec) -{ - gcc_assert (memory_access_type != VMAT_GATHER_SCATTER - && memory_access_type != VMAT_ELEMENTWISE - && memory_access_type != VMAT_STRIDED_SLP - && memory_access_type != VMAT_LOAD_STORE_LANES - && memory_access_type != VMAT_CONTIGUOUS_PERMUTE); - - unsigned int inside_cost = 0, prologue_cost = 0; - - /* ??? Somehow we need to fix this at the callers. */ - if (slp_node) - ncopies = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node); - - if (vls_type == VLS_STORE_INVARIANT) - { - if (!slp_node) - prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec, - stmt_info, 0, vect_prologue); - } - - - /* Costs of the stores. */ - vect_get_store_cost (vinfo, stmt_info, ncopies, alignment_support_scheme, - misalignment, &inside_cost, cost_vec); - - /* When vectorizing a store into the function result assign - a penalty if the function returns in a multi-register location. - In this case we assume we'll end up with having to spill the - vector result and do piecewise loads as a conservative estimate. */ - tree base = get_base_address (STMT_VINFO_DATA_REF (stmt_info)->ref); - if (base - && (TREE_CODE (base) == RESULT_DECL - || (DECL_P (base) && cfun_returns (base))) - && !aggregate_value_p (base, cfun->decl)) - { - rtx reg = hard_function_value (TREE_TYPE (base), cfun->decl, 0, 1); - /* ??? Handle PARALLEL in some way. */ - if (REG_P (reg)) - { - int nregs = hard_regno_nregs (REGNO (reg), GET_MODE (reg)); - /* Assume that a single reg-reg move is possible and cheap, - do not account for vector to gp register move cost. */ - if (nregs > 1) - { - /* Spill. */ - prologue_cost += record_stmt_cost (cost_vec, ncopies, - vector_store, - stmt_info, 0, vect_epilogue); - /* Loads. 
*/ - prologue_cost += record_stmt_cost (cost_vec, ncopies * nregs, - scalar_load, - stmt_info, 0, vect_epilogue); - } - } - } - - if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, - "vect_model_store_cost: inside_cost = %d, " - "prologue_cost = %d .\n", inside_cost, prologue_cost); -} - - /* Calculate cost of DR's memory access. */ void vect_get_store_cost (vec_info *, stmt_vec_info stmt_info, int ncopies, @@ -9223,6 +9148,11 @@ vectorizable_store (vec_info *vinfo, return true; } + gcc_assert (memory_access_type == VMAT_CONTIGUOUS + || memory_access_type == VMAT_CONTIGUOUS_DOWN + || memory_access_type == VMAT_CONTIGUOUS_PERMUTE + || memory_access_type == VMAT_CONTIGUOUS_REVERSE); + unsigned inside_cost = 0, prologue_cost = 0; auto_vec result_chain (group_size); auto_vec vec_oprnds; @@ -9257,10 +9187,9 @@ vectorizable_store (vec_info *vinfo, that there is no interleaving, DR_GROUP_SIZE is 1, and only one iteration of the loop will be executed. */ op = vect_get_store_rhs (next_stmt_info); - if (costing_p - && memory_access_type == VMAT_CONTIGUOUS_PERMUTE) + if (costing_p) update_prologue_cost (&prologue_cost, op); - else if (!costing_p) + else { vect_get_vec_defs_for_operand (vinfo, next_stmt_info, ncopies, op, @@ -9352,10 +9281,9 @@ vectorizable_store (vec_info *vinfo, { if (costing_p) { - if (memory_access_type == VMAT_CONTIGUOUS_PERMUTE) - vect_get_store_cost (vinfo, stmt_info, 1, - alignment_support_scheme, misalignment, - &inside_cost, cost_vec); + vect_get_store_cost (vinfo, stmt_info, 1, + alignment_support_scheme, misalignment, + &inside_cost, cost_vec); if (!slp) { @@ -9550,18 +9478,41 @@ vectorizable_store (vec_info *vinfo, if (costing_p) { - if (memory_access_type == VMAT_CONTIGUOUS_PERMUTE) - { - if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, - "vect_model_store_cost: inside_cost = %d, " - "prologue_cost = %d .\n", - inside_cost, prologue_cost); + /* When vectorizing a store into the function result assign + a penalty 
if the function returns in a multi-register location. + In this case we assume we'll end up with having to spill the + vector result and do piecewise loads as a conservative estimate. */ + tree base = get_base_address (STMT_VINFO_DATA_REF (stmt_info)->ref); + if (base + && (TREE_CODE (base) == RESULT_DECL + || (DECL_P (base) && cfun_returns (base))) + && !aggregate_value_p (base, cfun->decl)) + { + rtx reg = hard_function_value (TREE_TYPE (base), cfun->decl, 0, 1); + /* ??? Handle PARALLEL in some way. */ + if (REG_P (reg)) + { + int nregs = hard_regno_nregs (REGNO (reg), GET_MODE (reg)); + /* Assume that a single reg-reg move is possible and cheap, + do not account for vector to gp register move cost. */ + if (nregs > 1) + { + /* Spill. */ + prologue_cost + += record_stmt_cost (cost_vec, ncopies, vector_store, + stmt_info, 0, vect_epilogue); + /* Loads. */ + prologue_cost + += record_stmt_cost (cost_vec, ncopies * nregs, scalar_load, + stmt_info, 0, vect_epilogue); + } + } } - else - vect_model_store_cost (vinfo, stmt_info, ncopies, memory_access_type, - alignment_support_scheme, misalignment, vls_type, - slp_node, cost_vec); + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_model_store_cost: inside_cost = %d, " + "prologue_cost = %d .\n", + inside_cost, prologue_cost); } return true; From patchwork Thu Sep 14 03:11:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 139280 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp81438vqi; Wed, 13 Sep 2023 20:18:39 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGndFUfjeOTTmfBUW6Cx/R7m/5nyCQY9sAX7RDdA5cOwSp1VYJGxFnA61QKxU4KoP50loSU X-Received: by 2002:a17:906:3089:b0:9ad:b046:bc50 with SMTP id 9-20020a170906308900b009adb046bc50mr794341ejv.10.1694661518860; Wed, 13 Sep 2023 20:18:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694661518; 
cv=none; d=google.com; s=arc-20160816; b=zFqcbgI0WtxJZom7LplpFamFw+RRJzoX4HYM3LIWq8TLJp6DMx7utHZ365vWepzVIs KWYuyGs1dhPqfa8um7mJiXzctaYJuTm8cZ75Ox4Ysvl3vgGrb+yHK88kE1ilffAJ7XQa pt/t3kRX9xVHj/AYyEutKn8IbBn3jG+kzRe3koaOrPLPc6MY+n2Lk8mm2jDa6I9ccI9s 8PPHei9FEs27GohIXSs17E6fMvS+HyK/0UbPyYav9cYt+nhVGCiRpVXDBk658McDBqav coSU3TZZbGT8IjzMcUmZO1i/yEtn3umcAoIv+1/Qy575zmowgOu6bIriau+SZJOTEnXW M9RA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=p3m/dVg9mRJnfmR1+w8o2tq3treyr0bLr54ohXJdDqM=; fh=se0EChbRuDzWeQLxh4ma2F8om/5/For7fEVL6Npc7+A=; b=x+VfPeyWyCvkOsFLtNVTxlURMpyBH+Apx/huplVK86USeQqO/A5L0QqjpysSvA/am3 CqLq5zOMv1vOknvMNVnI5XfDrh2qActZTrGbTOF0AjvXOA8pScHe6TGDF5C7NOULVYzT spjQUWtVuGRxWKXiHjJmg4LQKY/y7XK9ufb6DDjpWvpyqGGWddBO5LpHOCJ60hxhFjQC TRYM/00QEFDKSpnA6xNFtILNbBgzY7NaipZKzKTU/4ki1Y5EQQr4HpGdOL6ICYsqPWzI mKF5LJJ5d/V67rh3UpQ9337U0HYoIvqu2vWcQxJ6fChTXsrBLeY3EGRFVw8HI5SFAuGo VFsw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ri1xH4DN; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id mh14-20020a170906eb8e00b0098d373fa9e4si493583ejb.1007.2023.09.13.20.18.38 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Sep 2023 20:18:38 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ri1xH4DN; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 19D563882100 for ; Thu, 14 Sep 2023 03:15:58 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 19D563882100 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1694661358; bh=p3m/dVg9mRJnfmR1+w8o2tq3treyr0bLr54ohXJdDqM=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=ri1xH4DNY+psC0B7tyIKAhr/XQN1f0KGrNyNjCL7KrWOMfPiivEz3zj2Rf8dPDtW1 fv0juea+hLY4A4yv6HlqUya1iRtoD7xO5i3L1PWFUu0p9lSzt4Zkg56CyD4wGfN6W8 OKup8iEua00ZRZkDPH2YMbuzBy1fhiYSyRn55mZs= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 6E23D385843E for ; Thu, 14 Sep 2023 03:12:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6E23D385843E Received: from pps.filterd (m0353723.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 38E374NM013837; Thu, 14 Sep 2023 03:12:15 
GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3t3rabj99e-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Sep 2023 03:12:15 +0000 Received: from m0353723.ppops.net (m0353723.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 38E370Qk012491; Thu, 14 Sep 2023 03:12:14 GMT Received: from ppma12.dal12v.mail.ibm.com (dc.9e.1632.ip4.static.sl-reverse.com [50.22.158.220]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3t3rabj996-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Sep 2023 03:12:14 +0000 Received: from pps.filterd (ppma12.dal12v.mail.ibm.com [127.0.0.1]) by ppma12.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 38E1vxH6024080; Thu, 14 Sep 2023 03:12:14 GMT Received: from smtprelay07.fra02v.mail.ibm.com ([9.218.2.229]) by ppma12.dal12v.mail.ibm.com (PPS) with ESMTPS id 3t131tg7k3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Sep 2023 03:12:13 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay07.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 38E3CCoL62194174 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 14 Sep 2023 03:12:12 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 490FD2004B; Thu, 14 Sep 2023 03:12:12 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 8015220040; Thu, 14 Sep 2023 03:12:11 +0000 (GMT) Received: from trout.aus.stglabs.ibm.com (unknown [9.40.194.100]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Thu, 14 Sep 2023 03:12:11 +0000 (GMT) To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, richard.sandiford@arm.com Subject: [PATCH 10/10] vect: Consider vec_perm costing for VMAT_CONTIGUOUS_REVERSE Date: Wed, 13 Sep 
2023 22:11:59 -0500 Message-Id: <7514680ad7b9b859a054ca1a59356f58b5ac9089.1694657495.git.linkw@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: qyDYQG2df7uWknIsrpRbxOg3KonIyCDk X-Proofpoint-ORIG-GUID: rg6aGb3L8l6jg6xlMeJNyVfJVxWS22Mq X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.601,FMLib:17.11.176.26 definitions=2023-09-13_19,2023-09-13_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 adultscore=0 mlxscore=0 mlxlogscore=999 malwarescore=0 bulkscore=0 priorityscore=1501 clxscore=1015 phishscore=0 lowpriorityscore=0 suspectscore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2308100000 definitions=main-2309140025 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Kewen Lin via Gcc-patches From: "Kewen.Lin" Reply-To: Kewen Lin Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1776981396472394463 X-GMAIL-MSGID: 1776981396472394463

For VMAT_CONTIGUOUS_REVERSE, the transform code in function vectorizable_store generates a VEC_PERM_EXPR stmt before storing, but that stmt is never considered in costing. This patch makes the costing account for it as a vec_perm. It also adjusts the order of the transform code a bit, so that the costing_p path can return early.
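
For illustration only (not part of the patch): a minimal C sketch of why a reverse contiguous store needs one lane reversal per vector store, which is the kind of statement the costing is meant to account for as a vec_perm. The names VF, store_reverse_scalar and store_reverse_blocked are hypothetical, and VF = 4 is an assumption; the sketch models the access pattern by hand and does not reflect how GCC actually emits VEC_PERM_EXPR.

```c
#include <assert.h>
#include <string.h>

#define VF 4 /* Assumed vectorization factor: 4 ints per vector.  */

/* Scalar reference: the store address decreases on every iteration,
   as in the new testcase (*b = ...; b--).  */
static void
store_reverse_scalar (int *dst_last, const int *src, int len)
{
  assert (len >= 0);
  for (int i = 0; i < len; i++)
    dst_last[-i] = src[i];
}

/* Hand-written model of the vectorized form.  Each block of VF
   elements is lane-reversed (the permute) and then written with one
   contiguous store.  Returns the number of lane reversals performed,
   i.e. one per full vector store.  */
static int
store_reverse_blocked (int *dst_last, const int *src, int len)
{
  assert (len >= 0);
  int perms = 0;
  int i = 0;

  for (; i + VF <= len; i += VF)
    {
      int lanes[VF];

      /* Model of the permute: reverse the lane order.  */
      for (int k = 0; k < VF; k++)
        lanes[k] = src[i + VF - 1 - k];
      perms++;

      /* Contiguous store starting at the lowest address of the block.  */
      memcpy (dst_last - i - (VF - 1), lanes, sizeof lanes);
    }

  /* Scalar epilogue for the remaining elements.  */
  for (; i < len; i++)
    dst_last[-i] = src[i];

  return perms;
}
```

With len = 10 and VF = 4, the blocked version performs two full vector stores and therefore two lane reversals, while producing the same memory contents as the scalar loop.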
gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_store): Consider generated VEC_PERM_EXPR stmt for VMAT_CONTIGUOUS_REVERSE in costing as vec_perm. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c: New test. --- .../costmodel/ppc/costmodel-vect-store-2.c | 29 +++++++++ gcc/tree-vect-stmts.cc | 63 +++++++++++-------- 2 files changed, 65 insertions(+), 27 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c new file mode 100644 index 00000000000..72b67cf9040 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target vect_int } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-additional-options "-mvsx" } */ + +/* Verify we do cost the required vec_perm. */ + +int +foo (int *a, int *b, int len) +{ + int i; + int *a1 = a; + int *a0 = a1 - 4; + for (i = 0; i < len; i++) + { + *b = *a0 + *a1; + b--; + a0++; + a1++; + } + return 0; +} + +/* The reason why it doesn't check the exact count is that + we can get more than 1 vec_perm when it's compiled with + partial vector capability like Power10 (retrying for + the epilogue) or it's compiled without unaligned vector + memory access support (realign). */ +/* { dg-final { scan-tree-dump {\mvec_perm\M} "vect" } } */ diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 3d451c80bca..ce925cc1d53 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -9279,6 +9279,40 @@ vectorizable_store (vec_info *vinfo, stmt_vec_info next_stmt_info = first_stmt_info; for (i = 0; i < vec_num; i++) { + if (!costing_p) + { + if (slp) + vec_oprnd = vec_oprnds[i]; + else if (grouped_store) + /* For grouped stores vectorized defs are interleaved in + vect_permute_store_chain().
*/ + vec_oprnd = result_chain[i]; + } + + if (memory_access_type == VMAT_CONTIGUOUS_REVERSE) + { + if (costing_p) + inside_cost += record_stmt_cost (cost_vec, 1, vec_perm, + stmt_info, 0, vect_body); + else + { + tree perm_mask = perm_mask_for_reverse (vectype); + tree perm_dest = vect_create_destination_var ( + vect_get_store_rhs (stmt_info), vectype); + tree new_temp = make_ssa_name (perm_dest); + + /* Generate the permute statement. */ + gimple *perm_stmt + = gimple_build_assign (new_temp, VEC_PERM_EXPR, vec_oprnd, + vec_oprnd, perm_mask); + vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt, + gsi); + + perm_stmt = SSA_NAME_DEF_STMT (new_temp); + vec_oprnd = new_temp; + } + } + if (costing_p) { vect_get_store_cost (vinfo, stmt_info, 1, @@ -9294,8 +9328,6 @@ vectorizable_store (vec_info *vinfo, continue; } - unsigned misalign; - unsigned HOST_WIDE_INT align; tree final_mask = NULL_TREE; tree final_len = NULL_TREE; @@ -9315,13 +9347,8 @@ vectorizable_store (vec_info *vinfo, dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi, stmt_info, bump); - if (slp) - vec_oprnd = vec_oprnds[i]; - else if (grouped_store) - /* For grouped stores vectorized defs are interleaved in - vect_permute_store_chain(). */ - vec_oprnd = result_chain[i]; - + unsigned misalign; + unsigned HOST_WIDE_INT align; align = known_alignment (DR_TARGET_ALIGNMENT (first_dr_info)); if (alignment_support_scheme == dr_aligned) misalign = 0; @@ -9338,24 +9365,6 @@ vectorizable_store (vec_info *vinfo, misalign); align = least_bit_hwi (misalign | align); - if (memory_access_type == VMAT_CONTIGUOUS_REVERSE) - { - tree perm_mask = perm_mask_for_reverse (vectype); - tree perm_dest - = vect_create_destination_var (vect_get_store_rhs (stmt_info), - vectype); - tree new_temp = make_ssa_name (perm_dest); - - /* Generate the permute statement. 
*/ - gimple *perm_stmt - = gimple_build_assign (new_temp, VEC_PERM_EXPR, vec_oprnd, - vec_oprnd, perm_mask); - vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt, gsi); - - perm_stmt = SSA_NAME_DEF_STMT (new_temp); - vec_oprnd = new_temp; - } - /* Compute IFN when LOOP_LENS or final_mask valid. */ machine_mode vmode = TYPE_MODE (vectype); machine_mode new_vmode = vmode;