From patchwork Thu Sep 14 03:11:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 139280
From: "Kewen.Lin"
Reply-To: Kewen Lin
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, richard.sandiford@arm.com
Subject: [PATCH 10/10] vect: Consider vec_perm costing for VMAT_CONTIGUOUS_REVERSE
Date: Wed, 13 Sep 2023 22:11:59 -0500
Message-Id: <7514680ad7b9b859a054ca1a59356f58b5ac9089.1694657495.git.linkw@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list

For VMAT_CONTIGUOUS_REVERSE, the transform code in
function vectorizable_store generates a VEC_PERM_EXPR stmt before
storing, but it is never considered in costing.

This patch makes the costing account for that vec_perm.  It also
adjusts the order of the transform code a bit so that the costing_p
path can return early.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_store): Consider generated
	VEC_PERM_EXPR stmt for VMAT_CONTIGUOUS_REVERSE in costing as
	vec_perm.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c: New test.
---
 .../costmodel/ppc/costmodel-vect-store-2.c    | 29 +++++++++
 gcc/tree-vect-stmts.cc                        | 63 +++++++++++--------
 2 files changed, 65 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c
new file mode 100644
index 00000000000..72b67cf9040
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-additional-options "-mvsx" } */
+
+/* Verify we do cost the required vec_perm.  */
+
+int
+foo (int *a, int *b, int len)
+{
+  int i;
+  int *a1 = a;
+  int *a0 = a1 - 4;
+  for (i = 0; i < len; i++)
+    {
+      *b = *a0 + *a1;
+      b--;
+      a0++;
+      a1++;
+    }
+  return 0;
+}
+
+/* The reason why it doesn't check the exact count is that
+   we can get more than 1 vec_perm when it's compiled with
+   partial vector capability like Power10 (retrying for
+   the epilogue) or it's compiled without unaligned vector
+   memory access support (realign).  */
+/* { dg-final { scan-tree-dump {\mvec_perm\M} "vect" } } */
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 3d451c80bca..ce925cc1d53 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9279,6 +9279,40 @@ vectorizable_store (vec_info *vinfo,
           stmt_vec_info next_stmt_info = first_stmt_info;
           for (i = 0; i < vec_num; i++)
             {
+              if (!costing_p)
+                {
+                  if (slp)
+                    vec_oprnd = vec_oprnds[i];
+                  else if (grouped_store)
+                    /* For grouped stores vectorized defs are interleaved in
+                       vect_permute_store_chain().  */
+                    vec_oprnd = result_chain[i];
+                }
+
+              if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)
+                {
+                  if (costing_p)
+                    inside_cost += record_stmt_cost (cost_vec, 1, vec_perm,
+                                                     stmt_info, 0, vect_body);
+                  else
+                    {
+                      tree perm_mask = perm_mask_for_reverse (vectype);
+                      tree perm_dest = vect_create_destination_var (
+                        vect_get_store_rhs (stmt_info), vectype);
+                      tree new_temp = make_ssa_name (perm_dest);
+
+                      /* Generate the permute statement.  */
+                      gimple *perm_stmt
+                        = gimple_build_assign (new_temp, VEC_PERM_EXPR, vec_oprnd,
+                                               vec_oprnd, perm_mask);
+                      vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt,
+                                                   gsi);
+
+                      perm_stmt = SSA_NAME_DEF_STMT (new_temp);
+                      vec_oprnd = new_temp;
+                    }
+                }
+
               if (costing_p)
                 {
                   vect_get_store_cost (vinfo, stmt_info, 1,
@@ -9294,8 +9328,6 @@ vectorizable_store (vec_info *vinfo,
                   continue;
                 }
 
-              unsigned misalign;
-              unsigned HOST_WIDE_INT align;
               tree final_mask = NULL_TREE;
               tree final_len = NULL_TREE;
 
@@ -9315,13 +9347,8 @@ vectorizable_store (vec_info *vinfo,
                 dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi,
                                                stmt_info, bump);
 
-              if (slp)
-                vec_oprnd = vec_oprnds[i];
-              else if (grouped_store)
-                /* For grouped stores vectorized defs are interleaved in
-                   vect_permute_store_chain().  */
-                vec_oprnd = result_chain[i];
-
+              unsigned misalign;
+              unsigned HOST_WIDE_INT align;
               align = known_alignment (DR_TARGET_ALIGNMENT (first_dr_info));
               if (alignment_support_scheme == dr_aligned)
                 misalign = 0;
@@ -9338,24 +9365,6 @@ vectorizable_store (vec_info *vinfo,
                                         misalign);
               align = least_bit_hwi (misalign | align);
 
-              if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)
-                {
-                  tree perm_mask = perm_mask_for_reverse (vectype);
-                  tree perm_dest
-                    = vect_create_destination_var (vect_get_store_rhs (stmt_info),
-                                                   vectype);
-                  tree new_temp = make_ssa_name (perm_dest);
-
-                  /* Generate the permute statement.  */
-                  gimple *perm_stmt
-                    = gimple_build_assign (new_temp, VEC_PERM_EXPR, vec_oprnd,
-                                           vec_oprnd, perm_mask);
-                  vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt, gsi);
-
-                  perm_stmt = SSA_NAME_DEF_STMT (new_temp);
-                  vec_oprnd = new_temp;
-                }
-
               /* Compute IFN when LOOP_LENS or final_mask valid.  */
               machine_mode vmode = TYPE_MODE (vectype);
               machine_mode new_vmode = vmode;
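
Illustration only, not part of the patch: a rough plain-C model of why
each vector store in a VMAT_CONTIGUOUS_REVERSE access needs one lane
reversal, which is the operation the costing above records as vec_perm.
It assumes a fixed vector length of 4 ints and a trip count that is a
multiple of it.  VF, store_reverse_scalar and store_reverse_blocked are
hypothetical names made up for this sketch; they are not GCC internals
and say nothing about the code the vectorizer actually emits.

```c
#include <assert.h>

#define VF 4 /* Assumed number of int lanes per vector (hypothetical).  */

/* Scalar reference with the same shape as foo () in the new test:
   the destination pointer walks backwards.  */
static void
store_reverse_scalar (const int *a0, const int *a1, int *b, int len)
{
  for (int i = 0; i < len; i++)
    {
      *b = *a0 + *a1;
      b--;
      a0++;
      a1++;
    }
}

/* Blocked model of a vectorized body: compute VF results in forward
   order, reverse the lanes, then do one contiguous store that starts
   at the lowest address of the block.  */
static void
store_reverse_blocked (const int *a0, const int *a1, int *b, int len)
{
  assert (len % VF == 0);
  for (int i = 0; i < len; i += VF)
    {
      int vec[VF], rev[VF];

      for (int l = 0; l < VF; l++)
        vec[l] = a0[i + l] + a1[i + l];

      /* Models VEC_PERM_EXPR <vec, vec, { 3, 2, 1, 0 }>; one such
         permute is needed per vector store.  */
      for (int l = 0; l < VF; l++)
        rev[l] = vec[VF - 1 - l];

      /* Contiguous store covering b[-(i + VF - 1)] .. b[-i].  */
      int *dst = b - i - (VF - 1);
      for (int l = 0; l < VF; l++)
        dst[l] = rev[l];
    }
}
```

Both functions write the same values to the same addresses.  The blocked
form only makes the extra reversal step explicit, and that step is what
the costing path skipped before this change.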