From patchwork Wed May 17 06:15:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 95064
Message-ID: <71fda837-6a92-7f74-43e1-90b046919f6a@linux.ibm.com>
Date: Wed, 17 May 2023 14:15:00 +0800
Subject: [PATCH 2/2] vect: Enhance cost evaluation in vect_transform_slp_perm_load_1
To: GCC Patches
Cc: Richard Biener, Richard Sandiford, Segher Boessenkool, Peter Bergner
References: <72a5c5db-bc06-eded-d229-82af34342515@linux.ibm.com>
In-Reply-To: <72a5c5db-bc06-eded-d229-82af34342515@linux.ibm.com>
From: "Kewen.Lin"

Hi,

Following Richi's suggestion in [1], I'm working on deferring cost
evaluation so that it sits next to the transformation.  This patch
enhances function vect_transform_slp_perm_load_1, which could
under-cost vector permutations: the costing did not take
nvectors_per_build into account, so it was inconsistent with the
transformation part.

Bootstrapped and regtested on x86_64-redhat-linux,
aarch64-linux-gnu and powerpc64{,le}-linux-gnu.

Is it ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563624.html

BR,
Kewen
-----
gcc/ChangeLog:

        * tree-vect-slp.cc (vect_transform_slp_perm_load_1): Adjust the
        calculation on n_perms by considering nvectors_per_build.

gcc/testsuite/ChangeLog:

        * gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c: New test.
---
 .../vect/costmodel/ppc/costmodel-slp-perm.c   | 23 +++++++
 gcc/tree-vect-slp.cc                          | 66 ++++++++++---------
 2 files changed, 57 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c

--
2.39.1

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c
new file mode 100644
index 00000000000..e5c4dceddfb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* Specify power9 to ensure the vectorization is profitable
+   and test point stands, otherwise it could be not profitable
+   to vectorize.  */
+/* { dg-additional-options "-mdejagnu-cpu=power9 -mpower9-vector" } */
+
+/* Verify we cost the exact count for required vec_perm.  */
+
+int x[1024], y[1024];
+
+void
+foo ()
+{
+  for (int i = 0; i < 512; ++i)
+    {
+      x[2 * i] = y[1023 - (2 * i)];
+      x[2 * i + 1] = y[1023 - (2 * i + 1)];
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "2 times vec_perm" 1 "vect" } } */
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index e5c9d7e766e..af9a6dd4fa9 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -8115,12 +8115,12 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo, slp_tree node,
 
   mode = TYPE_MODE (vectype);
   poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
+  unsigned int nstmts = SLP_TREE_NUMBER_OF_VEC_STMTS (node);
 
   /* Initialize the vect stmts of NODE to properly insert the generated
      stmts later.  */
   if (! analyze_only)
-    for (unsigned i = SLP_TREE_VEC_STMTS (node).length ();
-         i < SLP_TREE_NUMBER_OF_VEC_STMTS (node); i++)
+    for (unsigned i = SLP_TREE_VEC_STMTS (node).length (); i < nstmts; i++)
       SLP_TREE_VEC_STMTS (node).quick_push (NULL);
 
   /* Generate permutation masks for every NODE.  Number of masks for each NODE
@@ -8161,7 +8161,10 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo, slp_tree node,
          (b) the permutes only need a single vector input.  */
       mask.new_vector (nunits, group_size, 3);
       nelts_to_build = mask.encoded_nelts ();
-      nvectors_per_build = SLP_TREE_VEC_STMTS (node).length ();
+      /* It's possible to obtain zero nstmts during analyze_only, so make
+         it at least one to ensure the later computation for n_perms
+         proceed.  */
+      nvectors_per_build = nstmts > 0 ? nstmts : 1;
       in_nlanes = DR_GROUP_SIZE (stmt_info) * 3;
     }
   else
@@ -8252,40 +8255,39 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo, slp_tree node,
                   return false;
                 }
 
-              ++*n_perms;
-
+              tree mask_vec = NULL_TREE;
               if (!analyze_only)
-                {
-                  tree mask_vec = vect_gen_perm_mask_checked (vectype, indices);
+                mask_vec = vect_gen_perm_mask_checked (vectype, indices);
 
-                  if (second_vec_index == -1)
-                    second_vec_index = first_vec_index;
+              if (second_vec_index == -1)
+                second_vec_index = first_vec_index;
 
-                  for (unsigned int ri = 0; ri < nvectors_per_build; ++ri)
+              for (unsigned int ri = 0; ri < nvectors_per_build; ++ri)
+                {
+                  ++*n_perms;
+                  if (analyze_only)
+                    continue;
+                  /* Generate the permute statement if necessary.  */
+                  tree first_vec = dr_chain[first_vec_index + ri];
+                  tree second_vec = dr_chain[second_vec_index + ri];
+                  gassign *stmt = as_a<gassign *> (stmt_info->stmt);
+                  tree perm_dest
+                    = vect_create_destination_var (gimple_assign_lhs (stmt),
+                                                   vectype);
+                  perm_dest = make_ssa_name (perm_dest);
+                  gimple *perm_stmt
+                    = gimple_build_assign (perm_dest, VEC_PERM_EXPR, first_vec,
+                                           second_vec, mask_vec);
+                  vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt,
+                                               gsi);
+                  if (dce_chain)
                     {
-                      /* Generate the permute statement if necessary.  */
-                      tree first_vec = dr_chain[first_vec_index + ri];
-                      tree second_vec = dr_chain[second_vec_index + ri];
-                      gassign *stmt = as_a<gassign *> (stmt_info->stmt);
-                      tree perm_dest
-                        = vect_create_destination_var (gimple_assign_lhs (stmt),
-                                                       vectype);
-                      perm_dest = make_ssa_name (perm_dest);
-                      gimple *perm_stmt
-                        = gimple_build_assign (perm_dest, VEC_PERM_EXPR,
-                                               first_vec, second_vec, mask_vec);
-                      vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt,
-                                                   gsi);
-                      if (dce_chain)
-                        {
-                          bitmap_set_bit (used_defs, first_vec_index + ri);
-                          bitmap_set_bit (used_defs, second_vec_index + ri);
-                        }
-
-                      /* Store the vector statement in NODE.  */
-                      SLP_TREE_VEC_STMTS (node) [vect_stmts_counter++]
-                        = perm_stmt;
+                      bitmap_set_bit (used_defs, first_vec_index + ri);
+                      bitmap_set_bit (used_defs, second_vec_index + ri);
                     }
+
+                  /* Store the vector statement in NODE.  */
+                  SLP_TREE_VEC_STMTS (node)[vect_stmts_counter++] = perm_stmt;
                 }
             }
           else if (!analyze_only)
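
For readers who want the costing change in isolation, here is a minimal
standalone sketch of the counting before and after the patch.  It is only
a model under stated assumptions: count_perms_before, count_perms_after
and masks_needing_perm are hypothetical names that do not exist in
tree-vect-slp.cc, and the sketch covers only the branch where a single
mask is reused and nvectors_per_build is derived from nstmts.

```cpp
#include <cassert>

// Hypothetical model of the permute counting.  These helpers are
// illustrative stand-ins, not functions from tree-vect-slp.cc.

// Before the patch: n_perms was bumped once per built mask that
// needs a permute, regardless of how many statements reuse it.
unsigned
count_perms_before (unsigned masks_needing_perm)
{
  return masks_needing_perm;
}

// After the patch: n_perms is bumped inside the per-vector loop, so
// every mask is counted once per vector statement it is applied to.
unsigned
count_perms_after (unsigned masks_needing_perm, unsigned nstmts)
{
  // Mirrors "nvectors_per_build = nstmts > 0 ? nstmts : 1" in the patch.
  unsigned nvectors_per_build = nstmts > 0 ? nstmts : 1;
  unsigned n_perms = 0;
  for (unsigned m = 0; m < masks_needing_perm; ++m)
    for (unsigned ri = 0; ri < nvectors_per_build; ++ri)
      ++n_perms;
  return n_perms;
}
```

With one mask and two vector statements (numbers chosen purely for
illustration), the old scheme counts one permute and the new one counts
two.  The clamp to at least one keeps the count non-zero when nstmts is
zero.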