From patchwork Tue Aug 22 08:45:28 2023
Message-ID: <8c6c6b96-0b97-4eed-5b88-bda2b3dcc902@linux.ibm.com>
Date: Tue, 22 Aug 2023 16:45:28 +0800
Subject: [PATCH 1/3] vect: Remove some manual release in vectorizable_store
From: "Kewen.Lin"
To: GCC Patches
Cc: Richard Biener, Richard Sandiford, Segher Boessenkool, Peter Bergner

Hi,

To avoid some code duplication in follow-up patches to function
vectorizable_store, this patch changes some existing vec<tree>
variables to auto_vec and removes the manual release invocations.
It also refactors a bit and removes some useless code.

Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu
and powerpc64{,le}-linux-gnu.

Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_store): Remove vec<tree>
	oprnds, adjust vec<tree> result_chain and vec_oprnds with
	auto_vec, and adjust gvec_oprnds with auto_delete_vec.
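As background, here is a minimal stand-alone analogue of the ownership
change (toy_* types are made up for illustration, not GCC's actual
vec.h API): with auto_vec and auto_delete_vec the destructors take
over the cleanup, so the explicit release () calls can go away and
every early return stays leak-free.

#include <cstdio>
#include <vector>

/* Toy stand-in for vec<T>: storage must be released manually.  */
template <typename T>
struct toy_vec
{
  std::vector<T> storage;
  void quick_push (const T &x) { storage.push_back (x); }
  void release () { storage.clear (); }
  void truncate (size_t n) { storage.resize (n); }
};

/* Toy stand-in for auto_vec<T>: releases itself on scope exit.  */
template <typename T>
struct toy_auto_vec : toy_vec<T>
{
  ~toy_auto_vec () { this->release (); }
};

/* Toy stand-in for auto_delete_vec<T>: owns heap-allocated elements
   and deletes them in its destructor.  */
template <typename T>
struct toy_auto_delete_vec
{
  std::vector<T *> storage;
  void quick_push (T *p) { storage.push_back (p); }
  ~toy_auto_delete_vec ()
  {
    for (T *p : storage)
      delete p;
  }
};

int
main ()
{
  toy_vec<int> oprnds;
  oprnds.quick_push (1);
  oprnds.release ();            /* Manual release: easy to miss on early returns.  */

  toy_auto_vec<int> vec_oprnds; /* Freed by its destructor.  */
  vec_oprnds.quick_push (2);

  toy_auto_delete_vec<toy_auto_vec<int> > gvec_oprnds;
  gvec_oprnds.quick_push (new toy_auto_vec<int> ());
  gvec_oprnds.storage[0]->quick_push (3);

  std::printf ("sizes: %zu %zu\n", vec_oprnds.storage.size (),
	       gvec_oprnds.storage.size ());
  return 0;                     /* vec_oprnds and gvec_oprnds clean up here.  */
}

The patch's auto_delete_vec<auto_vec<tree>> gvec_oprnds plays the role
of toy_auto_delete_vec above: it owns the heap-allocated per-operand
vectors and deletes them whenever vectorizable_store returns.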
---
 gcc/tree-vect-stmts.cc | 64 +++++++++++++++---------------------------
 1 file changed, 23 insertions(+), 41 deletions(-)

--
2.31.1

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 1580a396301..fcaa4127e52 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8200,9 +8200,6 @@ vectorizable_store (vec_info *vinfo,
   stmt_vec_info first_stmt_info;
   bool grouped_store;
   unsigned int group_size, i;
-  vec<tree> oprnds = vNULL;
-  vec<tree> result_chain = vNULL;
-  vec<tree> vec_oprnds = vNULL;
   bool slp = (slp_node != NULL);
   unsigned int vec_num;
   bb_vec_info bb_vinfo = dyn_cast <bb_vec_info> (vinfo);
@@ -8601,6 +8598,7 @@ vectorizable_store (vec_info *vinfo,
       alias_off = build_int_cst (ref_type, 0);
       stmt_vec_info next_stmt_info = first_stmt_info;
+      auto_vec<tree> vec_oprnds (ncopies);
       for (g = 0; g < group_size; g++)
	{
	  running_off = offvar;
@@ -8682,7 +8680,7 @@ vectorizable_store (vec_info *vinfo,
		}
	    }
	  next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
-	  vec_oprnds.release ();
+	  vec_oprnds.truncate (0);
	  if (slp)
	    break;
	}
@@ -8690,9 +8688,6 @@ vectorizable_store (vec_info *vinfo,
       return true;
    }

-  auto_vec<tree> dr_chain (group_size);
-  oprnds.create (group_size);
-
   gcc_assert (alignment_support_scheme);
   vec_loop_masks *loop_masks
     = (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
@@ -8783,11 +8778,15 @@ vectorizable_store (vec_info *vinfo,
      STMT_VINFO_RELATED_STMT for the next copies.
   */

+  auto_vec<tree> dr_chain (group_size);
+  auto_vec<tree> result_chain (group_size);
   auto_vec<tree> vec_masks;
   tree vec_mask = NULL;
   auto_vec<tree> vec_offsets;
-  auto_vec<vec<tree> > gvec_oprnds;
-  gvec_oprnds.safe_grow_cleared (group_size, true);
+  auto_delete_vec<auto_vec<tree>> gvec_oprnds (group_size);
+  for (i = 0; i < group_size; i++)
+    gvec_oprnds.quick_push (new auto_vec<tree> (ncopies));
+  auto_vec<tree> vec_oprnds;
   for (j = 0; j < ncopies; j++)
     {
       gimple *new_stmt;
@@ -8803,11 +8802,11 @@ vectorizable_store (vec_info *vinfo,
	  else
	    {
	      /* For interleaved stores we collect vectorized defs for all the
-		 stores in the group in DR_CHAIN and OPRNDS. DR_CHAIN is then
-		 used as an input to vect_permute_store_chain().
+		 stores in the group in DR_CHAIN. DR_CHAIN is then used as an
+		 input to vect_permute_store_chain().

		 If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN
-		 and OPRNDS are of size 1. */
+		 is of size 1. */
	      stmt_vec_info next_stmt_info = first_stmt_info;
	      for (i = 0; i < group_size; i++)
		{
@@ -8817,11 +8816,10 @@ vectorizable_store (vec_info *vinfo,
		     that there is no interleaving, DR_GROUP_SIZE is 1,
		     and only one iteration of the loop will be executed. */
		  op = vect_get_store_rhs (next_stmt_info);
-		  vect_get_vec_defs_for_operand (vinfo, next_stmt_info,
-						 ncopies, op, &gvec_oprnds[i]);
-		  vec_oprnd = gvec_oprnds[i][0];
-		  dr_chain.quick_push (gvec_oprnds[i][0]);
-		  oprnds.quick_push (gvec_oprnds[i][0]);
+		  vect_get_vec_defs_for_operand (vinfo, next_stmt_info, ncopies,
+						 op, gvec_oprnds[i]);
+		  vec_oprnd = (*gvec_oprnds[i])[0];
+		  dr_chain.quick_push (vec_oprnd);
		  next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
		}
	      if (mask)
@@ -8863,16 +8861,13 @@ vectorizable_store (vec_info *vinfo,
       else
	{
	  gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo));
-	  /* For interleaved stores we created vectorized defs for all the
-	     defs stored in OPRNDS in the previous iteration (previous copy).
-	     DR_CHAIN is then used as an input to vect_permute_store_chain().
-	     If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN and
-	     OPRNDS are of size 1. */
+	  /* DR_CHAIN is then used as an input to vect_permute_store_chain().
+	     If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN is
+	     of size 1. */
	  for (i = 0; i < group_size; i++)
	    {
-	      vec_oprnd = gvec_oprnds[i][j];
-	      dr_chain[i] = gvec_oprnds[i][j];
-	      oprnds[i] = gvec_oprnds[i][j];
+	      vec_oprnd = (*gvec_oprnds[i])[j];
+	      dr_chain[i] = vec_oprnd;
	    }
	  if (mask)
	    vec_mask = vec_masks[j];
@@ -8975,13 +8970,9 @@ vectorizable_store (vec_info *vinfo,
	{
	  new_stmt = NULL;
	  if (grouped_store)
-	    {
-	      if (j == 0)
-		result_chain.create (group_size);
-	      /* Permute.  */
-	      vect_permute_store_chain (vinfo, dr_chain, group_size, stmt_info,
-					gsi, &result_chain);
-	    }
+	    /* Permute.  */
+	    vect_permute_store_chain (vinfo, dr_chain, group_size, stmt_info,
+				      gsi, &result_chain);

	  stmt_vec_info next_stmt_info = first_stmt_info;
	  for (i = 0; i < vec_num; i++)
@@ -9278,15 +9269,6 @@ vectorizable_store (vec_info *vinfo,
	}
    }

-  for (i = 0; i < group_size; ++i)
-    {
-      vec<tree> oprndsi = gvec_oprnds[i];
-      oprndsi.release ();
-    }
-  oprnds.release ();
-  result_chain.release ();
-  vec_oprnds.release ();
-
  return true;
}

From patchwork Tue Aug 22 08:49:33 2023
Message-ID: <8a82c294-eaab-bfb2-5e2d-a08d38f3e570@linux.ibm.com>
Date: Tue, 22 Aug 2023 16:49:33 +0800
Subject: [PATCH 2/3] vect: Move VMAT_LOAD_STORE_LANES handlings from final loop nest
From: "Kewen.Lin"
To: GCC Patches
Cc: Richard Biener, Richard Sandiford, Segher Boessenkool, Peter Bergner
In-Reply-To: <8c6c6b96-0b97-4eed-5b88-bda2b3dcc902@linux.ibm.com>

Hi,

Like commit r14-3214, which moved the handling of memory access type
VMAT_LOAD_STORE_LANES out of the final loop nest of vectorizable_load,
this patch does the same for the function vectorizable_store.

Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu
and powerpc64{,le}-linux-gnu.

Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_store): Move the handlings on
	VMAT_LOAD_STORE_LANES in the final loop nest to its own loop,
	and update the final nest accordingly.
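The shape of this refactoring, sketched in stand-alone form (the
access_kind enum and emit_copies_* functions are hypothetical; the
real code dispatches on memory_access_type and emits gimple rather
than printing): hoisting one case out of a shared loop nest trades a
per-iteration test for a dedicated, easier-to-read loop that can
return early.

#include <cstdio>

enum access_kind { STORE_LANES, GATHER_SCATTER, CONTIGUOUS };

/* Before: one loop whose body re-tests the access kind on every copy.  */
static void
emit_copies_mixed (access_kind kind, int ncopies)
{
  for (int j = 0; j < ncopies; j++)
    {
      if (kind == STORE_LANES)
	std::printf ("copy %d: store-lanes path\n", j);
      else
	std::printf ("copy %d: general path\n", j);
    }
}

/* After: the special kind gets its own self-contained loop and returns
   early, so the remaining loop only handles the general cases.  */
static void
emit_copies_split (access_kind kind, int ncopies)
{
  if (kind == STORE_LANES)
    {
      for (int j = 0; j < ncopies; j++)
	std::printf ("copy %d: store-lanes path\n", j);
      return;
    }
  for (int j = 0; j < ncopies; j++)
    std::printf ("copy %d: general path\n", j);
}

int
main ()
{
  emit_copies_mixed (STORE_LANES, 2);
  emit_copies_split (STORE_LANES, 2);
  return 0;
}

The diff below follows exactly this shape: it adds an
"if (memory_access_type == VMAT_LOAD_STORE_LANES)" block with its own
ncopies loop that ends in "return true;", leaving the final nest to
the remaining access types.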
--- gcc/tree-vect-stmts.cc | 732 ++++++++++++++++++++++------------------- 1 file changed, 387 insertions(+), 345 deletions(-) -- 2.31.1 diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index fcaa4127e52..18f5ebcc09c 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -8779,42 +8779,29 @@ vectorizable_store (vec_info *vinfo, */ auto_vec dr_chain (group_size); - auto_vec result_chain (group_size); auto_vec vec_masks; tree vec_mask = NULL; - auto_vec vec_offsets; auto_delete_vec> gvec_oprnds (group_size); for (i = 0; i < group_size; i++) gvec_oprnds.quick_push (new auto_vec (ncopies)); - auto_vec vec_oprnds; - for (j = 0; j < ncopies; j++) + + if (memory_access_type == VMAT_LOAD_STORE_LANES) { - gimple *new_stmt; - if (j == 0) + gcc_assert (!slp && grouped_store); + for (j = 0; j < ncopies; j++) { - if (slp) - { - /* Get vectorized arguments for SLP_NODE. */ - vect_get_vec_defs (vinfo, stmt_info, slp_node, 1, - op, &vec_oprnds); - vec_oprnd = vec_oprnds[0]; - } - else - { - /* For interleaved stores we collect vectorized defs for all the - stores in the group in DR_CHAIN. DR_CHAIN is then used as an - input to vect_permute_store_chain(). - - If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN - is of size 1. */ + gimple *new_stmt; + if (j == 0) + { + /* For interleaved stores we collect vectorized defs for all + the stores in the group in DR_CHAIN. DR_CHAIN is then used + as an input to vect_permute_store_chain(). */ stmt_vec_info next_stmt_info = first_stmt_info; for (i = 0; i < group_size; i++) { /* Since gaps are not supported for interleaved stores, - DR_GROUP_SIZE is the exact number of stmts in the chain. - Therefore, NEXT_STMT_INFO can't be NULL_TREE. In case - that there is no interleaving, DR_GROUP_SIZE is 1, - and only one iteration of the loop will be executed. */ + DR_GROUP_SIZE is the exact number of stmts in the + chain. Therefore, NEXT_STMT_INFO can't be NULL_TREE. */ op = vect_get_store_rhs (next_stmt_info); vect_get_vec_defs_for_operand (vinfo, next_stmt_info, ncopies, op, gvec_oprnds[i]); @@ -8825,66 +8812,37 @@ vectorizable_store (vec_info *vinfo, if (mask) { vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies, - mask, &vec_masks, mask_vectype); + mask, &vec_masks, + mask_vectype); vec_mask = vec_masks[0]; } - } - /* We should have catched mismatched types earlier. */ - gcc_assert (useless_type_conversion_p (vectype, - TREE_TYPE (vec_oprnd))); - bool simd_lane_access_p - = STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) != 0; - if (simd_lane_access_p - && !loop_masks - && TREE_CODE (DR_BASE_ADDRESS (first_dr_info->dr)) == ADDR_EXPR - && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr_info->dr), 0)) - && integer_zerop (get_dr_vinfo_offset (vinfo, first_dr_info)) - && integer_zerop (DR_INIT (first_dr_info->dr)) - && alias_sets_conflict_p (get_alias_set (aggr_type), - get_alias_set (TREE_TYPE (ref_type)))) - { - dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr)); - dataref_offset = build_int_cst (ref_type, 0); + /* We should have catched mismatched types earlier. 
*/ + gcc_assert ( + useless_type_conversion_p (vectype, TREE_TYPE (vec_oprnd))); + dataref_ptr + = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type, + NULL, offset, &dummy, gsi, + &ptr_incr, false, bump); } - else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)) - vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info, - slp_node, &gs_info, &dataref_ptr, - &vec_offsets); else - dataref_ptr - = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type, - simd_lane_access_p ? loop : NULL, - offset, &dummy, gsi, &ptr_incr, - simd_lane_access_p, bump); - } - else - { - gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo)); - /* DR_CHAIN is then used as an input to vect_permute_store_chain(). - If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN is - of size 1. */ - for (i = 0; i < group_size; i++) { - vec_oprnd = (*gvec_oprnds[i])[j]; - dr_chain[i] = vec_oprnd; + gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo)); + /* DR_CHAIN is then used as an input to + vect_permute_store_chain(). */ + for (i = 0; i < group_size; i++) + { + vec_oprnd = (*gvec_oprnds[i])[j]; + dr_chain[i] = vec_oprnd; + } + if (mask) + vec_mask = vec_masks[j]; + dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi, + stmt_info, bump); } - if (mask) - vec_mask = vec_masks[j]; - if (dataref_offset) - dataref_offset - = int_const_binop (PLUS_EXPR, dataref_offset, bump); - else if (!STMT_VINFO_GATHER_SCATTER_P (stmt_info)) - dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi, - stmt_info, bump); - } - - if (memory_access_type == VMAT_LOAD_STORE_LANES) - { - tree vec_array; /* Get an array into which we can store the individual vectors. */ - vec_array = create_vector_array (vectype, vec_num); + tree vec_array = create_vector_array (vectype, vec_num); /* Invalidate the current contents of VEC_ARRAY. This should become an RTL clobber too, which prevents the vector registers @@ -8895,8 +8853,8 @@ vectorizable_store (vec_info *vinfo, for (i = 0; i < vec_num; i++) { vec_oprnd = dr_chain[i]; - write_vector_array (vinfo, stmt_info, - gsi, vec_oprnd, vec_array, i); + write_vector_array (vinfo, stmt_info, gsi, vec_oprnd, vec_array, + i); } tree final_mask = NULL; @@ -8906,8 +8864,8 @@ vectorizable_store (vec_info *vinfo, final_mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks, ncopies, vectype, j); if (vec_mask) - final_mask = prepare_vec_mask (loop_vinfo, mask_vectype, - final_mask, vec_mask, gsi); + final_mask = prepare_vec_mask (loop_vinfo, mask_vectype, final_mask, + vec_mask, gsi); if (lanes_ifn == IFN_MASK_LEN_STORE_LANES) { @@ -8955,8 +8913,7 @@ vectorizable_store (vec_info *vinfo, /* Emit: MEM_REF[...all elements...] = STORE_LANES (VEC_ARRAY). */ data_ref = create_array_ref (aggr_type, dataref_ptr, ref_type); - call = gimple_build_call_internal (IFN_STORE_LANES, 1, - vec_array); + call = gimple_build_call_internal (IFN_STORE_LANES, 1, vec_array); gimple_call_set_lhs (call, data_ref); } gimple_call_set_nothrow (call, true); @@ -8965,301 +8922,386 @@ vectorizable_store (vec_info *vinfo, /* Record that VEC_ARRAY is now dead. */ vect_clobber_variable (vinfo, stmt_info, gsi, vec_array); + if (j == 0) + *vec_stmt = new_stmt; + STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt); } - else - { - new_stmt = NULL; - if (grouped_store) - /* Permute. 
*/ - vect_permute_store_chain (vinfo, dr_chain, group_size, stmt_info, - gsi, &result_chain); - stmt_vec_info next_stmt_info = first_stmt_info; - for (i = 0; i < vec_num; i++) - { - unsigned misalign; - unsigned HOST_WIDE_INT align; + return true; + } - tree final_mask = NULL_TREE; - tree final_len = NULL_TREE; - tree bias = NULL_TREE; - if (loop_masks) - final_mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks, - vec_num * ncopies, - vectype, vec_num * j + i); - if (vec_mask) - final_mask = prepare_vec_mask (loop_vinfo, mask_vectype, - final_mask, vec_mask, gsi); + auto_vec result_chain (group_size); + auto_vec vec_offsets; + auto_vec vec_oprnds; + for (j = 0; j < ncopies; j++) + { + gimple *new_stmt; + if (j == 0) + { + if (slp) + { + /* Get vectorized arguments for SLP_NODE. */ + vect_get_vec_defs (vinfo, stmt_info, slp_node, 1, op, + &vec_oprnds); + vec_oprnd = vec_oprnds[0]; + } + else + { + /* For interleaved stores we collect vectorized defs for all the + stores in the group in DR_CHAIN. DR_CHAIN is then used as an + input to vect_permute_store_chain(). - if (memory_access_type == VMAT_GATHER_SCATTER - && gs_info.ifn != IFN_LAST) + If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN + is of size 1. */ + stmt_vec_info next_stmt_info = first_stmt_info; + for (i = 0; i < group_size; i++) { - if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)) - vec_offset = vec_offsets[vec_num * j + i]; - tree scale = size_int (gs_info.scale); - - if (gs_info.ifn == IFN_MASK_LEN_SCATTER_STORE) - { - if (loop_lens) - final_len - = vect_get_loop_len (loop_vinfo, gsi, loop_lens, - vec_num * ncopies, vectype, - vec_num * j + i, 1); - else - final_len - = build_int_cst (sizetype, - TYPE_VECTOR_SUBPARTS (vectype)); - signed char biasval - = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); - bias = build_int_cst (intQI_type_node, biasval); - if (!final_mask) - { - mask_vectype = truth_type_for (vectype); - final_mask = build_minus_one_cst (mask_vectype); - } - } - - gcall *call; - if (final_len && final_mask) - call - = gimple_build_call_internal (IFN_MASK_LEN_SCATTER_STORE, - 7, dataref_ptr, vec_offset, - scale, vec_oprnd, final_mask, - final_len, bias); - else if (final_mask) - call = gimple_build_call_internal - (IFN_MASK_SCATTER_STORE, 5, dataref_ptr, vec_offset, - scale, vec_oprnd, final_mask); - else - call = gimple_build_call_internal - (IFN_SCATTER_STORE, 4, dataref_ptr, vec_offset, - scale, vec_oprnd); - gimple_call_set_nothrow (call, true); - vect_finish_stmt_generation (vinfo, stmt_info, call, gsi); - new_stmt = call; - break; + /* Since gaps are not supported for interleaved stores, + DR_GROUP_SIZE is the exact number of stmts in the chain. + Therefore, NEXT_STMT_INFO can't be NULL_TREE. In case + that there is no interleaving, DR_GROUP_SIZE is 1, + and only one iteration of the loop will be executed. */ + op = vect_get_store_rhs (next_stmt_info); + vect_get_vec_defs_for_operand (vinfo, next_stmt_info, ncopies, + op, gvec_oprnds[i]); + vec_oprnd = (*gvec_oprnds[i])[0]; + dr_chain.quick_push (vec_oprnd); + next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info); } - else if (memory_access_type == VMAT_GATHER_SCATTER) + if (mask) { - /* Emulated scatter. 
*/ - gcc_assert (!final_mask); - unsigned HOST_WIDE_INT const_nunits = nunits.to_constant (); - unsigned HOST_WIDE_INT const_offset_nunits - = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype) - .to_constant (); - vec *ctor_elts; - vec_alloc (ctor_elts, const_nunits); - gimple_seq stmts = NULL; - tree elt_type = TREE_TYPE (vectype); - unsigned HOST_WIDE_INT elt_size - = tree_to_uhwi (TYPE_SIZE (elt_type)); - /* We support offset vectors with more elements - than the data vector for now. */ - unsigned HOST_WIDE_INT factor - = const_offset_nunits / const_nunits; - vec_offset = vec_offsets[j / factor]; - unsigned elt_offset = (j % factor) * const_nunits; - tree idx_type = TREE_TYPE (TREE_TYPE (vec_offset)); - tree scale = size_int (gs_info.scale); - align = get_object_alignment (DR_REF (first_dr_info->dr)); - tree ltype = build_aligned_type (TREE_TYPE (vectype), align); - for (unsigned k = 0; k < const_nunits; ++k) - { - /* Compute the offsetted pointer. */ - tree boff = size_binop (MULT_EXPR, TYPE_SIZE (idx_type), - bitsize_int (k + elt_offset)); - tree idx = gimple_build (&stmts, BIT_FIELD_REF, - idx_type, vec_offset, - TYPE_SIZE (idx_type), boff); - idx = gimple_convert (&stmts, sizetype, idx); - idx = gimple_build (&stmts, MULT_EXPR, - sizetype, idx, scale); - tree ptr = gimple_build (&stmts, PLUS_EXPR, - TREE_TYPE (dataref_ptr), - dataref_ptr, idx); - ptr = gimple_convert (&stmts, ptr_type_node, ptr); - /* Extract the element to be stored. */ - tree elt = gimple_build (&stmts, BIT_FIELD_REF, - TREE_TYPE (vectype), vec_oprnd, - TYPE_SIZE (elt_type), - bitsize_int (k * elt_size)); - gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT); - stmts = NULL; - tree ref = build2 (MEM_REF, ltype, ptr, - build_int_cst (ref_type, 0)); - new_stmt = gimple_build_assign (ref, elt); - vect_finish_stmt_generation (vinfo, stmt_info, - new_stmt, gsi); - } - break; + vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies, + mask, &vec_masks, + mask_vectype); + vec_mask = vec_masks[0]; } + } - if (i > 0) - /* Bump the vector pointer. */ - dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, - gsi, stmt_info, bump); + /* We should have catched mismatched types earlier. */ + gcc_assert (useless_type_conversion_p (vectype, + TREE_TYPE (vec_oprnd))); + bool simd_lane_access_p + = STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) != 0; + if (simd_lane_access_p + && !loop_masks + && TREE_CODE (DR_BASE_ADDRESS (first_dr_info->dr)) == ADDR_EXPR + && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr_info->dr), 0)) + && integer_zerop (get_dr_vinfo_offset (vinfo, first_dr_info)) + && integer_zerop (DR_INIT (first_dr_info->dr)) + && alias_sets_conflict_p (get_alias_set (aggr_type), + get_alias_set (TREE_TYPE (ref_type)))) + { + dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr)); + dataref_offset = build_int_cst (ref_type, 0); + } + else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)) + vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info, slp_node, + &gs_info, &dataref_ptr, &vec_offsets); + else + dataref_ptr + = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type, + simd_lane_access_p ? loop : NULL, + offset, &dummy, gsi, &ptr_incr, + simd_lane_access_p, bump); + } + else + { + gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo)); + /* DR_CHAIN is then used as an input to vect_permute_store_chain(). + If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN is + of size 1. 
*/ + for (i = 0; i < group_size; i++) + { + vec_oprnd = (*gvec_oprnds[i])[j]; + dr_chain[i] = vec_oprnd; + } + if (mask) + vec_mask = vec_masks[j]; + if (dataref_offset) + dataref_offset = int_const_binop (PLUS_EXPR, dataref_offset, bump); + else if (!STMT_VINFO_GATHER_SCATTER_P (stmt_info)) + dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi, + stmt_info, bump); + } - if (slp) - vec_oprnd = vec_oprnds[i]; - else if (grouped_store) - /* For grouped stores vectorized defs are interleaved in - vect_permute_store_chain(). */ - vec_oprnd = result_chain[i]; + new_stmt = NULL; + if (grouped_store) + /* Permute. */ + vect_permute_store_chain (vinfo, dr_chain, group_size, stmt_info, gsi, + &result_chain); - align = known_alignment (DR_TARGET_ALIGNMENT (first_dr_info)); - if (alignment_support_scheme == dr_aligned) - misalign = 0; - else if (misalignment == DR_MISALIGNMENT_UNKNOWN) - { - align = dr_alignment (vect_dr_behavior (vinfo, first_dr_info)); - misalign = 0; - } - else - misalign = misalignment; - if (dataref_offset == NULL_TREE - && TREE_CODE (dataref_ptr) == SSA_NAME) - set_ptr_info_alignment (get_ptr_info (dataref_ptr), align, - misalign); - align = least_bit_hwi (misalign | align); - - if (memory_access_type == VMAT_CONTIGUOUS_REVERSE) - { - tree perm_mask = perm_mask_for_reverse (vectype); - tree perm_dest = vect_create_destination_var - (vect_get_store_rhs (stmt_info), vectype); - tree new_temp = make_ssa_name (perm_dest); - - /* Generate the permute statement. */ - gimple *perm_stmt - = gimple_build_assign (new_temp, VEC_PERM_EXPR, vec_oprnd, - vec_oprnd, perm_mask); - vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt, gsi); - - perm_stmt = SSA_NAME_DEF_STMT (new_temp); - vec_oprnd = new_temp; - } + stmt_vec_info next_stmt_info = first_stmt_info; + for (i = 0; i < vec_num; i++) + { + unsigned misalign; + unsigned HOST_WIDE_INT align; - /* Compute IFN when LOOP_LENS or final_mask valid. */ - machine_mode vmode = TYPE_MODE (vectype); - machine_mode new_vmode = vmode; - internal_fn partial_ifn = IFN_LAST; - if (loop_lens) - { - opt_machine_mode new_ovmode - = get_len_load_store_mode (vmode, false, &partial_ifn); - new_vmode = new_ovmode.require (); - unsigned factor - = (new_ovmode == vmode) ? 1 : GET_MODE_UNIT_SIZE (vmode); - final_len = vect_get_loop_len (loop_vinfo, gsi, loop_lens, - vec_num * ncopies, vectype, - vec_num * j + i, factor); - } - else if (final_mask) - { - if (!can_vec_mask_load_store_p (vmode, - TYPE_MODE (TREE_TYPE (final_mask)), - false, &partial_ifn)) - gcc_unreachable (); - } + tree final_mask = NULL_TREE; + tree final_len = NULL_TREE; + tree bias = NULL_TREE; + if (loop_masks) + final_mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks, + vec_num * ncopies, vectype, + vec_num * j + i); + if (vec_mask) + final_mask = prepare_vec_mask (loop_vinfo, mask_vectype, final_mask, + vec_mask, gsi); - if (partial_ifn == IFN_MASK_LEN_STORE) + if (memory_access_type == VMAT_GATHER_SCATTER + && gs_info.ifn != IFN_LAST) + { + if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)) + vec_offset = vec_offsets[vec_num * j + i]; + tree scale = size_int (gs_info.scale); + + if (gs_info.ifn == IFN_MASK_LEN_SCATTER_STORE) { - if (!final_len) - { - /* Pass VF value to 'len' argument of - MASK_LEN_STORE if LOOP_LENS is invalid. 
*/ - final_len = size_int (TYPE_VECTOR_SUBPARTS (vectype)); - } + if (loop_lens) + final_len = vect_get_loop_len (loop_vinfo, gsi, loop_lens, + vec_num * ncopies, vectype, + vec_num * j + i, 1); + else + final_len = build_int_cst (sizetype, + TYPE_VECTOR_SUBPARTS (vectype)); + signed char biasval + = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); + bias = build_int_cst (intQI_type_node, biasval); if (!final_mask) { - /* Pass all ones value to 'mask' argument of - MASK_LEN_STORE if final_mask is invalid. */ mask_vectype = truth_type_for (vectype); final_mask = build_minus_one_cst (mask_vectype); } } - if (final_len) - { - signed char biasval - = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); - bias = build_int_cst (intQI_type_node, biasval); + gcall *call; + if (final_len && final_mask) + call = gimple_build_call_internal (IFN_MASK_LEN_SCATTER_STORE, + 7, dataref_ptr, vec_offset, + scale, vec_oprnd, final_mask, + final_len, bias); + else if (final_mask) + call + = gimple_build_call_internal (IFN_MASK_SCATTER_STORE, 5, + dataref_ptr, vec_offset, scale, + vec_oprnd, final_mask); + else + call = gimple_build_call_internal (IFN_SCATTER_STORE, 4, + dataref_ptr, vec_offset, + scale, vec_oprnd); + gimple_call_set_nothrow (call, true); + vect_finish_stmt_generation (vinfo, stmt_info, call, gsi); + new_stmt = call; + break; + } + else if (memory_access_type == VMAT_GATHER_SCATTER) + { + /* Emulated scatter. */ + gcc_assert (!final_mask); + unsigned HOST_WIDE_INT const_nunits = nunits.to_constant (); + unsigned HOST_WIDE_INT const_offset_nunits + = TYPE_VECTOR_SUBPARTS (gs_info.offset_vectype).to_constant (); + vec *ctor_elts; + vec_alloc (ctor_elts, const_nunits); + gimple_seq stmts = NULL; + tree elt_type = TREE_TYPE (vectype); + unsigned HOST_WIDE_INT elt_size + = tree_to_uhwi (TYPE_SIZE (elt_type)); + /* We support offset vectors with more elements + than the data vector for now. */ + unsigned HOST_WIDE_INT factor + = const_offset_nunits / const_nunits; + vec_offset = vec_offsets[j / factor]; + unsigned elt_offset = (j % factor) * const_nunits; + tree idx_type = TREE_TYPE (TREE_TYPE (vec_offset)); + tree scale = size_int (gs_info.scale); + align = get_object_alignment (DR_REF (first_dr_info->dr)); + tree ltype = build_aligned_type (TREE_TYPE (vectype), align); + for (unsigned k = 0; k < const_nunits; ++k) + { + /* Compute the offsetted pointer. */ + tree boff = size_binop (MULT_EXPR, TYPE_SIZE (idx_type), + bitsize_int (k + elt_offset)); + tree idx + = gimple_build (&stmts, BIT_FIELD_REF, idx_type, vec_offset, + TYPE_SIZE (idx_type), boff); + idx = gimple_convert (&stmts, sizetype, idx); + idx = gimple_build (&stmts, MULT_EXPR, sizetype, idx, scale); + tree ptr + = gimple_build (&stmts, PLUS_EXPR, TREE_TYPE (dataref_ptr), + dataref_ptr, idx); + ptr = gimple_convert (&stmts, ptr_type_node, ptr); + /* Extract the element to be stored. */ + tree elt + = gimple_build (&stmts, BIT_FIELD_REF, TREE_TYPE (vectype), + vec_oprnd, TYPE_SIZE (elt_type), + bitsize_int (k * elt_size)); + gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT); + stmts = NULL; + tree ref + = build2 (MEM_REF, ltype, ptr, build_int_cst (ref_type, 0)); + new_stmt = gimple_build_assign (ref, elt); + vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } + break; + } - /* Arguments are ready. Create the new vector stmt. */ - if (final_len) - { - gcall *call; - tree ptr = build_int_cst (ref_type, align * BITS_PER_UNIT); - /* Need conversion if it's wrapped with VnQI. 
*/ - if (vmode != new_vmode) - { - tree new_vtype - = build_vector_type_for_mode (unsigned_intQI_type_node, - new_vmode); - tree var - = vect_get_new_ssa_name (new_vtype, vect_simple_var); - vec_oprnd - = build1 (VIEW_CONVERT_EXPR, new_vtype, vec_oprnd); - gassign *new_stmt - = gimple_build_assign (var, VIEW_CONVERT_EXPR, - vec_oprnd); - vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, - gsi); - vec_oprnd = var; - } + if (i > 0) + /* Bump the vector pointer. */ + dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi, + stmt_info, bump); - if (partial_ifn == IFN_MASK_LEN_STORE) - call = gimple_build_call_internal (IFN_MASK_LEN_STORE, 6, - dataref_ptr, ptr, - final_mask, final_len, - bias, vec_oprnd); - else - call - = gimple_build_call_internal (IFN_LEN_STORE, 5, - dataref_ptr, ptr, - final_len, bias, - vec_oprnd); - gimple_call_set_nothrow (call, true); - vect_finish_stmt_generation (vinfo, stmt_info, call, gsi); - new_stmt = call; + if (slp) + vec_oprnd = vec_oprnds[i]; + else if (grouped_store) + /* For grouped stores vectorized defs are interleaved in + vect_permute_store_chain(). */ + vec_oprnd = result_chain[i]; + + align = known_alignment (DR_TARGET_ALIGNMENT (first_dr_info)); + if (alignment_support_scheme == dr_aligned) + misalign = 0; + else if (misalignment == DR_MISALIGNMENT_UNKNOWN) + { + align = dr_alignment (vect_dr_behavior (vinfo, first_dr_info)); + misalign = 0; + } + else + misalign = misalignment; + if (dataref_offset == NULL_TREE + && TREE_CODE (dataref_ptr) == SSA_NAME) + set_ptr_info_alignment (get_ptr_info (dataref_ptr), align, + misalign); + align = least_bit_hwi (misalign | align); + + if (memory_access_type == VMAT_CONTIGUOUS_REVERSE) + { + tree perm_mask = perm_mask_for_reverse (vectype); + tree perm_dest + = vect_create_destination_var (vect_get_store_rhs (stmt_info), + vectype); + tree new_temp = make_ssa_name (perm_dest); + + /* Generate the permute statement. */ + gimple *perm_stmt + = gimple_build_assign (new_temp, VEC_PERM_EXPR, vec_oprnd, + vec_oprnd, perm_mask); + vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt, gsi); + + perm_stmt = SSA_NAME_DEF_STMT (new_temp); + vec_oprnd = new_temp; + } + + /* Compute IFN when LOOP_LENS or final_mask valid. */ + machine_mode vmode = TYPE_MODE (vectype); + machine_mode new_vmode = vmode; + internal_fn partial_ifn = IFN_LAST; + if (loop_lens) + { + opt_machine_mode new_ovmode + = get_len_load_store_mode (vmode, false, &partial_ifn); + new_vmode = new_ovmode.require (); + unsigned factor + = (new_ovmode == vmode) ? 1 : GET_MODE_UNIT_SIZE (vmode); + final_len = vect_get_loop_len (loop_vinfo, gsi, loop_lens, + vec_num * ncopies, vectype, + vec_num * j + i, factor); + } + else if (final_mask) + { + if (!can_vec_mask_load_store_p ( + vmode, TYPE_MODE (TREE_TYPE (final_mask)), false, + &partial_ifn)) + gcc_unreachable (); + } + + if (partial_ifn == IFN_MASK_LEN_STORE) + { + if (!final_len) + { + /* Pass VF value to 'len' argument of + MASK_LEN_STORE if LOOP_LENS is invalid. */ + final_len = size_int (TYPE_VECTOR_SUBPARTS (vectype)); } - else if (final_mask) + if (!final_mask) { - tree ptr = build_int_cst (ref_type, align * BITS_PER_UNIT); - gcall *call - = gimple_build_call_internal (IFN_MASK_STORE, 4, - dataref_ptr, ptr, - final_mask, vec_oprnd); - gimple_call_set_nothrow (call, true); - vect_finish_stmt_generation (vinfo, stmt_info, call, gsi); - new_stmt = call; + /* Pass all ones value to 'mask' argument of + MASK_LEN_STORE if final_mask is invalid. 
*/ + mask_vectype = truth_type_for (vectype); + final_mask = build_minus_one_cst (mask_vectype); } - else + } + if (final_len) + { + signed char biasval + = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); + + bias = build_int_cst (intQI_type_node, biasval); + } + + /* Arguments are ready. Create the new vector stmt. */ + if (final_len) + { + gcall *call; + tree ptr = build_int_cst (ref_type, align * BITS_PER_UNIT); + /* Need conversion if it's wrapped with VnQI. */ + if (vmode != new_vmode) { - data_ref = fold_build2 (MEM_REF, vectype, - dataref_ptr, - dataref_offset - ? dataref_offset - : build_int_cst (ref_type, 0)); - if (alignment_support_scheme == dr_aligned) - ; - else - TREE_TYPE (data_ref) - = build_aligned_type (TREE_TYPE (data_ref), - align * BITS_PER_UNIT); - vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr)); - new_stmt = gimple_build_assign (data_ref, vec_oprnd); + tree new_vtype + = build_vector_type_for_mode (unsigned_intQI_type_node, + new_vmode); + tree var = vect_get_new_ssa_name (new_vtype, vect_simple_var); + vec_oprnd = build1 (VIEW_CONVERT_EXPR, new_vtype, vec_oprnd); + gassign *new_stmt + = gimple_build_assign (var, VIEW_CONVERT_EXPR, vec_oprnd); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); + vec_oprnd = var; } - if (slp) - continue; - - next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info); - if (!next_stmt_info) - break; + if (partial_ifn == IFN_MASK_LEN_STORE) + call = gimple_build_call_internal (IFN_MASK_LEN_STORE, 6, + dataref_ptr, ptr, final_mask, + final_len, bias, vec_oprnd); + else + call = gimple_build_call_internal (IFN_LEN_STORE, 5, + dataref_ptr, ptr, final_len, + bias, vec_oprnd); + gimple_call_set_nothrow (call, true); + vect_finish_stmt_generation (vinfo, stmt_info, call, gsi); + new_stmt = call; + } + else if (final_mask) + { + tree ptr = build_int_cst (ref_type, align * BITS_PER_UNIT); + gcall *call + = gimple_build_call_internal (IFN_MASK_STORE, 4, dataref_ptr, + ptr, final_mask, vec_oprnd); + gimple_call_set_nothrow (call, true); + vect_finish_stmt_generation (vinfo, stmt_info, call, gsi); + new_stmt = call; + } + else + { + data_ref + = fold_build2 (MEM_REF, vectype, dataref_ptr, + dataref_offset ? 
dataref_offset : build_int_cst (ref_type, 0)); + if (alignment_support_scheme == dr_aligned) + ; + else + TREE_TYPE (data_ref) + = build_aligned_type (TREE_TYPE (data_ref), + align * BITS_PER_UNIT); + vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr)); + new_stmt = gimple_build_assign (data_ref, vec_oprnd); + vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } + + if (slp) + continue; + + next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info); + if (!next_stmt_info) + break; } if (!slp) {

From patchwork Tue Aug 22 08:52:41 2023
Message-ID: <1c07d6a4-f322-6a1d-aaea-4d17733493fe@linux.ibm.com>
Date: Tue, 22 Aug 2023 16:52:41 +0800
Subject: [PATCH 3/3] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest
From: "Kewen.Lin"
To: GCC Patches
Cc: Richard Biener, Richard Sandiford, Segher Boessenkool, Peter Bergner
In-Reply-To: <8c6c6b96-0b97-4eed-5b88-bda2b3dcc902@linux.ibm.com>

Hi,

Like r14-3317, which moved the handling of memory access type
VMAT_GATHER_SCATTER out of the final loop nest of vectorizable_load,
this patch does the same for the vectorizable_store side.

Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu
and powerpc64{,le}-linux-gnu.

Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_store): Move the handlings on
	VMAT_GATHER_SCATTER in the final loop nest to its own loop,
	and update the final nest accordingly.
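As a scalar model of the emulated-scatter fallback that this patch
moves into the dedicated VMAT_GATHER_SCATTER loop (names are
illustrative; the real code emits BIT_FIELD_REF extracts and MEM_REF
stores as gimple when no IFN_*SCATTER_STORE instruction is available):
each element of the data vector is written to base plus its scaled
offset, one store at a time.

#include <cstdint>
#include <cstdio>

static void
emulated_scatter (uint8_t *base, const int *offsets, int scale,
		  const int *data, int nunits)
{
  for (int k = 0; k < nunits; k++)
    {
      /* Compute the offsetted pointer, as the vectorized loop does
	 with PLUS_EXPR on the scaled index.  */
      uint8_t *ptr = base + (intptr_t) offsets[k] * scale;
      /* Extract element k and store it.  */
      *(int *) ptr = data[k];
    }
}

int
main ()
{
  int mem[8] = {0};
  int offsets[4] = {3, 1, 6, 0};
  int data[4] = {10, 20, 30, 40};
  emulated_scatter ((uint8_t *) mem, offsets, sizeof (int), data, 4);
  for (int i = 0; i < 8; i++)
    std::printf ("%d ", mem[i]);    /* 40 20 0 10 0 0 30 0 */
  std::printf ("\n");
  return 0;
}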
--- gcc/tree-vect-stmts.cc | 258 +++++++++++++++++++++++++---------------- 1 file changed, 159 insertions(+), 99 deletions(-) -- 2.31.1 diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 18f5ebcc09c..b959c1861ad 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -8930,44 +8930,23 @@ vectorizable_store (vec_info *vinfo, return true; } - auto_vec result_chain (group_size); - auto_vec vec_offsets; - auto_vec vec_oprnds; - for (j = 0; j < ncopies; j++) + if (memory_access_type == VMAT_GATHER_SCATTER) { - gimple *new_stmt; - if (j == 0) + gcc_assert (!slp && !grouped_store); + auto_vec vec_offsets; + for (j = 0; j < ncopies; j++) { - if (slp) - { - /* Get vectorized arguments for SLP_NODE.
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 18f5ebcc09c..b959c1861ad 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8930,44 +8930,23 @@ vectorizable_store (vec_info *vinfo,
       return true;
     }
 
-  auto_vec<tree> result_chain (group_size);
-  auto_vec<tree> vec_offsets;
-  auto_vec<tree> vec_oprnds;
-  for (j = 0; j < ncopies; j++)
+  if (memory_access_type == VMAT_GATHER_SCATTER)
     {
-      gimple *new_stmt;
-      if (j == 0)
+      gcc_assert (!slp && !grouped_store);
+      auto_vec<tree> vec_offsets;
+      for (j = 0; j < ncopies; j++)
        {
-         if (slp)
-           {
-             /* Get vectorized arguments for SLP_NODE.  */
-             vect_get_vec_defs (vinfo, stmt_info, slp_node, 1, op,
-                                &vec_oprnds);
-             vec_oprnd = vec_oprnds[0];
-           }
-         else
+         gimple *new_stmt;
+         if (j == 0)
            {
-             /* For interleaved stores we collect vectorized defs for all the
-                stores in the group in DR_CHAIN.  DR_CHAIN is then used as an
-                input to vect_permute_store_chain().
-
-                If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN
-                is of size 1.  */
-             stmt_vec_info next_stmt_info = first_stmt_info;
-             for (i = 0; i < group_size; i++)
-               {
-                 /* Since gaps are not supported for interleaved stores,
-                    DR_GROUP_SIZE is the exact number of stmts in the chain.
-                    Therefore, NEXT_STMT_INFO can't be NULL_TREE.  In case
-                    that there is no interleaving, DR_GROUP_SIZE is 1,
-                    and only one iteration of the loop will be executed.  */
-                 op = vect_get_store_rhs (next_stmt_info);
-                 vect_get_vec_defs_for_operand (vinfo, next_stmt_info, ncopies,
-                                                op, gvec_oprnds[i]);
-                 vec_oprnd = (*gvec_oprnds[i])[0];
-                 dr_chain.quick_push (vec_oprnd);
-                 next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
-               }
+             /* Since the store is not grouped, DR_GROUP_SIZE is 1, and
+                DR_CHAIN is of size 1.  */
+             gcc_assert (group_size == 1);
+             op = vect_get_store_rhs (first_stmt_info);
+             vect_get_vec_defs_for_operand (vinfo, first_stmt_info, ncopies,
+                                            op, gvec_oprnds[0]);
+             vec_oprnd = (*gvec_oprnds[0])[0];
+             dr_chain.quick_push (vec_oprnd);
              if (mask)
                {
                  vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies,
@@ -8975,91 +8954,55 @@ vectorizable_store (vec_info *vinfo,
                                                 mask_vectype);
                  vec_mask = vec_masks[0];
                }
-           }
 
-         /* We should have catched mismatched types earlier.  */
-         gcc_assert (useless_type_conversion_p (vectype,
-                                                TREE_TYPE (vec_oprnd)));
-         bool simd_lane_access_p
-           = STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) != 0;
-         if (simd_lane_access_p
-             && !loop_masks
-             && TREE_CODE (DR_BASE_ADDRESS (first_dr_info->dr)) == ADDR_EXPR
-             && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr_info->dr), 0))
-             && integer_zerop (get_dr_vinfo_offset (vinfo, first_dr_info))
-             && integer_zerop (DR_INIT (first_dr_info->dr))
-             && alias_sets_conflict_p (get_alias_set (aggr_type),
-                                       get_alias_set (TREE_TYPE (ref_type))))
-           {
-             dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
-             dataref_offset = build_int_cst (ref_type, 0);
+             /* We should have catched mismatched types earlier.  */
+             gcc_assert (useless_type_conversion_p (vectype,
+                                                    TREE_TYPE (vec_oprnd)));
+             if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+               vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info,
+                                            slp_node, &gs_info, &dataref_ptr,
+                                            &vec_offsets);
+             else
+               dataref_ptr
+                 = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type,
+                                             NULL, offset, &dummy, gsi,
+                                             &ptr_incr, false, bump);
            }
-         else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
-           vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info, slp_node,
-                                        &gs_info, &dataref_ptr, &vec_offsets);
          else
-           dataref_ptr
-             = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type,
-                                         simd_lane_access_p ? loop : NULL,
-                                         offset, &dummy, gsi, &ptr_incr,
-                                         simd_lane_access_p, bump);
-       }
-      else
-       {
-         gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo));
-         /* DR_CHAIN is then used as an input to vect_permute_store_chain().
-            If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN is
-            of size 1.  */
-         for (i = 0; i < group_size; i++)
            {
-             vec_oprnd = (*gvec_oprnds[i])[j];
-             dr_chain[i] = vec_oprnd;
+             gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo));
+             vec_oprnd = (*gvec_oprnds[0])[j];
+             dr_chain[0] = vec_oprnd;
+             if (mask)
+               vec_mask = vec_masks[j];
+             if (!STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+               dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr,
+                                              gsi, stmt_info, bump);
            }
-         if (mask)
-           vec_mask = vec_masks[j];
-         if (dataref_offset)
-           dataref_offset = int_const_binop (PLUS_EXPR, dataref_offset, bump);
-         else if (!STMT_VINFO_GATHER_SCATTER_P (stmt_info))
-           dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi,
-                                          stmt_info, bump);
-       }
-
-      new_stmt = NULL;
-      if (grouped_store)
-       /* Permute.  */
-       vect_permute_store_chain (vinfo, dr_chain, group_size, stmt_info, gsi,
-                                 &result_chain);
-      stmt_vec_info next_stmt_info = first_stmt_info;
-      for (i = 0; i < vec_num; i++)
-       {
-         unsigned misalign;
+         new_stmt = NULL;
          unsigned HOST_WIDE_INT align;
-
          tree final_mask = NULL_TREE;
          tree final_len = NULL_TREE;
          tree bias = NULL_TREE;
          if (loop_masks)
            final_mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks,
-                                            vec_num * ncopies, vectype,
-                                            vec_num * j + i);
+                                            ncopies, vectype, j);
          if (vec_mask)
            final_mask = prepare_vec_mask (loop_vinfo, mask_vectype, final_mask,
                                           vec_mask, gsi);
-         if (memory_access_type == VMAT_GATHER_SCATTER
-             && gs_info.ifn != IFN_LAST)
+         if (gs_info.ifn != IFN_LAST)
            {
              if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
-               vec_offset = vec_offsets[vec_num * j + i];
+               vec_offset = vec_offsets[j];
              tree scale = size_int (gs_info.scale);
              if (gs_info.ifn == IFN_MASK_LEN_SCATTER_STORE)
                {
                  if (loop_lens)
                    final_len = vect_get_loop_len (loop_vinfo, gsi, loop_lens,
-                                                  vec_num * ncopies, vectype,
-                                                  vec_num * j + i, 1);
+                                                  ncopies, vectype, j, 1);
                  else
                    final_len = build_int_cst (sizetype,
                                               TYPE_VECTOR_SUBPARTS (vectype));
@@ -9091,9 +9034,8 @@ vectorizable_store (vec_info *vinfo,
              gimple_call_set_nothrow (call, true);
              vect_finish_stmt_generation (vinfo, stmt_info, call, gsi);
              new_stmt = call;
-             break;
            }
-         else if (memory_access_type == VMAT_GATHER_SCATTER)
+         else
            {
              /* Emulated scatter.  */
              gcc_assert (!final_mask);
@@ -9142,8 +9084,126 @@ vectorizable_store (vec_info *vinfo,
                  new_stmt = gimple_build_assign (ref, elt);
                  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
                }
-             break;
            }
+         if (j == 0)
+           *vec_stmt = new_stmt;
+         STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
+       }
+      return true;
+    }
+
+  auto_vec<tree> result_chain (group_size);
+  auto_vec<tree> vec_oprnds;
+  for (j = 0; j < ncopies; j++)
+    {
+      gimple *new_stmt;
+      if (j == 0)
+       {
+         if (slp)
+           {
+             /* Get vectorized arguments for SLP_NODE.  */
+             vect_get_vec_defs (vinfo, stmt_info, slp_node, 1, op,
+                                &vec_oprnds);
+             vec_oprnd = vec_oprnds[0];
+           }
+         else
+           {
+             /* For interleaved stores we collect vectorized defs for all the
+                stores in the group in DR_CHAIN.  DR_CHAIN is then used as an
+                input to vect_permute_store_chain().
+
+                If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN
+                is of size 1.  */
+             stmt_vec_info next_stmt_info = first_stmt_info;
+             for (i = 0; i < group_size; i++)
+               {
+                 /* Since gaps are not supported for interleaved stores,
+                    DR_GROUP_SIZE is the exact number of stmts in the chain.
+                    Therefore, NEXT_STMT_INFO can't be NULL_TREE.  In case
+                    that there is no interleaving, DR_GROUP_SIZE is 1,
+                    and only one iteration of the loop will be executed.  */
+                 op = vect_get_store_rhs (next_stmt_info);
+                 vect_get_vec_defs_for_operand (vinfo, next_stmt_info,
+                                                ncopies, op, gvec_oprnds[i]);
+                 vec_oprnd = (*gvec_oprnds[i])[0];
+                 dr_chain.quick_push (vec_oprnd);
+                 next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
+               }
+             if (mask)
+               {
+                 vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies,
+                                                mask, &vec_masks,
+                                                mask_vectype);
+                 vec_mask = vec_masks[0];
+               }
+           }
+
+         /* We should have catched mismatched types earlier.  */
+         gcc_assert (useless_type_conversion_p (vectype,
+                                                TREE_TYPE (vec_oprnd)));
+         bool simd_lane_access_p
+           = STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) != 0;
+         if (simd_lane_access_p
+             && !loop_masks
+             && TREE_CODE (DR_BASE_ADDRESS (first_dr_info->dr)) == ADDR_EXPR
+             && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr_info->dr), 0))
+             && integer_zerop (get_dr_vinfo_offset (vinfo, first_dr_info))
+             && integer_zerop (DR_INIT (first_dr_info->dr))
+             && alias_sets_conflict_p (get_alias_set (aggr_type),
+                                       get_alias_set (TREE_TYPE (ref_type))))
+           {
+             dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
+             dataref_offset = build_int_cst (ref_type, 0);
+           }
+         else
+           dataref_ptr
+             = vect_create_data_ref_ptr (vinfo, first_stmt_info, aggr_type,
+                                         simd_lane_access_p ? loop : NULL,
+                                         offset, &dummy, gsi, &ptr_incr,
+                                         simd_lane_access_p, bump);
+       }
+      else
+       {
+         gcc_assert (!LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo));
+         /* DR_CHAIN is then used as an input to vect_permute_store_chain().
+            If the store is not grouped, DR_GROUP_SIZE is 1, and DR_CHAIN is
+            of size 1.  */
+         for (i = 0; i < group_size; i++)
+           {
+             vec_oprnd = (*gvec_oprnds[i])[j];
+             dr_chain[i] = vec_oprnd;
+           }
+         if (mask)
+           vec_mask = vec_masks[j];
+         if (dataref_offset)
+           dataref_offset = int_const_binop (PLUS_EXPR, dataref_offset, bump);
+         else
+           dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr, gsi,
+                                          stmt_info, bump);
+       }
+
+      new_stmt = NULL;
+      if (grouped_store)
+       /* Permute.  */
+       vect_permute_store_chain (vinfo, dr_chain, group_size, stmt_info, gsi,
+                                 &result_chain);
+
+      stmt_vec_info next_stmt_info = first_stmt_info;
+      for (i = 0; i < vec_num; i++)
+       {
+         unsigned misalign;
+         unsigned HOST_WIDE_INT align;
+
+         tree final_mask = NULL_TREE;
+         tree final_len = NULL_TREE;
+         tree bias = NULL_TREE;
+         if (loop_masks)
+           final_mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks,
+                                            vec_num * ncopies, vectype,
+                                            vec_num * j + i);
+         if (vec_mask)
+           final_mask = prepare_vec_mask (loop_vinfo, mask_vectype, final_mask,
+                                          vec_mask, gsi);
 
          if (i > 0)
            /* Bump the vector pointer.  */