From patchwork Fri Nov 3 07:44:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Surya Kumari Jangala
X-Patchwork-Id: 161190
Date: Fri, 3 Nov 2023 13:14:22 +0530
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: Segher Boessenkool, GCC Patches, Peter Bergner
From: Surya Kumari Jangala
Subject: [PATCH v3] rs6000/p8swap: Fix incorrect lane extraction by vec_extract() [PR106770]
X-BeenThere: gcc-patches@gcc.gnu.org
Precedence: list
List-Id: Gcc-patches mailing list

Hi Segher,
I have incorporated changes in the code as per the review comments
provided by you for version 2 of the patch. Please review.

Regards,
Surya

rs6000/p8swap: Fix incorrect lane extraction by vec_extract() [PR106770]

In the routine rs6000_analyze_swaps(), special handling of swappable
instructions is done even if the webs that contain the swappable
instructions are not optimized, i.e., the webs do not contain any
permuting load/store instructions along with the associated register
swap instructions. Doing special handling in such webs will result in
the extracted lane being adjusted unnecessarily for vec_extract.

Another issue is that existing code treats non-permuting loads/stores
as special swappables. Non-permuting loads/stores (that have not yet
been split into a permuting load/store and a swap) are handled by
converting them into a permuting load/store (which effectively removes
the swap). As a result, if special swappables are handled only in webs
containing permuting loads/stores, then non-optimal code is generated
for non-permuting loads/stores.

Hence, in this patch, all webs containing either permuting loads/stores
or non-permuting loads/stores are marked as requiring special handling
of swappables. Swaps associated with permuting loads/stores are marked
for removal, and non-permuting loads/stores are converted to permuting
loads/stores. Then the special swappables in the webs are fixed up.
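
The lane problem described above can be illustrated with a small,
portable C++ model. This is only a sketch and not code from the pass:
V2, swap_doublewords and adjust_lane are hypothetical names, and the
"rotate by half the element count" rule is an assumption about how the
lane of an extract is adjusted once the register swap has been removed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical model of a vector register holding two doublewords.
using V2 = std::array<long long, 2>;

// Models a doubleword swap: exchanges the two halves of the register.
V2 swap_doublewords(V2 v)
{
  std::swap(v[0], v[1]);
  return v;
}

// Assumed lane adjustment: rotate the lane index by half the element
// count, which is what an extract needs when its input register holds
// the doublewords in swapped order.
std::size_t adjust_lane(std::size_t lane, std::size_t n_elts)
{
  const std::size_t half = n_elts / 2;
  return lane >= half ? lane - half : lane + half;
}
```

In this model, if memory holds {10, 20} and the register holds the
swapped value {20, 10} (permuting load, swap removed), reading lane
adjust_lane(1, 2) yields 20, the element the source asked for. If the
register holds {10, 20} in natural order (no permuting load/store in
the web), the same adjusted lane yields 10, which is the wrong element.
That second case corresponds to the unnecessary adjustment this patch
avoids.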

This patch also ensures that swappable instructions are not modified in
the following webs as it is incorrect to do so:
 - webs containing permuting load/store instructions and associated swap
   instructions that are transformed by converting the permuting memory
   instructions into non-permuting instructions and removing the swap
   instructions.
 - webs where swap(load(vector constant)) instructions are replaced with
   load(swapped vector constant).

2023-09-10  Surya Kumari Jangala

gcc/
        PR rtl-optimization/106770
        * config/rs6000/rs6000-p8swap.cc (non_permuting_mem_insn): New
        function.
        (convert_mem_insn): New function.
        (rs6000_analyze_swaps): Handle swappable instructions only in
        certain webs.
        (web_requires_special_handling): New instance variable.
        (handle_special_swappables): Remove handling of non-permuting
        load/store instructions.

gcc/testsuite/
        PR rtl-optimization/106770
        * gcc.target/powerpc/pr106770.c: New test.

diff --git a/gcc/config/rs6000/rs6000-p8swap.cc b/gcc/config/rs6000/rs6000-p8swap.cc
index 0388b9bd736..02ea299bc3d 100644
--- a/gcc/config/rs6000/rs6000-p8swap.cc
+++ b/gcc/config/rs6000/rs6000-p8swap.cc
@@ -179,6 +179,13 @@ class swap_web_entry : public web_entry_base
   unsigned int special_handling : 4;
   /* Set if the web represented by this entry cannot be optimized.  */
   unsigned int web_not_optimizable : 1;
+  /* Set if the swappable insns in the web represented by this entry
+     have to be fixed. Swappable insns have to be fixed in:
+       - webs containing permuting loads/stores and the swap insns
+         in such webs have been marked for removal
+       - webs where non-permuting loads/stores have been converted
+         to permuting loads/stores  */
+  unsigned int web_requires_special_handling : 1;
   /* Set if this insn should be deleted.  */
   unsigned int will_delete : 1;
 };
@@ -1468,14 +1475,6 @@ handle_special_swappables (swap_web_entry *insn_entry, unsigned i)
       if (dump_file)
         fprintf (dump_file, "Adjusting subreg in insn %d\n", i);
       break;
-    case SH_NOSWAP_LD:
-      /* Convert a non-permuting load to a permuting one.  */
-      permute_load (insn);
-      break;
-    case SH_NOSWAP_ST:
-      /* Convert a non-permuting store to a permuting one.  */
-      permute_store (insn);
-      break;
     case SH_EXTRACT:
       /* Change the lane on an extract operation.  */
       adjust_extract (insn);
@@ -2401,6 +2400,25 @@ recombine_lvx_stvx_patterns (function *fun)
   free (to_delete);
 }
 
+/* Return true if insn is a non-permuting load/store.  */
+static bool
+non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
+{
+  return insn_entry[i].special_handling == SH_NOSWAP_LD
+         || insn_entry[i].special_handling == SH_NOSWAP_ST;
+}
+
+/* Convert a non-permuting load/store insn to a permuting one.  */
+static void
+convert_mem_insn (swap_web_entry *insn_entry, unsigned int i)
+{
+  rtx_insn *insn = insn_entry[i].insn;
+  if (insn_entry[i].special_handling == SH_NOSWAP_LD)
+    permute_load (insn);
+  if (insn_entry[i].special_handling == SH_NOSWAP_ST)
+    permute_store (insn);
+}
+
 /* Main entry point for this pass.  */
 unsigned int
 rs6000_analyze_swaps (function *fun)
@@ -2624,25 +2642,55 @@ rs6000_analyze_swaps (function *fun)
       dump_swap_insn_table (insn_entry);
     }
 
-  /* For each load and store in an optimizable web (which implies
-     the loads and stores are permuting), find the associated
-     register swaps and mark them for removal.  Due to various
-     optimizations we may mark the same swap more than once.  Also
-     perform special handling for swappable insns that require it.  */
+  /* There are two kinds of optimizations that can be performed on an
+     optimizable web:
+     1. Remove the register swaps associated with permuting load/store
+     in an optimizable web
+     2. Convert the vanilla loads/stores (that have not yet been split
+     into a permuting load/store and a swap) into a permuting
+     load/store (which effectively removes the swap)
+     In both cases, swappable instructions in the webs need
+     special handling to fix them up.  */
   for (i = 0; i < e; ++i)
+    /* For each permuting load/store in an optimizable web, find
+       the associated register swaps and mark them for removal.
+       Due to various optimizations we may mark the same swap more
+       than once.  */
     if ((insn_entry[i].is_load || insn_entry[i].is_store)
         && insn_entry[i].is_swap)
       {
         swap_web_entry* root_entry
           = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
         if (!root_entry->web_not_optimizable)
-          mark_swaps_for_removal (insn_entry, i);
+          {
+            mark_swaps_for_removal (insn_entry, i);
+            root_entry->web_requires_special_handling = true;
+          }
       }
-    else if (insn_entry[i].is_swappable && insn_entry[i].special_handling)
+    /* Convert the non-permuting loads/stores into a permuting
+       load/store.  */
+    else if (insn_entry[i].is_swappable
+             && non_permuting_mem_insn (insn_entry, i))
       {
         swap_web_entry* root_entry
           = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
         if (!root_entry->web_not_optimizable)
+          {
+            convert_mem_insn (insn_entry, i);
+            root_entry->web_requires_special_handling = true;
+          }
+      }
+
+  /* Now that the webs which require special handling have been
+     identified, modify the instructions that are sensitive to
+     element order.  */
+  for (i = 0; i < e; ++i)
+    if (insn_entry[i].is_swappable && insn_entry[i].special_handling
+        && !non_permuting_mem_insn (insn_entry, i))
+      {
+        swap_web_entry* root_entry
+          = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
+        if (root_entry->web_requires_special_handling)
           handle_special_swappables (insn_entry, i);
       }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106770.c b/gcc/testsuite/gcc.target/powerpc/pr106770.c
new file mode 100644
index 00000000000..5b300b94a41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c
@@ -0,0 +1,20 @@
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2 " } */
+/* The 2 xxpermdi instructions are generated by the two
+   calls to vec_promote() */
+/* { dg-final { scan-assembler-times {xxpermdi} 2 } } */
+
+/* Test case to resolve PR106770 */
+
+#include <altivec.h>
+
+int cmp2(double a, double b)
+{
+  vector double va = vec_promote(a, 1);
+  vector double vb = vec_promote(b, 1);
+  vector long long vlt = (vector long long)vec_cmplt(va, vb);
+  vector long long vgt = (vector long long)vec_cmplt(vb, va);
+  vector signed long long vr = vec_sub(vlt, vgt);
+
+  return vec_extract(vr, 1);
+}
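
For readers who want to see the two-loop structure of the
rs6000_analyze_swaps hunk in isolation, here is a deliberately
simplified C++ sketch. It is not the pass itself: Kind, Insn, Web and
analyze are hypothetical stand-ins, the union-find lookup is reduced to
a plain web index, and "fixing up" an insn is reduced to setting a
flag. It only models the idea that webs are marked in a first loop and
that order-sensitive insns are touched in a second loop, and only in
marked webs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds for this sketch only.
enum class Kind { PermutingMem, NonPermutingMem, Extract, Other };

struct Insn {
  Kind kind;
  std::size_t web;  // index of the web this insn belongs to
  bool fixed_up;    // set when the order-sensitive insn is adjusted
};

struct Web {
  bool not_optimizable;
  bool requires_special_handling;
};

void analyze(std::vector<Insn> &insns, std::vector<Web> &webs)
{
  // First loop: a web is marked if it is optimizable and contains a
  // permuting or non-permuting load/store.
  for (const Insn &insn : insns) {
    const bool is_mem = insn.kind == Kind::PermutingMem
                        || insn.kind == Kind::NonPermutingMem;
    if (is_mem && !webs[insn.web].not_optimizable)
      webs[insn.web].requires_special_handling = true;
  }

  // Second loop: order-sensitive insns are adjusted only in marked webs.
  for (Insn &insn : insns)
    if (insn.kind == Kind::Extract
        && webs[insn.web].requires_special_handling)
      insn.fixed_up = true;
}
```

Splitting the work into two loops means an extract that appears before
the load/store of its web in the insn table is still adjusted, while an
extract in a web with no load/store at all (the situation in the
pr106770.c test) is left untouched.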