From patchwork Fri Dec 29 10:28:38 2023
X-Patchwork-Submitter: Feng Xue OS
X-Patchwork-Id: 183889
From: Feng Xue OS
To: "gcc-patches@gcc.gnu.org"
Subject: [PATCH] Do not count unused scalar use when marking STMT_VINFO_LIVE_P [PR113091]
Date: Fri, 29 Dec 2023 10:28:38 +0000

This patch fixes an overestimate of the SLP vector-to-scalar cost for
STMT_VINFO_LIVE_P statements. When pattern recognition is involved, a
statement whose definition is consumed by some pattern may not be included
in the final replacement pattern statements, and is then skipped when
building the SLP graph.

* Original
  char a_c = *(char *) a;
  char b_c = *(char *) b;
  unsigned short a_s = (unsigned short) a_c;
  int a_i = (int) a_s;
  int b_i = (int) b_c;
  int r_i = a_i - b_i;

* After pattern replacement
  a_s = (unsigned short) a_c;
  a_i = (int) a_s;

  patt_b_s = (unsigned short) b_c;    // b_i = (int) b_c
  patt_b_i = (int) patt_b_s;          // b_i = (int) b_c

  patt_r_s = widen_minus(a_c, b_c);   // r_i = a_i - b_i
  patt_r_i = (int) patt_r_s;          // r_i = a_i - b_i

The definitions of a_i (original statement) and b_i (pattern statement)
are related to, but actually not part of, the widen_minus pattern.
Vectorizing the pattern does not cause these definition statements to be
marked as PURE_SLP. For this case, we need to recursively check whether
their uses are all absorbed into vectorized code. There is one exception:
a use may still participate in a vectorized operation via an external SLP
node that contains that use as an element. Such an SSA name is tagged in
advance as having a scalar use.

Feng
---
 .../gcc.target/aarch64/bb-slp-pr113091.c      |  22 ++
 gcc/tree-vect-slp.cc                          | 189 ++++++++++++++----
 2 files changed, 172 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/bb-slp-pr113091.c

diff --git a/gcc/testsuite/gcc.target/aarch64/bb-slp-pr113091.c b/gcc/testsuite/gcc.target/aarch64/bb-slp-pr113091.c
new file mode 100644
index 00000000000..ff822e90b4a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bb-slp-pr113091.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -fdump-tree-slp-details -ftree-slp-vectorize" } */
+
+int test(unsigned array[8]);
+
+int foo(char *a, char *b)
+{
+  unsigned array[8];
+
+  array[0] = (a[0] - b[0]);
+  array[1] = (a[1] - b[1]);
+  array[2] = (a[2] - b[2]);
+  array[3] = (a[3] - b[3]);
+  array[4] = (a[4] - b[4]);
+  array[5] = (a[5] - b[5]);
+  array[6] = (a[6] - b[6]);
+  array[7] = (a[7] - b[7]);
+
+  return test(array);
+}
+
+/* { dg-final { scan-tree-dump-times "Basic block will be vectorized using SLP" 1 "slp2" } } */
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index a82fca45161..d36ff37114e 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6418,6 +6418,84 @@ vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node,
   return res;
 }
 
+/* Given a definition DEF, analyze if it will have any live scalar use after
+   performing SLP vectorization whose information is represented by BB_VINFO,
+   and record result into hash map SCALAR_USE_MAP as cache for later fast
+   check. */
+
+static bool
+vec_slp_has_scalar_use (bb_vec_info bb_vinfo, tree def,
+                        hash_map<tree, bool> &scalar_use_map)
+{
+  imm_use_iterator use_iter;
+  gimple *use_stmt;
+
+  if (bool *res = scalar_use_map.get (def))
+    return *res;
+
+  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, def)
+    {
+      if (is_gimple_debug (use_stmt))
+        continue;
+
+      stmt_vec_info use_stmt_info = bb_vinfo->lookup_stmt (use_stmt);
+
+      if (!use_stmt_info)
+        break;
+
+      if (PURE_SLP_STMT (vect_stmt_to_vectorize (use_stmt_info)))
+        continue;
+
+      /* Do not step forward when encounter PHI statement, since it may
+         involve cyclic reference and cause infinite recursive invocation. */
+      if (gimple_code (use_stmt) == GIMPLE_PHI)
+        break;
+
+      /* When pattern recognition is involved, a statement whose definition is
+         consumed in some pattern, may not be included in the final replacement
+         pattern statements, so would be skipped when building SLP graph.
+
+         * Original
+          char a_c = *(char *) a;
+          char b_c = *(char *) b;
+          unsigned short a_s = (unsigned short) a_c;
+          int a_i = (int) a_s;
+          int b_i = (int) b_c;
+          int r_i = a_i - b_i;
+
+         * After pattern replacement
+          a_s = (unsigned short) a_c;
+          a_i = (int) a_s;
+
+          patt_b_s = (unsigned short) b_c;    // b_i = (int) b_c
+          patt_b_i = (int) patt_b_s;          // b_i = (int) b_c
+
+          patt_r_s = widen_minus(a_c, b_c);   // r_i = a_i - b_i
+          patt_r_i = (int) patt_r_s;          // r_i = a_i - b_i
+
+         The definitions of a_i(original statement) and b_i(pattern statement)
+         are related to, but actually not part of widen_minus pattern.
+         Vectorizing the pattern does not cause these definition statements to
+         be marked as PURE_SLP. For this case, we need to recursively check
+         whether their uses are all absorbed into vectorized code. But there
+         is an exception that some use may participate in an vectorized
+         operation via an external SLP node containing that use as an element.
+         The parameter "scalar_use_map" tags such kind of SSA as having scalar
+         use in advance. */
+      tree lhs = gimple_get_lhs (use_stmt);
+
+      if (!lhs || TREE_CODE (lhs) != SSA_NAME
+          || vec_slp_has_scalar_use (bb_vinfo, lhs, scalar_use_map))
+        break;
+    }
+
+  bool found = !end_imm_use_stmt_p (&use_iter);
+  bool added = scalar_use_map.put (def, found);
+
+  gcc_assert (!added);
+  return found;
+}
+
 /* Mark lanes of NODE that are live outside of the basic-block vectorized
    region and that can be vectorized using vectorizable_live_operation
    with STMT_VINFO_LIVE_P. Not handled live operations will cause the
@@ -6427,6 +6505,7 @@ static void
 vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, slp_tree node,
                              slp_instance instance,
                              stmt_vector_for_cost *cost_vec,
+                             hash_map<tree, bool> &scalar_use_map,
                              hash_set<stmt_vec_info> &svisited,
                              hash_set<slp_tree> &visited)
 {
@@ -6451,32 +6530,22 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, slp_tree node,
       def_operand_p def_p;
       FOR_EACH_PHI_OR_STMT_DEF (def_p, orig_stmt, op_iter, SSA_OP_DEF)
         {
-          imm_use_iterator use_iter;
-          gimple *use_stmt;
-          stmt_vec_info use_stmt_info;
-          FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR (def_p))
-            if (!is_gimple_debug (use_stmt))
-              {
-                use_stmt_info = bb_vinfo->lookup_stmt (use_stmt);
-                if (!use_stmt_info
-                    || !PURE_SLP_STMT (vect_stmt_to_vectorize (use_stmt_info)))
-                  {
-                    STMT_VINFO_LIVE_P (stmt_info) = true;
-                    if (vectorizable_live_operation (bb_vinfo, stmt_info,
-                                                     node, instance, i,
-                                                     false, cost_vec))
-                      /* ??? So we know we can vectorize the live stmt
-                         from one SLP node. If we cannot do so from all
-                         or none consistently we'd have to record which
-                         SLP node (and lane) we want to use for the live
-                         operation. So make sure we can code-generate
-                         from all nodes. */
-                      mark_visited = false;
-                    else
-                      STMT_VINFO_LIVE_P (stmt_info) = false;
-                    break;
-                  }
-              }
+          if (vec_slp_has_scalar_use (bb_vinfo, DEF_FROM_PTR (def_p),
+                                      scalar_use_map))
+            {
+              STMT_VINFO_LIVE_P (stmt_info) = true;
+              if (vectorizable_live_operation (bb_vinfo, stmt_info, node,
+                                               instance, i, false, cost_vec))
+                /* ??? So we know we can vectorize the live stmt from one SLP
+                   node. If we cannot do so from all or none consistently
+                   we'd have to record which SLP node (and lane) we want to
+                   use for the live operation. So make sure we can
+                   code-generate from all nodes. */
+                mark_visited = false;
+              else
+                STMT_VINFO_LIVE_P (stmt_info) = false;
+            }
+
           /* We have to verify whether we can insert the lane extract
              before all uses. The following is a conservative approximation.
              We cannot put this into vectorizable_live_operation because
@@ -6495,6 +6564,10 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, slp_tree node,
              from the latest stmt in a node. So we compensate for this
              during code-generation, simply not replacing uses for those
              hopefully rare cases. */
+          imm_use_iterator use_iter;
+          gimple *use_stmt;
+          stmt_vec_info use_stmt_info;
+
           if (STMT_VINFO_LIVE_P (stmt_info))
             FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR (def_p))
               if (!is_gimple_debug (use_stmt)
@@ -6517,8 +6590,56 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, slp_tree node,
   slp_tree child;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
     if (child && SLP_TREE_DEF_TYPE (child) == vect_internal_def)
-      vect_bb_slp_mark_live_stmts (bb_vinfo, child, instance,
-                                   cost_vec, svisited, visited);
+      vect_bb_slp_mark_live_stmts (bb_vinfo, child, instance, cost_vec,
+                                   scalar_use_map, svisited, visited);
+}
+
+/* Traverse all slp instances of BB_VINFO, and mark lanes of every node that
+   are live outside of the basic-block vectorized region and that can be
+   vectorized using vectorizable_live_operation with STMT_VINFO_LIVE_P. */
+
+static void
+vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo)
+{
+  if (bb_vinfo->slp_instances.is_empty ())
+    return;
+
+  hash_set<stmt_vec_info> svisited;
+  hash_set<slp_tree> visited;
+  hash_map<tree, bool> scalar_use_map;
+  auto_vec<slp_tree> worklist;
+
+  for (slp_instance instance : bb_vinfo->slp_instances)
+    if (!visited.add (SLP_INSTANCE_TREE (instance)))
+      worklist.safe_push (SLP_INSTANCE_TREE (instance));
+
+  do
+    {
+      slp_tree node = worklist.pop ();
+
+      if (SLP_TREE_DEF_TYPE (node) == vect_external_def)
+        {
+          for (tree op : SLP_TREE_SCALAR_OPS (node))
+            if (TREE_CODE (op) == SSA_NAME)
+              scalar_use_map.put (op, true);
+        }
+      else
+        {
+          for (slp_tree child : SLP_TREE_CHILDREN (node))
+            if (child && !visited.add (child))
+              worklist.safe_push (child);
+        }
+    } while (!worklist.is_empty ());
+
+  visited.empty ();
+
+  for (slp_instance instance : bb_vinfo->slp_instances)
+    {
+      vect_location = instance->location ();
+      vect_bb_slp_mark_live_stmts (bb_vinfo, SLP_INSTANCE_TREE (instance),
+                                   instance, &instance->cost_vec,
+                                   scalar_use_map, svisited, visited);
+    }
 }
 
 /* Determine whether we can vectorize the reduction epilogue for INSTANCE. */
@@ -6684,17 +6805,7 @@ vect_slp_analyze_operations (vec_info *vinfo)
 
   /* Compute vectorizable live stmts. */
   if (bb_vec_info bb_vinfo = dyn_cast <bb_vec_info> (vinfo))
-    {
-      hash_set<stmt_vec_info> svisited;
-      hash_set<slp_tree> visited;
-      for (i = 0; vinfo->slp_instances.iterate (i, &instance); ++i)
-        {
-          vect_location = instance->location ();
-          vect_bb_slp_mark_live_stmts (bb_vinfo, SLP_INSTANCE_TREE (instance),
-                                       instance, &instance->cost_vec, svisited,
-                                       visited);
-        }
-    }
+    vect_bb_slp_mark_live_stmts (bb_vinfo);
 
   return !vinfo->slp_instances.is_empty ();
 }
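
For anyone who wants to try the idea outside of GCC, the memoized use walk
in vec_slp_has_scalar_use can be modeled in a few lines of standalone C++.
This is only a rough sketch under simplifying assumptions, not the code in
the patch: Stmt, Program, has_scalar_use and make_example are hypothetical
names, and the three flags merely stand in for the lookup_stmt,
PURE_SLP_STMT and GIMPLE_PHI checks. The example program mirrors the
a_c/b_c snippet from the description above.

```cpp
// Toy model only: none of these types or functions are GCC APIs.
#include <cassert>
#include <unordered_map>
#include <vector>

struct Stmt {
  bool in_region;  // false models bb_vinfo->lookup_stmt () returning null
  bool pure_slp;   // models PURE_SLP_STMT on the statement to vectorize
  bool is_phi;     // models a GIMPLE_PHI use
  int lhs;         // id of the SSA name this statement defines, -1 if none
};

struct Program {
  std::vector<Stmt> stmts;
  // uses[d] lists the indices of the statements that use definition d.
  std::unordered_map<int, std::vector<int>> uses;
};

// Returns true if DEF still has a live scalar use after vectorization.
// CACHE memoizes results; callers may pre-seed it with `true` for names
// that appear as elements of external SLP nodes.
bool has_scalar_use(const Program &p, int def,
                    std::unordered_map<int, bool> &cache) {
  auto hit = cache.find(def);
  if (hit != cache.end())
    return hit->second;

  bool found = false;
  auto u = p.uses.find(def);
  if (u != p.uses.end()) {
    for (int idx : u->second) {
      const Stmt &s = p.stmts[idx];
      if (!s.in_region) { found = true; break; }  // use outside the region
      if (s.pure_slp) continue;                   // absorbed into vector code
      if (s.is_phi) { found = true; break; }      // do not walk through PHIs
      // Otherwise the use is only dead if everything it feeds is dead too.
      if (s.lhs < 0 || has_scalar_use(p, s.lhs, cache)) { found = true; break; }
    }
  }
  cache[def] = found;
  return found;
}

// SSA names from the example in the description.
constexpr int A_C = 0, B_C = 1, A_S = 2, A_I = 3, B_I = 4, R_I = 5;

Program make_example() {
  Program p;
  p.stmts = {
    {true, false, false, A_S},  // a_s = (unsigned short) a_c
    {true, false, false, A_I},  // a_i = (int) a_s
    {true, false, false, B_I},  // b_i = (int) b_c
    {true, true,  false, R_I},  // r_i = a_i - b_i  (vectorized pattern root)
  };
  p.uses[A_C] = {0};
  p.uses[A_S] = {1};
  p.uses[B_C] = {2};
  p.uses[A_I] = {3};
  p.uses[B_I] = {3};
  return p;
}
```

With an empty cache, a_c and b_c are reported as having no scalar use,
because every chain of uses ends in the vectorized r_i statement. If a_i is
pre-seeded as true, which is how the sketch models an external SLP node
holding a_i as an element, a_c is reported as live again while b_c is not.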