Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]
Message ID | DM4PR11MB5487BDBFCAA75C00EAFBDE3EEC6D9@DM4PR11MB5487.namprd11.prod.outlook.com |
---|---|
State | New, archived |
Headers |
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6a10:38f:b0:2d5:3c95:9e21 with SMTP id 15csp95847pxh; Wed, 17 Aug 2022 23:49:00 -0700 (PDT) X-Google-Smtp-Source: AA6agR4uRS9ViXmAz6x2WkiF7WsULEHQeaRODNWRfiOC3TSqqqFMQBQpFQnBQr/xOM4bj8/tvimS X-Received: by 2002:a05:6402:40d0:b0:43f:8f56:6b0e with SMTP id z16-20020a05640240d000b0043f8f566b0emr1155816edb.380.1660805340724; Wed, 17 Aug 2022 23:49:00 -0700 (PDT) Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id x8-20020a05640225c800b0044603506bddsi673852edb.137.2022.08.17.23.49.00 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Aug 2022 23:49:00 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=tRnRT3VH; arc=fail (signature failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4BB72385841D for <ouuuleilei@gmail.com>; Thu, 18 Aug 2022 06:48:59 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4BB72385841D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1660805339; bh=5xu6AX7BxgHg95k61EiX9Vy8uPGUkJFGP12HTZLAjZ4=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=tRnRT3VH2uabi/NOzDq0ttAuJ9o9q2eU66RZLFPtSSBnG3pLemTaRaIuM0tkq4uJk R/bWjY3IKKuHDxwrTt0qdyWcA19tqbzKcznA7KKjSgDUAfThx1fdYfw0wnoIyZKnYg oYUqPe2CmPiRvBoM/y2RM9+jsT7cbiSjqaRp2sr8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id 6C92E3858CDA for <gcc-patches@gcc.gnu.org>; Thu, 18 Aug 2022 06:48:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6C92E3858CDA X-IronPort-AV: E=McAfee;i="6500,9779,10442"; a="291434348" X-IronPort-AV: E=Sophos;i="5.93,245,1654585200"; d="scan'208";a="291434348" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Aug 2022 23:47:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,245,1654585200"; d="scan'208";a="749992885" Received: from fmsmsx603.amr.corp.intel.com ([10.18.126.83]) by fmsmga001.fm.intel.com with ESMTP; 17 Aug 2022 23:47:56 -0700 Received: from fmsmsx611.amr.corp.intel.com (10.18.126.91) by fmsmsx603.amr.corp.intel.com (10.18.126.83) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Wed, 17 Aug 2022 23:47:56 -0700 Received: from fmsmsx603.amr.corp.intel.com (10.18.126.83) by fmsmsx611.amr.corp.intel.com (10.18.126.91) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Wed, 17 Aug 2022 23:47:55 -0700 Received: from fmsedg601.ED.cps.intel.com (10.1.192.135) by fmsmsx603.amr.corp.intel.com (10.18.126.83) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28 via Frontend Transport; Wed, 17 Aug 2022 23:47:55 -0700 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (104.47.59.175) by edgegateway.intel.com (192.55.55.70) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2375.28; Wed, 17 Aug 2022 23:47:54 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=kz537lnN5x/JxIfQz31j3J0tnThqPHMv0No/8O8xaLdTlyOtDxeru0iB2TyQp1FqMpFxkxtQLSNdiem6WwldhMJJnMXtJkunZ8CmG+Y+ZbPgGoi5MUt21Dfgvgbxn0lJoMRagorr2lx75ln1A+20gxlsccamW57zoSCRubJ88x+Gm9zSCzaIMqAIMU4BVcP/4G/5Ip+/fCuNlEBPy61/E+UWvdgu22XH64RgFPlKd/0hKN9eZ6e75mtIlDdjEwDurfopsaquoJxW22h4izCkCVy6STvPWOCuYchiSkF7l6GquVe7ZFKt+8GYwAEBtCMSPXtFr6keQ37+eSPXTpYV7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=5xu6AX7BxgHg95k61EiX9Vy8uPGUkJFGP12HTZLAjZ4=; b=cdTuedTM2J+K/k02MGSMNM/mHdUFPd4C6PxIbO72CUaPa1GpJbKoe+q0L/fnsw2roLGjQMJWDDayB6g40/qawRrWXmkXZ/i3rvskHI8W9cqUteaS0vm8nFkOjTAZtd8e+j4O6wUCitBSsS8oL8GM0CS6NbcoLLd7T1E8tKzmyhX3JywvYicGUXg6wGSlWvNKeDhKYxxdmRh7Tvil3FB3qQNp6JEF5qIWhzBU7ZXBs9VrvUjuu84hGz3bdDbccHMIV2hSB85LjpddNWMjEXizCya6AjsYd3+ELrQbEZ+u/TDJzXWsn0ay/tbBdOsB6JO4xyMLFISZHWLmhOxH4nqQnQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none Received: from DM4PR11MB5487.namprd11.prod.outlook.com (2603:10b6:5:39f::22) by SN6PR11MB2574.namprd11.prod.outlook.com (2603:10b6:805:59::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5525.19; Thu, 18 Aug 2022 06:47:52 +0000 Received: from DM4PR11MB5487.namprd11.prod.outlook.com ([fe80::2974:d93b:a589:f189]) by DM4PR11MB5487.namprd11.prod.outlook.com ([fe80::2974:d93b:a589:f189%9]) with mapi id 15.20.5525.011; Thu, 18 Aug 2022 06:47:52 +0000 To: "Liu, Hongtao" <hongtao.liu@intel.com>, "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org> Subject: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735] Thread-Topic: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735] Thread-Index: AdiyzcjTCPng9VotSJ+bADYz6wSZJQ== Date: Thu, 18 Aug 2022 06:47:52 +0000 Message-ID: <DM4PR11MB5487BDBFCAA75C00EAFBDE3EEC6D9@DM4PR11MB5487.namprd11.prod.outlook.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 11.6.500.17 dlp-reaction: no-action x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: c7b3a66b-8cc9-4849-8004-08da80e592bf x-ms-traffictypediagnostic: SN6PR11MB2574:EE_ x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: hmxas/hICLXEdXrwjieOCruEJpuNSmjcFGNRh6HPXX7CZPtR0l1o3Bk8pv5Kk0zJi4lRJ92lUlSUMUDUckmMr0EE7Q+CMC/uWPTfv1reY0/onZSfeV9QKuDe1M4efuaQ22V6PGPB5dWzdBRpMK2CIIgkfuCKVLWeSjHrtM2XT90bUQZgajYRAXx5lZbTba7VLmg4QvTwuZc6KR3/01dWfJGdNEyjRYhH6PjF0yrqxTjzwu4T74QXzvid1/PTsqhnyYZ4Y6kYuJMnca6yLP+7tUeWvn+rcgbOCnWhPkTct5r1JuaFsvapKzw7j1tWAeJS6r6WWANeC9e9YqSPmOvQaoOhwBshW3ggQ/AnsMS11dBa4UfDGkh9+7nQqlEaHpSFq0X754e+9H7yN0YwT8JelAYOId7hsRTHl0QCgwHGpkMNLPkv0JSIP3Z3o4LOxcAIeTeWaewFxmT96kHn1GqCfZQoAWoeV9/IJcunuTXShuT2w+uK89geHiRY7dpNBeu5pZ0mEVunI6ddZiCTmtfuku40TOiffqYoHZwlu2d/x0SIPPEUZfj9Xc6V1AUnh2VveluqjOj1WsFONLZ90xlxb5JOf2zd4RX+49NmjfdD4yAgpjfnkKhnuY3xkgQ30VBnPphBHq80AxnDYbP3AhXhqNGVzOiMERd6iBdQra8EmhiWEmAxy2A6aOk+rik/hNXSu76hGwV8g1BainbaAWESHkIVqfO1o/RIopga1T+bpMZSt7v9wGiFgbn6NQRNKXllbqljilxhdYOrNA0aCaLhbVuhy6/KJdCg73u0IKcJtiA= x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DM4PR11MB5487.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230016)(136003)(39860400002)(366004)(346002)(396003)(376002)(9686003)(82960400001)(186003)(84970400001)(6506007)(7696005)(33656002)(26005)(86362001)(38070700005)(38100700002)(122000001)(55016003)(71200400001)(5660300002)(8676002)(478600001)(66946007)(4326008)(8936002)(76116006)(52536014)(66476007)(2906002)(66556008)(64756008)(66446008)(107886003)(41300700001)(316002)(110136005); DIR:OUT; SFP:1102; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: bw2qyofADnYt+sxyXy9V4eEHBMwnAkvt2P8EpEH8B1h4Y4jxjVnFPrprQlIFJ716a2EUv5bHFpLbXGgJz3cIzZDylHiyvoB5WW2q7m4x7sc8P+hu3i3GwCwCVudHNOtzLacoAU68IKENeai8Ei+hQpzSpxenzUQhtQPKIkmwE02kZleDXepRjapSKBmaxhZE2xM4Uppk8N724QKrp2r+R+yQ1dGIo6y1yYTA2dFPx8FhCw7P0NTkHfnV4X2BWOqW1t/vWfB8FpfQzDArysZyS50Gx7/pg5i+WHhejv5R12GKQURAkFOmy2STdMlfrD+AOycBNvKRVQLfbnYD0dhKzgnnAmZC5Hu175ClmR4ovAvo5NMkwpRc+6reXZ7jQeu4P2tgcnNOXaqkNBo1P70iHgHrwvP1Z/m8F24DxnW4ZbX+BkWcN3pzfFbTawnQRo3a8kHml3lXds4er0ElXmKgLjaw0fia3N9Dt3tipzGfxb2VBsKdZb4mXJls0grFzMBlhTs06GeSwUTJZLgleEYqoFv7p5D4gne3r3tZz4gqF1x/YGODrpAvokulYb6JZ8AVpNXMXQ3GBtA9VNF5ksi2cCmjVdGZdTbu7OLW3UlKGZRoc3st/+QAd4dINw3y0cDDkxR6SGl8O8bjoB0GVdkuRh44AQlcNAd5bmbBskSj3y11OPq0/pw82WvoyzuIuCC2PbR7XK3gYo3oGG8DdM8+Kr8wnU8xfDS+TiDbXaa1NQtZTKcWqEupHlCpy4ecO5BoQkcXf9zO0kT5Ts298oWRk0W+tPrkUs7+jVCUJEKrWM9CGYCM/bqUfPPCmBzsTcH+8EIQH0qi978vKf0C7NzszIJ8TuWr7djLD17qJfnxHMccwnKsEvoLOV3JXGkitGs3lJf/MpdrICSDjVKZEc0SNOF0UOSslX62vacMVZgXhmsb1OPmLwmzx5FIgGJ+ZHEZCq9EzxRfEZZmN91rQP7jp397dNj1fUMMQlFlFs+tvkT8mNwtKy+YsQI1iOa8RbMe+iqApKvMBEEVQkW50xRjzbFHxqsKMU4LcZScZDAoSqIESdGuGsx+DXoIgdK6Ib2hjyAB9Jy9T3pi9F1u7qkNa8Sj4WWGxu6g5dOn47PeqWAimbax576Qf7EXVJ1aVd9N9oSGULbPMYn8JhWlBofFP1i6jrYybuUbHeguURrV67fBSWdJ6nq0aB3q0QG3Akc9650OoTZDPFLLkkaSQHyDA5mBR0j5ka2cenwmL+ebSvY/PCwm483442AohX/R0MjMhBXYJWuKxVvsjtKeOEHGlDH1pykctwXizyNeGdOhGlbLSft8Ym9SYgAq0jkiKtVvIXRZHhputu1WYyWUm8e1DQG0cxeDma66tfkmqJHIKj/rt77T6DshejeMmTcGTXcf7WEznyNZyS7Qg4cc5pbQ8MnI/dIOKNxnU+L4/ELD0jgTls8PaEuN5U/BpG+/iyJW0TjbdPCXvvHU5JTuiexQZuXPYGH2uqJdFvUEQatjjMwDlxTe443b0mrWwHTM0Vy0D6AD6HnXzURHZbZ729TC7k6FyWMRsaEVij45DkZrgpsdE1vX6zK+gogeZ7CBfj1E Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: DM4PR11MB5487.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: c7b3a66b-8cc9-4849-8004-08da80e592bf X-MS-Exchange-CrossTenant-originalarrivaltime: 18 Aug 2022 06:47:52.4080 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: YcUKKjy6FHHTdI87fLBkd0pZ5MKkgdEMEXkXWYFTji5o/HvWJoUNhd9lHiYrs7SkSgWteqex8ahKu6E0sp3OwA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN6PR11MB2574 X-OriginatorOrg: intel.com X-Spam-Status: No, score=-13.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: "Kong, Lingling via Gcc-patches" <gcc-patches@gcc.gnu.org> Reply-To: "Kong, Lingling" <lingling.kong@intel.com> Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1741480621190727793?= X-GMAIL-MSGID: =?utf-8?q?1741480621190727793?= |
Series |
Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]
|
|
Commit Message
Li, Pan2 via Gcc-patches
Aug. 18, 2022, 6:47 a.m. UTC
Hi, This patch is for pr105735/pr101991. It will enable below optimization: { - long unsigned int bit; - - <bb 2> [local count: 32534376]: - - <bb 3> [local count: 1041207449]: - # tmp_10 = PHI <tmp_7(5), tmp_4(D)(2)> - # bit_12 = PHI <bit_8(5), 0(2)> - tmp_7 = bit2_6(D) & tmp_10; - bit_8 = bit_12 + 1; - if (bit_8 != 32) - goto <bb 5>; [96.97%] - else - goto <bb 4>; [3.03%] - - <bb 5> [local count: 1009658865]: - goto <bb 3>; [100.00%] - - <bb 4> [local count: 32534376]: - # tmp_11 = PHI <tmp_7(3)> - return tmp_11; + tmp_11 = tmp_4(D) & bit2_6(D); + return tmp_11; } Ok for master ? gcc/ChangeLog: PR middle-end/105735 * match.pd (bitop_with_inv_p): New match. * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare. (analyze_and_compute_bitop_with_inv_effect): New function. (final_value_replacement_loop): Enhanced to handle bitop with inv induction. gcc/testsuite/ChangeLog: * gcc.target/i386/pr105735-1.c: New test. * gcc.target/i386/pr105735-2.c: New test. --- gcc/match.pd | 4 + gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 ++++++++++++++++++++++ gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++ gcc/tree-scalar-evolution.cc | 59 +++++++++++++++ 4 files changed, 179 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-1.c create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c def = compute_overall_effect_of_inner_loop (ex_loop, def); + /* Handle bitop with invariant induction expression. + + .i.e + for (int i =0 ;i < 32; i++) + tmp &= bit2; + if bit2 is an invariant in loop which could simple to + tmp &= bit2. */ + tree bitinv_def; + if ((bitinv_def + = analyze_and_compute_bitop_with_inv_effect (loop, phidef, niter))) + def = bitinv_def; + /* Handle bitwise induction expression. .i.e. -- 2.18.2
Comments
Hi Richard, could you help to have a look for the patch ? > Hi, > > This patch is for pr105735/pr101991. It will enable below optimization: > { > - long unsigned int bit; > - > - <bb 2> [local count: 32534376]: > - > - <bb 3> [local count: 1041207449]: > - # tmp_10 = PHI <tmp_7(5), tmp_4(D)(2)> > - # bit_12 = PHI <bit_8(5), 0(2)> > - tmp_7 = bit2_6(D) & tmp_10; > - bit_8 = bit_12 + 1; > - if (bit_8 != 32) > - goto <bb 5>; [96.97%] > - else > - goto <bb 4>; [3.03%] > - > - <bb 5> [local count: 1009658865]: > - goto <bb 3>; [100.00%] > - > - <bb 4> [local count: 32534376]: > - # tmp_11 = PHI <tmp_7(3)> > - return tmp_11; > + tmp_11 = tmp_4(D) & bit2_6(D); > + return tmp_11; > > } > > Ok for master ? > > gcc/ChangeLog: > > PR middle-end/105735 > * match.pd (bitop_with_inv_p): New match. > * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare. > (analyze_and_compute_bitop_with_inv_effect): New function. > (final_value_replacement_loop): Enhanced to handle bitop > with inv induction. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr105735-1.c: New test. > * gcc.target/i386/pr105735-2.c: New test. > --- > gcc/match.pd | 4 + > gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 ++++++++++++++++++++++ > gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++ > gcc/tree-scalar-evolution.cc | 59 +++++++++++++++ > 4 files changed, 179 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c > > diff --git a/gcc/match.pd b/gcc/match.pd index 562138a8034..cfe593ebb02 > 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -8050,6 +8050,10 @@ and, > (bit_not > (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 @2)) @3)))) > > +(for bit_op (bit_and bit_ior bit_xor) > + (match (bitop_with_inv_p @0 @1) > + (bit_op:c @0 @1))) > + > /* n - (((n > C1) ? n : C1) & -C2) -> n & C1 for unsigned case. > n - (((n > C1) ? n : C1) & -C2) -> (n <= C1) ? n : (n & C1) for signed case. */ > (simplify diff --git a/gcc/testsuite/gcc.target/i386/pr105735-1.c > b/gcc/testsuite/gcc.target/i386/pr105735-1.c > new file mode 100644 > index 00000000000..8d2123ed351 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c > @@ -0,0 +1,88 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */ > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" > +} } */ > + > +unsigned long long > +__attribute__((noipa)) > +foo (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 64; bit++) > + tmp &= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo1 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 63; bit >= 0; bit -=3) > + tmp &= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo2 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 64; bit++) > + tmp |= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo3 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 63; bit >= 0; bit -=3) > + tmp |= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo4 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 64; bit++) > + tmp ^= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo5 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 63; bit++) > + tmp ^= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +f (unsigned long long tmp, long long bit, unsigned long long bit2) { > + unsigned long long res = tmp; > + for (long long i = 0; i < bit; i++) > + res &= bit2; > + return res; > +} > + > +unsigned long long > +__attribute__((noipa)) > +f1 (unsigned long long tmp, long long bit, unsigned long long bit2) { > + unsigned long long res = tmp; > + for (long long i = 0; i < bit; i++) > + res |= bit2; > + return res; > +} > + > +unsigned long long > +__attribute__((noipa)) > +f2 (unsigned long long tmp, long long bit, unsigned long long bit2) { > + unsigned long long res = tmp; > + for (long long i = 0; i < bit; i++) > + res ^= bit2; > + return res; > +} > + > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c > b/gcc/testsuite/gcc.target/i386/pr105735-2.c > new file mode 100644 > index 00000000000..79c1d300b1b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c > @@ -0,0 +1,28 @@ > +/* { dg-do run } */ > +/* { dg-options "-O1" } */ > + > +#include "pr105735-1.c" > + > +int main() > +{ > + unsigned long long tmp = 0x1101101ULL; > + unsigned long long bit2 = 0x1111111011110111ULL; > + if (foo (tmp, bit2) != 0x1100101ULL) > + __builtin_abort (); > + if (foo1 (tmp, bit2) != 0x1100101ULL) > + __builtin_abort (); > + if (foo2 (tmp, bit2) != 0x1111111011111111ULL) > + __builtin_abort (); > + if (foo3 (tmp, bit2) != 0x1111111011111111ULL) > + __builtin_abort (); > + if (foo4 (tmp, bit2) != 0x1101101ULL) > + __builtin_abort (); > + if (foo5 (tmp, bit2) != 0x1111111010011010ULL) > + __builtin_abort (); > + if (f (tmp, 64, bit2) != 0x1100101ULL) > + __builtin_abort (); > + if (f1 (tmp, 64, bit2) != 0x1111111011111111ULL) > + __builtin_abort (); > + if (f2 (tmp, 64, bit2) != 0x1101101ULL) > + __builtin_abort (); > +} > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc index > fc59d035b19..81220f08d2e 100644 > --- a/gcc/tree-scalar-evolution.cc > +++ b/gcc/tree-scalar-evolution.cc > @@ -3635,6 +3635,53 @@ enum bit_op_kind > return fold_build2 (code1, type, inv, wide_int_to_tree (type, bits)); } > > +/* Match.pd function to match bitop with invariant expression > + .i.e. > + tmp_7 = _0 & _1; */ > +extern bool gimple_bitop_with_inv_p (tree, tree *, tree (*)(tree)); > + > +/* Return the inductive expression of bitop with invariant if possible, > + otherwise returns DEF. */ > +static tree > +analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef, > + tree niter) > +{ > + tree match_op[2],inv; > + tree type = TREE_TYPE (phidef); > + gphi* header_phi = NULL; > + /* match thing like op0 (match[0]), op1(match[1]), phidef(PHIDEF) > + > + op1 = PHI <phidef, inv> > + phidef = op0 & op1 > + if op0 is an invariant, it could change to > + phidef = op0 & inv. */ > + if (!gimple_bitop_with_inv_p (phidef, &match_op[0], NULL) > + || TREE_CODE (match_op[1]) != SSA_NAME > + || !expr_invariant_in_loop_p (loop, match_op[0]) > + || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1]))) > + || gimple_phi_num_args (header_phi) != 2) > + return NULL_TREE; > + > + if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef) > + return NULL_TREE; > + > + enum tree_code code1 > + = gimple_assign_rhs_code (SSA_NAME_DEF_STMT (phidef)); > + > + if (code1 == BIT_XOR_EXPR) > + { > + if (!tree_fits_uhwi_p (niter)) > + return NULL_TREE; > + unsigned HOST_WIDE_INT niter_num; > + niter_num = tree_to_uhwi (niter); > + if (niter_num % 2 != 0) > + match_op[0] = build_zero_cst (type); > + } > + > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge (loop)); > + return fold_build2 (code1, type, inv, match_op[0]); } > + > /* Do final value replacement for LOOP, return true if we did anything. */ > > bool > @@ -3687,6 +3734,18 @@ final_value_replacement_loop (class loop *loop) > &folded_casts); > def = compute_overall_effect_of_inner_loop (ex_loop, def); > > + /* Handle bitop with invariant induction expression. > + > + .i.e > + for (int i =0 ;i < 32; i++) > + tmp &= bit2; > + if bit2 is an invariant in loop which could simple to > + tmp &= bit2. */ > + tree bitinv_def; > + if ((bitinv_def > + = analyze_and_compute_bitop_with_inv_effect (loop, phidef, niter))) > + def = bitinv_def; > + > /* Handle bitwise induction expression. > > .i.e. > -- > 2.18.2
On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi, > > This patch is for pr105735/pr101991. It will enable below optimization: > { > - long unsigned int bit; > - > - <bb 2> [local count: 32534376]: > - > - <bb 3> [local count: 1041207449]: > - # tmp_10 = PHI <tmp_7(5), tmp_4(D)(2)> > - # bit_12 = PHI <bit_8(5), 0(2)> > - tmp_7 = bit2_6(D) & tmp_10; > - bit_8 = bit_12 + 1; > - if (bit_8 != 32) > - goto <bb 5>; [96.97%] > - else > - goto <bb 4>; [3.03%] > - > - <bb 5> [local count: 1009658865]: > - goto <bb 3>; [100.00%] > - > - <bb 4> [local count: 32534376]: > - # tmp_11 = PHI <tmp_7(3)> > - return tmp_11; > + tmp_11 = tmp_4(D) & bit2_6(D); > + return tmp_11; > > } > > Ok for master ? > > gcc/ChangeLog: > > PR middle-end/105735 > * match.pd (bitop_with_inv_p): New match. > * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare. > (analyze_and_compute_bitop_with_inv_effect): New function. > (final_value_replacement_loop): Enhanced to handle bitop > with inv induction. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr105735-1.c: New test. > * gcc.target/i386/pr105735-2.c: New test. > --- > gcc/match.pd | 4 + > gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 ++++++++++++++++++++++ gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++ > gcc/tree-scalar-evolution.cc | 59 +++++++++++++++ > 4 files changed, 179 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c > > diff --git a/gcc/match.pd b/gcc/match.pd index 562138a8034..cfe593ebb02 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -8050,6 +8050,10 @@ and, > (bit_not > (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 @2)) @3)))) > > +(for bit_op (bit_and bit_ior bit_xor) > + (match (bitop_with_inv_p @0 @1) > + (bit_op:c @0 @1))) > + That's a single-stmt match, you shouldn't use match.pd matching for this. Instead just do if (is_gimple_assign (stmt) && ((code = gimple_assign_rhs_code (stmt)), true) && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == BIT_XOR_EXPR)) .. and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in bit_op:c is redundant btw. - while the name suggests "with invariant" you don't actually check for that. But again, given canonicalization rules the invariant will be rhs2 so above add && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST for example. > /* n - (((n > C1) ? n : C1) & -C2) -> n & C1 for unsigned case. > n - (((n > C1) ? n : C1) & -C2) -> (n <= C1) ? n : (n & C1) for signed case. */ (simplify diff --git a/gcc/testsuite/gcc.target/i386/pr105735-1.c b/gcc/testsuite/gcc.target/i386/pr105735-1.c > new file mode 100644 > index 00000000000..8d2123ed351 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c > @@ -0,0 +1,88 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */ > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" > +} } */ > + > +unsigned long long > +__attribute__((noipa)) > +foo (unsigned long long tmp, unsigned long long bit2) { you probably need dg-require-effective-target longlong, but is it necessary to use long long for the testcases in the first place? The IV seems to be unused, if it should match the variables bit size use sizeof (type) * 8 > + for (int bit = 0; bit < 64; bit++) > + tmp &= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo1 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 63; bit >= 0; bit -=3) > + tmp &= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo2 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 64; bit++) > + tmp |= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo3 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 63; bit >= 0; bit -=3) > + tmp |= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo4 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 64; bit++) > + tmp ^= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +foo5 (unsigned long long tmp, unsigned long long bit2) { > + for (int bit = 0; bit < 63; bit++) > + tmp ^= bit2; > + return tmp; > +} > + > +unsigned long long > +__attribute__((noipa)) > +f (unsigned long long tmp, long long bit, unsigned long long bit2) { > + unsigned long long res = tmp; > + for (long long i = 0; i < bit; i++) > + res &= bit2; > + return res; > +} > + > +unsigned long long > +__attribute__((noipa)) > +f1 (unsigned long long tmp, long long bit, unsigned long long bit2) { > + unsigned long long res = tmp; > + for (long long i = 0; i < bit; i++) > + res |= bit2; > + return res; > +} > + > +unsigned long long > +__attribute__((noipa)) > +f2 (unsigned long long tmp, long long bit, unsigned long long bit2) { > + unsigned long long res = tmp; > + for (long long i = 0; i < bit; i++) > + res ^= bit2; > + return res; > +} > + > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c b/gcc/testsuite/gcc.target/i386/pr105735-2.c > new file mode 100644 > index 00000000000..79c1d300b1b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c > @@ -0,0 +1,28 @@ > +/* { dg-do run } */ > +/* { dg-options "-O1" } */ > + > +#include "pr105735-1.c" > + > +int main() > +{ > + unsigned long long tmp = 0x1101101ULL; > + unsigned long long bit2 = 0x1111111011110111ULL; > + if (foo (tmp, bit2) != 0x1100101ULL) > + __builtin_abort (); > + if (foo1 (tmp, bit2) != 0x1100101ULL) > + __builtin_abort (); > + if (foo2 (tmp, bit2) != 0x1111111011111111ULL) > + __builtin_abort (); > + if (foo3 (tmp, bit2) != 0x1111111011111111ULL) > + __builtin_abort (); > + if (foo4 (tmp, bit2) != 0x1101101ULL) > + __builtin_abort (); > + if (foo5 (tmp, bit2) != 0x1111111010011010ULL) > + __builtin_abort (); > + if (f (tmp, 64, bit2) != 0x1100101ULL) > + __builtin_abort (); > + if (f1 (tmp, 64, bit2) != 0x1111111011111111ULL) > + __builtin_abort (); > + if (f2 (tmp, 64, bit2) != 0x1101101ULL) > + __builtin_abort (); > +} > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc index fc59d035b19..81220f08d2e 100644 > --- a/gcc/tree-scalar-evolution.cc > +++ b/gcc/tree-scalar-evolution.cc > @@ -3635,6 +3635,53 @@ enum bit_op_kind > return fold_build2 (code1, type, inv, wide_int_to_tree (type, bits)); } > > +/* Match.pd function to match bitop with invariant expression > + .i.e. > + tmp_7 = _0 & _1; */ > +extern bool gimple_bitop_with_inv_p (tree, tree *, tree (*)(tree)); > + > +/* Return the inductive expression of bitop with invariant if possible, > + otherwise returns DEF. */ > +static tree > +analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef, > + tree niter) > +{ > + tree match_op[2],inv; > + tree type = TREE_TYPE (phidef); > + gphi* header_phi = NULL; > + /* match thing like op0 (match[0]), op1(match[1]), phidef(PHIDEF) > + > + op1 = PHI <phidef, inv> > + phidef = op0 & op1 > + if op0 is an invariant, it could change to > + phidef = op0 & inv. */ > + if (!gimple_bitop_with_inv_p (phidef, &match_op[0], NULL) > + || TREE_CODE (match_op[1]) != SSA_NAME > + || !expr_invariant_in_loop_p (loop, match_op[0]) > + || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1]))) > + || gimple_phi_num_args (header_phi) != 2) > + return NULL_TREE; > + > + if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef) > + return NULL_TREE; > + > + enum tree_code code1 > + = gimple_assign_rhs_code (SSA_NAME_DEF_STMT (phidef)); > + > + if (code1 == BIT_XOR_EXPR) > + { > + if (!tree_fits_uhwi_p (niter)) > + return NULL_TREE; > + unsigned HOST_WIDE_INT niter_num; > + niter_num = tree_to_uhwi (niter); > + if (niter_num % 2 != 0) > + match_op[0] = build_zero_cst (type); > + } > + > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge (loop)); > + return fold_build2 (code1, type, inv, match_op[0]); } The } goes to the next line. > + > /* Do final value replacement for LOOP, return true if we did anything. */ > > bool > @@ -3687,6 +3734,18 @@ final_value_replacement_loop (class loop *loop) > &folded_casts); > def = compute_overall_effect_of_inner_loop (ex_loop, def); > > + /* Handle bitop with invariant induction expression. > + > + .i.e > + for (int i =0 ;i < 32; i++) > + tmp &= bit2; > + if bit2 is an invariant in loop which could simple to > + tmp &= bit2. */ > + tree bitinv_def; > + if ((bitinv_def please use else if here otherwise looks OK. > + = analyze_and_compute_bitop_with_inv_effect (loop, phidef, niter))) > + def = bitinv_def; > + > /* Handle bitwise induction expression. > > .i.e. > -- > 2.18.2 >
Hi Richard, Thanks you so much for reviewing this patch. I really appreciate it. For these review comments, I have made some changes. > That's a single-stmt match, you shouldn't use match.pd matching for this. > Instead just do > > if (is_gimple_assign (stmt) > && ((code = gimple_assign_rhs_code (stmt)), true) > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > BIT_XOR_EXPR)) Yes, I fixed it and dropped modification for match.pd. > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in bit_op:c is > redundant btw. - while the name suggests "with invariant" you don't actually > check for that. But again, given canonicalization rules the invariant will be rhs2 > so above add > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST For " with invariant", this needed op1 is invariant, and I used `expr_invariant_in_loop_p (loop, match_op[0])` for check. And op2 just be PHI is ok. If op2 is INTEGER_CST, existing gcc can be directly optimized and do not need modification. > you probably need dg-require-effective-target longlong, but is it necessary to > use long long for the testcases in the first place? > The IV seems to be unused, if it should match the variables bit size use sizeof > (type) * 8 Yes, It is not necessary to use long long for the testcases. I changed type to unsigned int. > > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge > > + (loop)); return fold_build2 (code1, type, inv, match_op[0]); } > > The } goes to the next line. Sorry, It might be something wrong with my use of gcc send-email format. > > + tree bitinv_def; > > + if ((bitinv_def > > please use else if here Sorry, If use the else if here, there is no corresponding above if. I'm not sure if you mean change bitwise induction expression if to else if. Do you agree with these changes? Thanks again for taking a look. Thanks, Lingling > -----Original Message----- > From: Richard Biener <richard.guenther@gmail.com> > Sent: Tuesday, August 23, 2022 3:27 PM > To: Kong, Lingling <lingling.kong@intel.com> > Cc: Liu, Hongtao <hongtao.liu@intel.com>; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle bitop > with an invariant induction.[PR105735] > > On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches <gcc- > patches@gcc.gnu.org> wrote: > > > > Hi, > > > > This patch is for pr105735/pr101991. It will enable below optimization: > > { > > - long unsigned int bit; > > - > > - <bb 2> [local count: 32534376]: > > - > > - <bb 3> [local count: 1041207449]: > > - # tmp_10 = PHI <tmp_7(5), tmp_4(D)(2)> > > - # bit_12 = PHI <bit_8(5), 0(2)> > > - tmp_7 = bit2_6(D) & tmp_10; > > - bit_8 = bit_12 + 1; > > - if (bit_8 != 32) > > - goto <bb 5>; [96.97%] > > - else > > - goto <bb 4>; [3.03%] > > - > > - <bb 5> [local count: 1009658865]: > > - goto <bb 3>; [100.00%] > > - > > - <bb 4> [local count: 32534376]: > > - # tmp_11 = PHI <tmp_7(3)> > > - return tmp_11; > > + tmp_11 = tmp_4(D) & bit2_6(D); > > + return tmp_11; > > > > } > > > > Ok for master ? > > > > gcc/ChangeLog: > > > > PR middle-end/105735 > > * match.pd (bitop_with_inv_p): New match. > > * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare. > > (analyze_and_compute_bitop_with_inv_effect): New function. > > (final_value_replacement_loop): Enhanced to handle bitop > > with inv induction. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/i386/pr105735-1.c: New test. > > * gcc.target/i386/pr105735-2.c: New test. > > --- > > gcc/match.pd | 4 + > > gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 ++++++++++++++++++++++ > gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++ > > gcc/tree-scalar-evolution.cc | 59 +++++++++++++++ > > 4 files changed, 179 insertions(+) > > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-1.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > diff --git a/gcc/match.pd b/gcc/match.pd index > > 562138a8034..cfe593ebb02 100644 > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -8050,6 +8050,10 @@ and, > > (bit_not > > (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 @2)) > > @3)))) > > > > +(for bit_op (bit_and bit_ior bit_xor) (match (bitop_with_inv_p @0 > > +@1) > > + (bit_op:c @0 @1))) > > + > > That's a single-stmt match, you shouldn't use match.pd matching for this. > Instead just do > > if (is_gimple_assign (stmt) > && ((code = gimple_assign_rhs_code (stmt)), true) > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > BIT_XOR_EXPR)) > .. > > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in bit_op:c is > redundant btw. - while the name suggests "with invariant" you don't actually > check for that. But again, given canonicalization rules the invariant will be rhs2 > so above add > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST > > for example. > > > /* n - (((n > C1) ? n : C1) & -C2) -> n & C1 for unsigned case. > > n - (((n > C1) ? n : C1) & -C2) -> (n <= C1) ? n : (n & C1) for > > signed case. */ (simplify diff --git > > a/gcc/testsuite/gcc.target/i386/pr105735-1.c > > b/gcc/testsuite/gcc.target/i386/pr105735-1.c > > new file mode 100644 > > index 00000000000..8d2123ed351 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c > > @@ -0,0 +1,88 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */ > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" > > +} } */ > > + > > +unsigned long long > > +__attribute__((noipa)) > > +foo (unsigned long long tmp, unsigned long long bit2) { > > you probably need dg-require-effective-target longlong, but is it necessary to > use long long for the testcases in the first place? > The IV seems to be unused, if it should match the variables bit size use sizeof > (type) * 8 > > > + for (int bit = 0; bit < 64; bit++) > > + tmp &= bit2; > > + return tmp; > > +} > > + > > +unsigned long long > > +__attribute__((noipa)) > > +foo1 (unsigned long long tmp, unsigned long long bit2) { > > + for (int bit = 63; bit >= 0; bit -=3) > > + tmp &= bit2; > > + return tmp; > > +} > > + > > +unsigned long long > > +__attribute__((noipa)) > > +foo2 (unsigned long long tmp, unsigned long long bit2) { > > + for (int bit = 0; bit < 64; bit++) > > + tmp |= bit2; > > + return tmp; > > +} > > + > > +unsigned long long > > +__attribute__((noipa)) > > +foo3 (unsigned long long tmp, unsigned long long bit2) { > > + for (int bit = 63; bit >= 0; bit -=3) > > + tmp |= bit2; > > + return tmp; > > +} > > + > > +unsigned long long > > +__attribute__((noipa)) > > +foo4 (unsigned long long tmp, unsigned long long bit2) { > > + for (int bit = 0; bit < 64; bit++) > > + tmp ^= bit2; > > + return tmp; > > +} > > + > > +unsigned long long > > +__attribute__((noipa)) > > +foo5 (unsigned long long tmp, unsigned long long bit2) { > > + for (int bit = 0; bit < 63; bit++) > > + tmp ^= bit2; > > + return tmp; > > +} > > + > > +unsigned long long > > +__attribute__((noipa)) > > +f (unsigned long long tmp, long long bit, unsigned long long bit2) { > > + unsigned long long res = tmp; > > + for (long long i = 0; i < bit; i++) > > + res &= bit2; > > + return res; > > +} > > + > > +unsigned long long > > +__attribute__((noipa)) > > +f1 (unsigned long long tmp, long long bit, unsigned long long bit2) { > > + unsigned long long res = tmp; > > + for (long long i = 0; i < bit; i++) > > + res |= bit2; > > + return res; > > +} > > + > > +unsigned long long > > +__attribute__((noipa)) > > +f2 (unsigned long long tmp, long long bit, unsigned long long bit2) { > > + unsigned long long res = tmp; > > + for (long long i = 0; i < bit; i++) > > + res ^= bit2; > > + return res; > > +} > > + > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c > > b/gcc/testsuite/gcc.target/i386/pr105735-2.c > > new file mode 100644 > > index 00000000000..79c1d300b1b > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c > > @@ -0,0 +1,28 @@ > > +/* { dg-do run } */ > > +/* { dg-options "-O1" } */ > > + > > +#include "pr105735-1.c" > > + > > +int main() > > +{ > > + unsigned long long tmp = 0x1101101ULL; > > + unsigned long long bit2 = 0x1111111011110111ULL; > > + if (foo (tmp, bit2) != 0x1100101ULL) > > + __builtin_abort (); > > + if (foo1 (tmp, bit2) != 0x1100101ULL) > > + __builtin_abort (); > > + if (foo2 (tmp, bit2) != 0x1111111011111111ULL) > > + __builtin_abort (); > > + if (foo3 (tmp, bit2) != 0x1111111011111111ULL) > > + __builtin_abort (); > > + if (foo4 (tmp, bit2) != 0x1101101ULL) > > + __builtin_abort (); > > + if (foo5 (tmp, bit2) != 0x1111111010011010ULL) > > + __builtin_abort (); > > + if (f (tmp, 64, bit2) != 0x1100101ULL) > > + __builtin_abort (); > > + if (f1 (tmp, 64, bit2) != 0x1111111011111111ULL) > > + __builtin_abort (); > > + if (f2 (tmp, 64, bit2) != 0x1101101ULL) > > + __builtin_abort (); > > +} > > diff --git a/gcc/tree-scalar-evolution.cc > > b/gcc/tree-scalar-evolution.cc index fc59d035b19..81220f08d2e 100644 > > --- a/gcc/tree-scalar-evolution.cc > > +++ b/gcc/tree-scalar-evolution.cc > > @@ -3635,6 +3635,53 @@ enum bit_op_kind > > return fold_build2 (code1, type, inv, wide_int_to_tree (type, > > bits)); } > > > > +/* Match.pd function to match bitop with invariant expression > > + .i.e. > > + tmp_7 = _0 & _1; */ > > +extern bool gimple_bitop_with_inv_p (tree, tree *, tree (*)(tree)); > > + > > +/* Return the inductive expression of bitop with invariant if possible, > > + otherwise returns DEF. */ > > +static tree > > +analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef, > > + tree niter) { > > + tree match_op[2],inv; > > + tree type = TREE_TYPE (phidef); > > + gphi* header_phi = NULL; > > + /* match thing like op0 (match[0]), op1(match[1]), phidef(PHIDEF) > > + > > + op1 = PHI <phidef, inv> > > + phidef = op0 & op1 > > + if op0 is an invariant, it could change to > > + phidef = op0 & inv. */ > > + if (!gimple_bitop_with_inv_p (phidef, &match_op[0], NULL) > > + || TREE_CODE (match_op[1]) != SSA_NAME > > + || !expr_invariant_in_loop_p (loop, match_op[0]) > > + || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT > (match_op[1]))) > > + || gimple_phi_num_args (header_phi) != 2) > > + return NULL_TREE; > > + > > + if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != > phidef) > > + return NULL_TREE; > > + > > + enum tree_code code1 > > + = gimple_assign_rhs_code (SSA_NAME_DEF_STMT (phidef)); > > + > > + if (code1 == BIT_XOR_EXPR) > > + { > > + if (!tree_fits_uhwi_p (niter)) > > + return NULL_TREE; > > + unsigned HOST_WIDE_INT niter_num; > > + niter_num = tree_to_uhwi (niter); > > + if (niter_num % 2 != 0) > > + match_op[0] = build_zero_cst (type); > > + } > > + > > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge > > + (loop)); return fold_build2 (code1, type, inv, match_op[0]); } > > The } goes to the next line. > > > + > > /* Do final value replacement for LOOP, return true if we did > > anything. */ > > > > bool > > @@ -3687,6 +3734,18 @@ final_value_replacement_loop (class loop *loop) > > &folded_casts); > > def = compute_overall_effect_of_inner_loop (ex_loop, def); > > > > + /* Handle bitop with invariant induction expression. > > + > > + .i.e > > + for (int i =0 ;i < 32; i++) > > + tmp &= bit2; > > + if bit2 is an invariant in loop which could simple to > > + tmp &= bit2. */ > > + tree bitinv_def; > > + if ((bitinv_def > > please use else if here > > otherwise looks OK. > > > + = analyze_and_compute_bitop_with_inv_effect (loop, phidef, niter))) > > + def = bitinv_def; > > + > > /* Handle bitwise induction expression. > > > > .i.e. > > -- > > 2.18.2 > >
On Tue, Sep 13, 2022 at 9:54 AM Kong, Lingling <lingling.kong@intel.com> wrote: > > Hi Richard, > > Thanks you so much for reviewing this patch. I really appreciate it. For these review comments, I have made some changes. > > > That's a single-stmt match, you shouldn't use match.pd matching for this. > > Instead just do > > > > if (is_gimple_assign (stmt) > > && ((code = gimple_assign_rhs_code (stmt)), true) > > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > > BIT_XOR_EXPR)) > > Yes, I fixed it and dropped modification for match.pd. > > > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in bit_op:c is > > redundant btw. - while the name suggests "with invariant" you don't actually > > check for that. But again, given canonicalization rules the invariant will be rhs2 > > so above add > > > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST > > For " with invariant", this needed op1 is invariant, and I used `expr_invariant_in_loop_p (loop, match_op[0])` for check. > And op2 just be PHI is ok. If op2 is INTEGER_CST, existing gcc can be directly optimized and do not need modification. > > > you probably need dg-require-effective-target longlong, but is it necessary to > > use long long for the testcases in the first place? > > The IV seems to be unused, if it should match the variables bit size use sizeof > > (type) * 8 > > Yes, It is not necessary to use long long for the testcases. I changed type to unsigned int. > > > > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge > > > + (loop)); return fold_build2 (code1, type, inv, match_op[0]); } > > > > The } goes to the next line. > > Sorry, It might be something wrong with my use of gcc send-email format. > > > > + tree bitinv_def; > > > + if ((bitinv_def > > > > please use else if here > > Sorry, If use the else if here, there is no corresponding above if. I'm not sure if you mean change bitwise induction expression if to else if. Yes, use else if for the bitwise induction. Can you also make the new case conditional on 'def' (the compute_overall_effect_of_inner_loop) being chrec_dont_know? If that call produced something useful it will not be of either of the two special forms. Thus like if (def != chrec_dont_know) /* Already OK. */ ; else if ((bitinv_def = ...) .. else if (tree_fits_uhwi_p (niter) ... bitwise induction case...) ... ? Otherwise looks OK now. Thanks, Richard. > Do you agree with these changes? Thanks again for taking a look. > > Thanks, > Lingling > > > -----Original Message----- > > From: Richard Biener <richard.guenther@gmail.com> > > Sent: Tuesday, August 23, 2022 3:27 PM > > To: Kong, Lingling <lingling.kong@intel.com> > > Cc: Liu, Hongtao <hongtao.liu@intel.com>; gcc-patches@gcc.gnu.org > > Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle bitop > > with an invariant induction.[PR105735] > > > > On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches <gcc- > > patches@gcc.gnu.org> wrote: > > > > > > Hi, > > > > > > This patch is for pr105735/pr101991. It will enable below optimization: > > > { > > > - long unsigned int bit; > > > - > > > - <bb 2> [local count: 32534376]: > > > - > > > - <bb 3> [local count: 1041207449]: > > > - # tmp_10 = PHI <tmp_7(5), tmp_4(D)(2)> > > > - # bit_12 = PHI <bit_8(5), 0(2)> > > > - tmp_7 = bit2_6(D) & tmp_10; > > > - bit_8 = bit_12 + 1; > > > - if (bit_8 != 32) > > > - goto <bb 5>; [96.97%] > > > - else > > > - goto <bb 4>; [3.03%] > > > - > > > - <bb 5> [local count: 1009658865]: > > > - goto <bb 3>; [100.00%] > > > - > > > - <bb 4> [local count: 32534376]: > > > - # tmp_11 = PHI <tmp_7(3)> > > > - return tmp_11; > > > + tmp_11 = tmp_4(D) & bit2_6(D); > > > + return tmp_11; > > > > > > } > > > > > > Ok for master ? > > > > > > gcc/ChangeLog: > > > > > > PR middle-end/105735 > > > * match.pd (bitop_with_inv_p): New match. > > > * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare. > > > (analyze_and_compute_bitop_with_inv_effect): New function. > > > (final_value_replacement_loop): Enhanced to handle bitop > > > with inv induction. > > > > > > gcc/testsuite/ChangeLog: > > > > > > * gcc.target/i386/pr105735-1.c: New test. > > > * gcc.target/i386/pr105735-2.c: New test. > > > --- > > > gcc/match.pd | 4 + > > > gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 ++++++++++++++++++++++ > > gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++ > > > gcc/tree-scalar-evolution.cc | 59 +++++++++++++++ > > > 4 files changed, 179 insertions(+) > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-1.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > > > diff --git a/gcc/match.pd b/gcc/match.pd index > > > 562138a8034..cfe593ebb02 100644 > > > --- a/gcc/match.pd > > > +++ b/gcc/match.pd > > > @@ -8050,6 +8050,10 @@ and, > > > (bit_not > > > (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 @2)) > > > @3)))) > > > > > > +(for bit_op (bit_and bit_ior bit_xor) (match (bitop_with_inv_p @0 > > > +@1) > > > + (bit_op:c @0 @1))) > > > + > > > > That's a single-stmt match, you shouldn't use match.pd matching for this. > > Instead just do > > > > if (is_gimple_assign (stmt) > > && ((code = gimple_assign_rhs_code (stmt)), true) > > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > > BIT_XOR_EXPR)) > > .. > > > > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in bit_op:c is > > redundant btw. - while the name suggests "with invariant" you don't actually > > check for that. But again, given canonicalization rules the invariant will be rhs2 > > so above add > > > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST > > > > for example. > > > > > /* n - (((n > C1) ? n : C1) & -C2) -> n & C1 for unsigned case. > > > n - (((n > C1) ? n : C1) & -C2) -> (n <= C1) ? n : (n & C1) for > > > signed case. */ (simplify diff --git > > > a/gcc/testsuite/gcc.target/i386/pr105735-1.c > > > b/gcc/testsuite/gcc.target/i386/pr105735-1.c > > > new file mode 100644 > > > index 00000000000..8d2123ed351 > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c > > > @@ -0,0 +1,88 @@ > > > +/* { dg-do compile } */ > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */ > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" > > > +} } */ > > > + > > > +unsigned long long > > > +__attribute__((noipa)) > > > +foo (unsigned long long tmp, unsigned long long bit2) { > > > > you probably need dg-require-effective-target longlong, but is it necessary to > > use long long for the testcases in the first place? > > The IV seems to be unused, if it should match the variables bit size use sizeof > > (type) * 8 > > > > > + for (int bit = 0; bit < 64; bit++) > > > + tmp &= bit2; > > > + return tmp; > > > +} > > > + > > > +unsigned long long > > > +__attribute__((noipa)) > > > +foo1 (unsigned long long tmp, unsigned long long bit2) { > > > + for (int bit = 63; bit >= 0; bit -=3) > > > + tmp &= bit2; > > > + return tmp; > > > +} > > > + > > > +unsigned long long > > > +__attribute__((noipa)) > > > +foo2 (unsigned long long tmp, unsigned long long bit2) { > > > + for (int bit = 0; bit < 64; bit++) > > > + tmp |= bit2; > > > + return tmp; > > > +} > > > + > > > +unsigned long long > > > +__attribute__((noipa)) > > > +foo3 (unsigned long long tmp, unsigned long long bit2) { > > > + for (int bit = 63; bit >= 0; bit -=3) > > > + tmp |= bit2; > > > + return tmp; > > > +} > > > + > > > +unsigned long long > > > +__attribute__((noipa)) > > > +foo4 (unsigned long long tmp, unsigned long long bit2) { > > > + for (int bit = 0; bit < 64; bit++) > > > + tmp ^= bit2; > > > + return tmp; > > > +} > > > + > > > +unsigned long long > > > +__attribute__((noipa)) > > > +foo5 (unsigned long long tmp, unsigned long long bit2) { > > > + for (int bit = 0; bit < 63; bit++) > > > + tmp ^= bit2; > > > + return tmp; > > > +} > > > + > > > +unsigned long long > > > +__attribute__((noipa)) > > > +f (unsigned long long tmp, long long bit, unsigned long long bit2) { > > > + unsigned long long res = tmp; > > > + for (long long i = 0; i < bit; i++) > > > + res &= bit2; > > > + return res; > > > +} > > > + > > > +unsigned long long > > > +__attribute__((noipa)) > > > +f1 (unsigned long long tmp, long long bit, unsigned long long bit2) { > > > + unsigned long long res = tmp; > > > + for (long long i = 0; i < bit; i++) > > > + res |= bit2; > > > + return res; > > > +} > > > + > > > +unsigned long long > > > +__attribute__((noipa)) > > > +f2 (unsigned long long tmp, long long bit, unsigned long long bit2) { > > > + unsigned long long res = tmp; > > > + for (long long i = 0; i < bit; i++) > > > + res ^= bit2; > > > + return res; > > > +} > > > + > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c > > > b/gcc/testsuite/gcc.target/i386/pr105735-2.c > > > new file mode 100644 > > > index 00000000000..79c1d300b1b > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c > > > @@ -0,0 +1,28 @@ > > > +/* { dg-do run } */ > > > +/* { dg-options "-O1" } */ > > > + > > > +#include "pr105735-1.c" > > > + > > > +int main() > > > +{ > > > + unsigned long long tmp = 0x1101101ULL; > > > + unsigned long long bit2 = 0x1111111011110111ULL; > > > + if (foo (tmp, bit2) != 0x1100101ULL) > > > + __builtin_abort (); > > > + if (foo1 (tmp, bit2) != 0x1100101ULL) > > > + __builtin_abort (); > > > + if (foo2 (tmp, bit2) != 0x1111111011111111ULL) > > > + __builtin_abort (); > > > + if (foo3 (tmp, bit2) != 0x1111111011111111ULL) > > > + __builtin_abort (); > > > + if (foo4 (tmp, bit2) != 0x1101101ULL) > > > + __builtin_abort (); > > > + if (foo5 (tmp, bit2) != 0x1111111010011010ULL) > > > + __builtin_abort (); > > > + if (f (tmp, 64, bit2) != 0x1100101ULL) > > > + __builtin_abort (); > > > + if (f1 (tmp, 64, bit2) != 0x1111111011111111ULL) > > > + __builtin_abort (); > > > + if (f2 (tmp, 64, bit2) != 0x1101101ULL) > > > + __builtin_abort (); > > > +} > > > diff --git a/gcc/tree-scalar-evolution.cc > > > b/gcc/tree-scalar-evolution.cc index fc59d035b19..81220f08d2e 100644 > > > --- a/gcc/tree-scalar-evolution.cc > > > +++ b/gcc/tree-scalar-evolution.cc > > > @@ -3635,6 +3635,53 @@ enum bit_op_kind > > > return fold_build2 (code1, type, inv, wide_int_to_tree (type, > > > bits)); } > > > > > > +/* Match.pd function to match bitop with invariant expression > > > + .i.e. > > > + tmp_7 = _0 & _1; */ > > > +extern bool gimple_bitop_with_inv_p (tree, tree *, tree (*)(tree)); > > > + > > > +/* Return the inductive expression of bitop with invariant if possible, > > > + otherwise returns DEF. */ > > > +static tree > > > +analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef, > > > + tree niter) { > > > + tree match_op[2],inv; > > > + tree type = TREE_TYPE (phidef); > > > + gphi* header_phi = NULL; > > > + /* match thing like op0 (match[0]), op1(match[1]), phidef(PHIDEF) > > > + > > > + op1 = PHI <phidef, inv> > > > + phidef = op0 & op1 > > > + if op0 is an invariant, it could change to > > > + phidef = op0 & inv. */ > > > + if (!gimple_bitop_with_inv_p (phidef, &match_op[0], NULL) > > > + || TREE_CODE (match_op[1]) != SSA_NAME > > > + || !expr_invariant_in_loop_p (loop, match_op[0]) > > > + || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT > > (match_op[1]))) > > > + || gimple_phi_num_args (header_phi) != 2) > > > + return NULL_TREE; > > > + > > > + if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != > > phidef) > > > + return NULL_TREE; > > > + > > > + enum tree_code code1 > > > + = gimple_assign_rhs_code (SSA_NAME_DEF_STMT (phidef)); > > > + > > > + if (code1 == BIT_XOR_EXPR) > > > + { > > > + if (!tree_fits_uhwi_p (niter)) > > > + return NULL_TREE; > > > + unsigned HOST_WIDE_INT niter_num; > > > + niter_num = tree_to_uhwi (niter); > > > + if (niter_num % 2 != 0) > > > + match_op[0] = build_zero_cst (type); > > > + } > > > + > > > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge > > > + (loop)); return fold_build2 (code1, type, inv, match_op[0]); } > > > > The } goes to the next line. > > > > > + > > > /* Do final value replacement for LOOP, return true if we did > > > anything. */ > > > > > > bool > > > @@ -3687,6 +3734,18 @@ final_value_replacement_loop (class loop *loop) > > > &folded_casts); > > > def = compute_overall_effect_of_inner_loop (ex_loop, def); > > > > > > + /* Handle bitop with invariant induction expression. > > > + > > > + .i.e > > > + for (int i =0 ;i < 32; i++) > > > + tmp &= bit2; > > > + if bit2 is an invariant in loop which could simple to > > > + tmp &= bit2. */ > > > + tree bitinv_def; > > > + if ((bitinv_def > > > > please use else if here > > > > otherwise looks OK. > > > > > + = analyze_and_compute_bitop_with_inv_effect (loop, phidef, niter))) > > > + def = bitinv_def; > > > + > > > /* Handle bitwise induction expression. > > > > > > .i.e. > > > -- > > > 2.18.2 > > >
Hi Richard, Thanks again for your reviewing. > Yes, use else if for the bitwise induction. Can you also make the new case > conditional on 'def' > (the compute_overall_effect_of_inner_loop) being chrec_dont_know? If that > call produced something useful it will not be of either of the two special forms. > Thus like > > if (def != chrec_dont_know) > /* Already OK. */ > ; > else if ((bitinv_def = ...) > .. > else if (tree_fits_uhwi_p (niter) > ... bitwise induction case...) > ... > Yes, I fixed it in new patch. Thanks. Ok for master ? Thanks, Lingling > -----Original Message----- > From: Richard Biener <richard.guenther@gmail.com> > Sent: Wednesday, September 14, 2022 4:16 PM > To: Kong, Lingling <lingling.kong@intel.com> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com> > Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle bitop > with an invariant induction.[PR105735] > > On Tue, Sep 13, 2022 at 9:54 AM Kong, Lingling <lingling.kong@intel.com> > wrote: > > > > Hi Richard, > > > > Thanks you so much for reviewing this patch. I really appreciate it. For these > review comments, I have made some changes. > > > > > That's a single-stmt match, you shouldn't use match.pd matching for this. > > > Instead just do > > > > > > if (is_gimple_assign (stmt) > > > && ((code = gimple_assign_rhs_code (stmt)), true) > > > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > > > BIT_XOR_EXPR)) > > > > Yes, I fixed it and dropped modification for match.pd. > > > > > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in > > > bit_op:c is redundant btw. - while the name suggests "with > > > invariant" you don't actually check for that. But again, given > > > canonicalization rules the invariant will be rhs2 so above add > > > > > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST > > > > For " with invariant", this needed op1 is invariant, and I used > `expr_invariant_in_loop_p (loop, match_op[0])` for check. > > And op2 just be PHI is ok. If op2 is INTEGER_CST, existing gcc can be directly > optimized and do not need modification. > > > > > you probably need dg-require-effective-target longlong, but is it > > > necessary to use long long for the testcases in the first place? > > > The IV seems to be unused, if it should match the variables bit size > > > use sizeof > > > (type) * 8 > > > > Yes, It is not necessary to use long long for the testcases. I changed type to > unsigned int. > > > > > > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge > > > > + (loop)); return fold_build2 (code1, type, inv, match_op[0]); } > > > > > > The } goes to the next line. > > > > Sorry, It might be something wrong with my use of gcc send-email format. > > > > > > + tree bitinv_def; > > > > + if ((bitinv_def > > > > > > please use else if here > > > > Sorry, If use the else if here, there is no corresponding above if. I'm not sure if > you mean change bitwise induction expression if to else if. > > Yes, use else if for the bitwise induction. Can you also make the new case > conditional on 'def' > (the compute_overall_effect_of_inner_loop) being chrec_dont_know? If that > call produced something useful it will not be of either of the two special forms. > Thus like > > if (def != chrec_dont_know) > /* Already OK. */ > ; > else if ((bitinv_def = ...) > .. > else if (tree_fits_uhwi_p (niter) > ... bitwise induction case...) > ... > > ? > > Otherwise looks OK now. > > Thanks, > Richard. > > > Do you agree with these changes? Thanks again for taking a look. > > > > Thanks, > > Lingling > > > > > -----Original Message----- > > > From: Richard Biener <richard.guenther@gmail.com> > > > Sent: Tuesday, August 23, 2022 3:27 PM > > > To: Kong, Lingling <lingling.kong@intel.com> > > > Cc: Liu, Hongtao <hongtao.liu@intel.com>; gcc-patches@gcc.gnu.org > > > Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle > > > bitop with an invariant induction.[PR105735] > > > > > > On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches <gcc- > > > patches@gcc.gnu.org> wrote: > > > > > > > > Hi, > > > > > > > > This patch is for pr105735/pr101991. It will enable below optimization: > > > > { > > > > - long unsigned int bit; > > > > - > > > > - <bb 2> [local count: 32534376]: > > > > - > > > > - <bb 3> [local count: 1041207449]: > > > > - # tmp_10 = PHI <tmp_7(5), tmp_4(D)(2)> > > > > - # bit_12 = PHI <bit_8(5), 0(2)> > > > > - tmp_7 = bit2_6(D) & tmp_10; > > > > - bit_8 = bit_12 + 1; > > > > - if (bit_8 != 32) > > > > - goto <bb 5>; [96.97%] > > > > - else > > > > - goto <bb 4>; [3.03%] > > > > - > > > > - <bb 5> [local count: 1009658865]: > > > > - goto <bb 3>; [100.00%] > > > > - > > > > - <bb 4> [local count: 32534376]: > > > > - # tmp_11 = PHI <tmp_7(3)> > > > > - return tmp_11; > > > > + tmp_11 = tmp_4(D) & bit2_6(D); > > > > + return tmp_11; > > > > > > > > } > > > > > > > > Ok for master ? > > > > > > > > gcc/ChangeLog: > > > > > > > > PR middle-end/105735 > > > > * match.pd (bitop_with_inv_p): New match. > > > > * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare. > > > > (analyze_and_compute_bitop_with_inv_effect): New function. > > > > (final_value_replacement_loop): Enhanced to handle bitop > > > > with inv induction. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > * gcc.target/i386/pr105735-1.c: New test. > > > > * gcc.target/i386/pr105735-2.c: New test. > > > > --- > > > > gcc/match.pd | 4 + > > > > gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 > > > > ++++++++++++++++++++++ > > > gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++ > > > > gcc/tree-scalar-evolution.cc | 59 +++++++++++++++ > > > > 4 files changed, 179 insertions(+) create mode 100644 > > > > gcc/testsuite/gcc.target/i386/pr105735-1.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > > > > > diff --git a/gcc/match.pd b/gcc/match.pd index > > > > 562138a8034..cfe593ebb02 100644 > > > > --- a/gcc/match.pd > > > > +++ b/gcc/match.pd > > > > @@ -8050,6 +8050,10 @@ and, > > > > (bit_not > > > > (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 > > > > @2)) > > > > @3)))) > > > > > > > > +(for bit_op (bit_and bit_ior bit_xor) (match (bitop_with_inv_p > > > > +@0 > > > > +@1) > > > > + (bit_op:c @0 @1))) > > > > + > > > > > > That's a single-stmt match, you shouldn't use match.pd matching for this. > > > Instead just do > > > > > > if (is_gimple_assign (stmt) > > > && ((code = gimple_assign_rhs_code (stmt)), true) > > > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > > > BIT_XOR_EXPR)) > > > .. > > > > > > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in > > > bit_op:c is redundant btw. - while the name suggests "with > > > invariant" you don't actually check for that. But again, given > > > canonicalization rules the invariant will be rhs2 so above add > > > > > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST > > > > > > for example. > > > > > > > /* n - (((n > C1) ? n : C1) & -C2) -> n & C1 for unsigned case. > > > > n - (((n > C1) ? n : C1) & -C2) -> (n <= C1) ? n : (n & C1) > > > > for signed case. */ (simplify diff --git > > > > a/gcc/testsuite/gcc.target/i386/pr105735-1.c > > > > b/gcc/testsuite/gcc.target/i386/pr105735-1.c > > > > new file mode 100644 > > > > index 00000000000..8d2123ed351 > > > > --- /dev/null > > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c > > > > @@ -0,0 +1,88 @@ > > > > +/* { dg-do compile } */ > > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */ > > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" > > > > +} } */ > > > > + > > > > +unsigned long long > > > > +__attribute__((noipa)) > > > > +foo (unsigned long long tmp, unsigned long long bit2) { > > > > > > you probably need dg-require-effective-target longlong, but is it > > > necessary to use long long for the testcases in the first place? > > > The IV seems to be unused, if it should match the variables bit size > > > use sizeof > > > (type) * 8 > > > > > > > + for (int bit = 0; bit < 64; bit++) > > > > + tmp &= bit2; > > > > + return tmp; > > > > +} > > > > + > > > > +unsigned long long > > > > +__attribute__((noipa)) > > > > +foo1 (unsigned long long tmp, unsigned long long bit2) { > > > > + for (int bit = 63; bit >= 0; bit -=3) > > > > + tmp &= bit2; > > > > + return tmp; > > > > +} > > > > + > > > > +unsigned long long > > > > +__attribute__((noipa)) > > > > +foo2 (unsigned long long tmp, unsigned long long bit2) { > > > > + for (int bit = 0; bit < 64; bit++) > > > > + tmp |= bit2; > > > > + return tmp; > > > > +} > > > > + > > > > +unsigned long long > > > > +__attribute__((noipa)) > > > > +foo3 (unsigned long long tmp, unsigned long long bit2) { > > > > + for (int bit = 63; bit >= 0; bit -=3) > > > > + tmp |= bit2; > > > > + return tmp; > > > > +} > > > > + > > > > +unsigned long long > > > > +__attribute__((noipa)) > > > > +foo4 (unsigned long long tmp, unsigned long long bit2) { > > > > + for (int bit = 0; bit < 64; bit++) > > > > + tmp ^= bit2; > > > > + return tmp; > > > > +} > > > > + > > > > +unsigned long long > > > > +__attribute__((noipa)) > > > > +foo5 (unsigned long long tmp, unsigned long long bit2) { > > > > + for (int bit = 0; bit < 63; bit++) > > > > + tmp ^= bit2; > > > > + return tmp; > > > > +} > > > > + > > > > +unsigned long long > > > > +__attribute__((noipa)) > > > > +f (unsigned long long tmp, long long bit, unsigned long long > > > > +bit2) { > > > > + unsigned long long res = tmp; > > > > + for (long long i = 0; i < bit; i++) > > > > + res &= bit2; > > > > + return res; > > > > +} > > > > + > > > > +unsigned long long > > > > +__attribute__((noipa)) > > > > +f1 (unsigned long long tmp, long long bit, unsigned long long > > > > +bit2) { > > > > + unsigned long long res = tmp; > > > > + for (long long i = 0; i < bit; i++) > > > > + res |= bit2; > > > > + return res; > > > > +} > > > > + > > > > +unsigned long long > > > > +__attribute__((noipa)) > > > > +f2 (unsigned long long tmp, long long bit, unsigned long long > > > > +bit2) { > > > > + unsigned long long res = tmp; > > > > + for (long long i = 0; i < bit; i++) > > > > + res ^= bit2; > > > > + return res; > > > > +} > > > > + > > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > b/gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > new file mode 100644 > > > > index 00000000000..79c1d300b1b > > > > --- /dev/null > > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > @@ -0,0 +1,28 @@ > > > > +/* { dg-do run } */ > > > > +/* { dg-options "-O1" } */ > > > > + > > > > +#include "pr105735-1.c" > > > > + > > > > +int main() > > > > +{ > > > > + unsigned long long tmp = 0x1101101ULL; > > > > + unsigned long long bit2 = 0x1111111011110111ULL; > > > > + if (foo (tmp, bit2) != 0x1100101ULL) > > > > + __builtin_abort (); > > > > + if (foo1 (tmp, bit2) != 0x1100101ULL) > > > > + __builtin_abort (); > > > > + if (foo2 (tmp, bit2) != 0x1111111011111111ULL) > > > > + __builtin_abort (); > > > > + if (foo3 (tmp, bit2) != 0x1111111011111111ULL) > > > > + __builtin_abort (); > > > > + if (foo4 (tmp, bit2) != 0x1101101ULL) > > > > + __builtin_abort (); > > > > + if (foo5 (tmp, bit2) != 0x1111111010011010ULL) > > > > + __builtin_abort (); > > > > + if (f (tmp, 64, bit2) != 0x1100101ULL) > > > > + __builtin_abort (); > > > > + if (f1 (tmp, 64, bit2) != 0x1111111011111111ULL) > > > > + __builtin_abort (); > > > > + if (f2 (tmp, 64, bit2) != 0x1101101ULL) > > > > + __builtin_abort (); > > > > +} > > > > diff --git a/gcc/tree-scalar-evolution.cc > > > > b/gcc/tree-scalar-evolution.cc index fc59d035b19..81220f08d2e > > > > 100644 > > > > --- a/gcc/tree-scalar-evolution.cc > > > > +++ b/gcc/tree-scalar-evolution.cc > > > > @@ -3635,6 +3635,53 @@ enum bit_op_kind > > > > return fold_build2 (code1, type, inv, wide_int_to_tree (type, > > > > bits)); } > > > > > > > > +/* Match.pd function to match bitop with invariant expression > > > > + .i.e. > > > > + tmp_7 = _0 & _1; */ > > > > +extern bool gimple_bitop_with_inv_p (tree, tree *, tree > > > > +(*)(tree)); > > > > + > > > > +/* Return the inductive expression of bitop with invariant if possible, > > > > + otherwise returns DEF. */ > > > > +static tree > > > > +analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree > phidef, > > > > + tree niter) { > > > > + tree match_op[2],inv; > > > > + tree type = TREE_TYPE (phidef); > > > > + gphi* header_phi = NULL; > > > > + /* match thing like op0 (match[0]), op1(match[1]), > > > > +phidef(PHIDEF) > > > > + > > > > + op1 = PHI <phidef, inv> > > > > + phidef = op0 & op1 > > > > + if op0 is an invariant, it could change to > > > > + phidef = op0 & inv. */ > > > > + if (!gimple_bitop_with_inv_p (phidef, &match_op[0], NULL) > > > > + || TREE_CODE (match_op[1]) != SSA_NAME > > > > + || !expr_invariant_in_loop_p (loop, match_op[0]) > > > > + || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT > > > (match_op[1]))) > > > > + || gimple_phi_num_args (header_phi) != 2) > > > > + return NULL_TREE; > > > > + > > > > + if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) > > > > + != > > > phidef) > > > > + return NULL_TREE; > > > > + > > > > + enum tree_code code1 > > > > + = gimple_assign_rhs_code (SSA_NAME_DEF_STMT (phidef)); > > > > + > > > > + if (code1 == BIT_XOR_EXPR) > > > > + { > > > > + if (!tree_fits_uhwi_p (niter)) > > > > + return NULL_TREE; > > > > + unsigned HOST_WIDE_INT niter_num; > > > > + niter_num = tree_to_uhwi (niter); > > > > + if (niter_num % 2 != 0) > > > > + match_op[0] = build_zero_cst (type); > > > > + } > > > > + > > > > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge > > > > + (loop)); return fold_build2 (code1, type, inv, match_op[0]); } > > > > > > The } goes to the next line. > > > > > > > + > > > > /* Do final value replacement for LOOP, return true if we did > > > > anything. */ > > > > > > > > bool > > > > @@ -3687,6 +3734,18 @@ final_value_replacement_loop (class loop > *loop) > > > > &folded_casts); > > > > def = compute_overall_effect_of_inner_loop (ex_loop, def); > > > > > > > > + /* Handle bitop with invariant induction expression. > > > > + > > > > + .i.e > > > > + for (int i =0 ;i < 32; i++) > > > > + tmp &= bit2; > > > > + if bit2 is an invariant in loop which could simple to > > > > + tmp &= bit2. */ > > > > + tree bitinv_def; > > > > + if ((bitinv_def > > > > > > please use else if here > > > > > > otherwise looks OK. > > > > > > > + = analyze_and_compute_bitop_with_inv_effect (loop, phidef, niter))) > > > > + def = bitinv_def; > > > > + > > > > /* Handle bitwise induction expression. > > > > > > > > .i.e. > > > > -- > > > > 2.18.2 > > > >
Thanks a lot, pushed to trunk. > Hi Richard, > > Thanks again for your reviewing. > > > Yes, use else if for the bitwise induction. Can you also make the new > > case conditional on 'def' > > (the compute_overall_effect_of_inner_loop) being chrec_dont_know? If > > that call produced something useful it will not be of either of the two special > forms. > > Thus like > > > > if (def != chrec_dont_know) > > /* Already OK. */ > > ; > > else if ((bitinv_def = ...) > > .. > > else if (tree_fits_uhwi_p (niter) > > ... bitwise induction case...) > > ... > > > Yes, I fixed it in new patch. Thanks. > Ok for master ? > > Thanks, > Lingling > > > -----Original Message----- > > From: Richard Biener <richard.guenther@gmail.com> > > Sent: Wednesday, September 14, 2022 4:16 PM > > To: Kong, Lingling <lingling.kong@intel.com> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com> > > Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle > > bitop with an invariant induction.[PR105735] > > > > On Tue, Sep 13, 2022 at 9:54 AM Kong, Lingling > > <lingling.kong@intel.com> > > wrote: > > > > > > Hi Richard, > > > > > > Thanks you so much for reviewing this patch. I really appreciate > > > it. For these > > review comments, I have made some changes. > > > > > > > That's a single-stmt match, you shouldn't use match.pd matching for this. > > > > Instead just do > > > > > > > > if (is_gimple_assign (stmt) > > > > && ((code = gimple_assign_rhs_code (stmt)), true) > > > > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > > > > BIT_XOR_EXPR)) > > > > > > Yes, I fixed it and dropped modification for match.pd. > > > > > > > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in > > > > bit_op:c is redundant btw. - while the name suggests "with > > > > invariant" you don't actually check for that. But again, given > > > > canonicalization rules the invariant will be rhs2 so above add > > > > > > > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST > > > > > > For " with invariant", this needed op1 is invariant, and I used > > `expr_invariant_in_loop_p (loop, match_op[0])` for check. > > > And op2 just be PHI is ok. If op2 is INTEGER_CST, existing gcc can > > > be directly > > optimized and do not need modification. > > > > > > > you probably need dg-require-effective-target longlong, but is it > > > > necessary to use long long for the testcases in the first place? > > > > The IV seems to be unused, if it should match the variables bit > > > > size use sizeof > > > > (type) * 8 > > > > > > Yes, It is not necessary to use long long for the testcases. I > > > changed type to > > unsigned int. > > > > > > > > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge > > > > > + (loop)); return fold_build2 (code1, type, inv, match_op[0]); > > > > > + } > > > > > > > > The } goes to the next line. > > > > > > Sorry, It might be something wrong with my use of gcc send-email format. > > > > > > > > + tree bitinv_def; > > > > > + if ((bitinv_def > > > > > > > > please use else if here > > > > > > Sorry, If use the else if here, there is no corresponding above if. > > > I'm not sure if > > you mean change bitwise induction expression if to else if. > > > > Yes, use else if for the bitwise induction. Can you also make the new > > case conditional on 'def' > > (the compute_overall_effect_of_inner_loop) being chrec_dont_know? If > > that call produced something useful it will not be of either of the two special > forms. > > Thus like > > > > if (def != chrec_dont_know) > > /* Already OK. */ > > ; > > else if ((bitinv_def = ...) > > .. > > else if (tree_fits_uhwi_p (niter) > > ... bitwise induction case...) > > ... > > > > ? > > > > Otherwise looks OK now. > > > > Thanks, > > Richard. > > > > > Do you agree with these changes? Thanks again for taking a look. > > > > > > Thanks, > > > Lingling > > > > > > > -----Original Message----- > > > > From: Richard Biener <richard.guenther@gmail.com> > > > > Sent: Tuesday, August 23, 2022 3:27 PM > > > > To: Kong, Lingling <lingling.kong@intel.com> > > > > Cc: Liu, Hongtao <hongtao.liu@intel.com>; gcc-patches@gcc.gnu.org > > > > Subject: Re: [PATCH] Enhance final_value_replacement_loop to > > > > handle bitop with an invariant induction.[PR105735] > > > > > > > > On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches > > > > <gcc- patches@gcc.gnu.org> wrote: > > > > > > > > > > Hi, > > > > > > > > > > This patch is for pr105735/pr101991. It will enable below optimization: > > > > > { > > > > > - long unsigned int bit; > > > > > - > > > > > - <bb 2> [local count: 32534376]: > > > > > - > > > > > - <bb 3> [local count: 1041207449]: > > > > > - # tmp_10 = PHI <tmp_7(5), tmp_4(D)(2)> > > > > > - # bit_12 = PHI <bit_8(5), 0(2)> > > > > > - tmp_7 = bit2_6(D) & tmp_10; > > > > > - bit_8 = bit_12 + 1; > > > > > - if (bit_8 != 32) > > > > > - goto <bb 5>; [96.97%] > > > > > - else > > > > > - goto <bb 4>; [3.03%] > > > > > - > > > > > - <bb 5> [local count: 1009658865]: > > > > > - goto <bb 3>; [100.00%] > > > > > - > > > > > - <bb 4> [local count: 32534376]: > > > > > - # tmp_11 = PHI <tmp_7(3)> > > > > > - return tmp_11; > > > > > + tmp_11 = tmp_4(D) & bit2_6(D); return tmp_11; > > > > > > > > > > } > > > > > > > > > > Ok for master ? > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > PR middle-end/105735 > > > > > * match.pd (bitop_with_inv_p): New match. > > > > > * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare. > > > > > (analyze_and_compute_bitop_with_inv_effect): New function. > > > > > (final_value_replacement_loop): Enhanced to handle bitop > > > > > with inv induction. > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > * gcc.target/i386/pr105735-1.c: New test. > > > > > * gcc.target/i386/pr105735-2.c: New test. > > > > > --- > > > > > gcc/match.pd | 4 + > > > > > gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 > > > > > ++++++++++++++++++++++ > > > > gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++ > > > > > gcc/tree-scalar-evolution.cc | 59 +++++++++++++++ > > > > > 4 files changed, 179 insertions(+) create mode 100644 > > > > > gcc/testsuite/gcc.target/i386/pr105735-1.c > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > > > > > > > diff --git a/gcc/match.pd b/gcc/match.pd index > > > > > 562138a8034..cfe593ebb02 100644 > > > > > --- a/gcc/match.pd > > > > > +++ b/gcc/match.pd > > > > > @@ -8050,6 +8050,10 @@ and, > > > > > (bit_not > > > > > (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 > > > > > @2)) > > > > > @3)))) > > > > > > > > > > +(for bit_op (bit_and bit_ior bit_xor) (match (bitop_with_inv_p > > > > > +@0 > > > > > +@1) > > > > > + (bit_op:c @0 @1))) > > > > > + > > > > > > > > That's a single-stmt match, you shouldn't use match.pd matching for this. > > > > Instead just do > > > > > > > > if (is_gimple_assign (stmt) > > > > && ((code = gimple_assign_rhs_code (stmt)), true) > > > > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > > > > BIT_XOR_EXPR)) > > > > .. > > > > > > > > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in > > > > bit_op:c is redundant btw. - while the name suggests "with > > > > invariant" you don't actually check for that. But again, given > > > > canonicalization rules the invariant will be rhs2 so above add > > > > > > > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST > > > > > > > > for example. > > > > > > > > > /* n - (((n > C1) ? n : C1) & -C2) -> n & C1 for unsigned case. > > > > > n - (((n > C1) ? n : C1) & -C2) -> (n <= C1) ? n : (n & C1) > > > > > for signed case. */ (simplify diff --git > > > > > a/gcc/testsuite/gcc.target/i386/pr105735-1.c > > > > > b/gcc/testsuite/gcc.target/i386/pr105735-1.c > > > > > new file mode 100644 > > > > > index 00000000000..8d2123ed351 > > > > > --- /dev/null > > > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c > > > > > @@ -0,0 +1,88 @@ > > > > > +/* { dg-do compile } */ > > > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */ > > > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" > > > > > +} } */ > > > > > + > > > > > +unsigned long long > > > > > +__attribute__((noipa)) > > > > > +foo (unsigned long long tmp, unsigned long long bit2) { > > > > > > > > you probably need dg-require-effective-target longlong, but is it > > > > necessary to use long long for the testcases in the first place? > > > > The IV seems to be unused, if it should match the variables bit size > > > > use sizeof > > > > (type) * 8 > > > > > > > > > + for (int bit = 0; bit < 64; bit++) > > > > > + tmp &= bit2; > > > > > + return tmp; > > > > > +} > > > > > + > > > > > +unsigned long long > > > > > +__attribute__((noipa)) > > > > > +foo1 (unsigned long long tmp, unsigned long long bit2) { > > > > > + for (int bit = 63; bit >= 0; bit -=3) > > > > > + tmp &= bit2; > > > > > + return tmp; > > > > > +} > > > > > + > > > > > +unsigned long long > > > > > +__attribute__((noipa)) > > > > > +foo2 (unsigned long long tmp, unsigned long long bit2) { > > > > > + for (int bit = 0; bit < 64; bit++) > > > > > + tmp |= bit2; > > > > > + return tmp; > > > > > +} > > > > > + > > > > > +unsigned long long > > > > > +__attribute__((noipa)) > > > > > +foo3 (unsigned long long tmp, unsigned long long bit2) { > > > > > + for (int bit = 63; bit >= 0; bit -=3) > > > > > + tmp |= bit2; > > > > > + return tmp; > > > > > +} > > > > > + > > > > > +unsigned long long > > > > > +__attribute__((noipa)) > > > > > +foo4 (unsigned long long tmp, unsigned long long bit2) { > > > > > + for (int bit = 0; bit < 64; bit++) > > > > > + tmp ^= bit2; > > > > > + return tmp; > > > > > +} > > > > > + > > > > > +unsigned long long > > > > > +__attribute__((noipa)) > > > > > +foo5 (unsigned long long tmp, unsigned long long bit2) { > > > > > + for (int bit = 0; bit < 63; bit++) > > > > > + tmp ^= bit2; > > > > > + return tmp; > > > > > +} > > > > > + > > > > > +unsigned long long > > > > > +__attribute__((noipa)) > > > > > +f (unsigned long long tmp, long long bit, unsigned long long > > > > > +bit2) { > > > > > + unsigned long long res = tmp; > > > > > + for (long long i = 0; i < bit; i++) > > > > > + res &= bit2; > > > > > + return res; > > > > > +} > > > > > + > > > > > +unsigned long long > > > > > +__attribute__((noipa)) > > > > > +f1 (unsigned long long tmp, long long bit, unsigned long long > > > > > +bit2) { > > > > > + unsigned long long res = tmp; > > > > > + for (long long i = 0; i < bit; i++) > > > > > + res |= bit2; > > > > > + return res; > > > > > +} > > > > > + > > > > > +unsigned long long > > > > > +__attribute__((noipa)) > > > > > +f2 (unsigned long long tmp, long long bit, unsigned long long > > > > > +bit2) { > > > > > + unsigned long long res = tmp; > > > > > + for (long long i = 0; i < bit; i++) > > > > > + res ^= bit2; > > > > > + return res; > > > > > +} > > > > > + > > > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > > b/gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > > new file mode 100644 > > > > > index 00000000000..79c1d300b1b > > > > > --- /dev/null > > > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > > @@ -0,0 +1,28 @@ > > > > > +/* { dg-do run } */ > > > > > +/* { dg-options "-O1" } */ > > > > > + > > > > > +#include "pr105735-1.c" > > > > > + > > > > > +int main() > > > > > +{ > > > > > + unsigned long long tmp = 0x1101101ULL; > > > > > + unsigned long long bit2 = 0x1111111011110111ULL; > > > > > + if (foo (tmp, bit2) != 0x1100101ULL) > > > > > + __builtin_abort (); > > > > > + if (foo1 (tmp, bit2) != 0x1100101ULL) > > > > > + __builtin_abort (); > > > > > + if (foo2 (tmp, bit2) != 0x1111111011111111ULL) > > > > > + __builtin_abort (); > > > > > + if (foo3 (tmp, bit2) != 0x1111111011111111ULL) > > > > > + __builtin_abort (); > > > > > + if (foo4 (tmp, bit2) != 0x1101101ULL) > > > > > + __builtin_abort (); > > > > > + if (foo5 (tmp, bit2) != 0x1111111010011010ULL) > > > > > + __builtin_abort (); > > > > > + if (f (tmp, 64, bit2) != 0x1100101ULL) > > > > > + __builtin_abort (); > > > > > + if (f1 (tmp, 64, bit2) != 0x1111111011111111ULL) > > > > > + __builtin_abort (); > > > > > + if (f2 (tmp, 64, bit2) != 0x1101101ULL) > > > > > + __builtin_abort (); > > > > > +} > > > > > diff --git a/gcc/tree-scalar-evolution.cc > > > > > b/gcc/tree-scalar-evolution.cc index fc59d035b19..81220f08d2e > > > > > 100644 > > > > > --- a/gcc/tree-scalar-evolution.cc > > > > > +++ b/gcc/tree-scalar-evolution.cc > > > > > @@ -3635,6 +3635,53 @@ enum bit_op_kind > > > > > return fold_build2 (code1, type, inv, wide_int_to_tree (type, > > > > > bits)); } > > > > > > > > > > +/* Match.pd function to match bitop with invariant expression > > > > > + .i.e. > > > > > + tmp_7 = _0 & _1; */ > > > > > +extern bool gimple_bitop_with_inv_p (tree, tree *, tree > > > > > +(*)(tree)); > > > > > + > > > > > +/* Return the inductive expression of bitop with invariant if possible, > > > > > + otherwise returns DEF. */ > > > > > +static tree > > > > > +analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree > > phidef, > > > > > + tree niter) { > > > > > + tree match_op[2],inv; > > > > > + tree type = TREE_TYPE (phidef); > > > > > + gphi* header_phi = NULL; > > > > > + /* match thing like op0 (match[0]), op1(match[1]), > > > > > +phidef(PHIDEF) > > > > > + > > > > > + op1 = PHI <phidef, inv> > > > > > + phidef = op0 & op1 > > > > > + if op0 is an invariant, it could change to > > > > > + phidef = op0 & inv. */ > > > > > + if (!gimple_bitop_with_inv_p (phidef, &match_op[0], NULL) > > > > > + || TREE_CODE (match_op[1]) != SSA_NAME > > > > > + || !expr_invariant_in_loop_p (loop, match_op[0]) > > > > > + || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT > > > > (match_op[1]))) > > > > > + || gimple_phi_num_args (header_phi) != 2) > > > > > + return NULL_TREE; > > > > > + > > > > > + if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) > > > > > + != > > > > phidef) > > > > > + return NULL_TREE; > > > > > + > > > > > + enum tree_code code1 > > > > > + = gimple_assign_rhs_code (SSA_NAME_DEF_STMT (phidef)); > > > > > + > > > > > + if (code1 == BIT_XOR_EXPR) > > > > > + { > > > > > + if (!tree_fits_uhwi_p (niter)) > > > > > + return NULL_TREE; > > > > > + unsigned HOST_WIDE_INT niter_num; > > > > > + niter_num = tree_to_uhwi (niter); > > > > > + if (niter_num % 2 != 0) > > > > > + match_op[0] = build_zero_cst (type); > > > > > + } > > > > > + > > > > > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge > > > > > + (loop)); return fold_build2 (code1, type, inv, match_op[0]); } > > > > > > > > The } goes to the next line. > > > > > > > > > + > > > > > /* Do final value replacement for LOOP, return true if we did > > > > > anything. */ > > > > > > > > > > bool > > > > > @@ -3687,6 +3734,18 @@ final_value_replacement_loop (class loop > > *loop) > > > > > &folded_casts); > > > > > def = compute_overall_effect_of_inner_loop (ex_loop, def); > > > > > > > > > > + /* Handle bitop with invariant induction expression. > > > > > + > > > > > + .i.e > > > > > + for (int i =0 ;i < 32; i++) > > > > > + tmp &= bit2; > > > > > + if bit2 is an invariant in loop which could simple to > > > > > + tmp &= bit2. */ > > > > > + tree bitinv_def; > > > > > + if ((bitinv_def > > > > > > > > please use else if here > > > > > > > > otherwise looks OK. > > > > > > > > > + = analyze_and_compute_bitop_with_inv_effect (loop, phidef, > niter))) > > > > > + def = bitinv_def; > > > > > + > > > > > /* Handle bitwise induction expression. > > > > > > > > > > .i.e. > > > > > -- > > > > > 2.18.2 > > > > >
diff --git a/gcc/match.pd b/gcc/match.pd index 562138a8034..cfe593ebb02 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -8050,6 +8050,10 @@ and, (bit_not (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 @2)) @3)))) +(for bit_op (bit_and bit_ior bit_xor) + (match (bitop_with_inv_p @0 @1) + (bit_op:c @0 @1))) + /* n - (((n > C1) ? n : C1) & -C2) -> n & C1 for unsigned case. n - (((n > C1) ? n : C1) & -C2) -> (n <= C1) ? n : (n & C1) for signed case. */ (simplify diff --git a/gcc/testsuite/gcc.target/i386/pr105735-1.c b/gcc/testsuite/gcc.target/i386/pr105735-1.c new file mode 100644 index 00000000000..8d2123ed351 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c @@ -0,0 +1,88 @@ +/* { dg-do compile } */ +/* { dg-options "-O1 -fdump-tree-sccp-details" } */ +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" +} } */ + +unsigned long long +__attribute__((noipa)) +foo (unsigned long long tmp, unsigned long long bit2) { + for (int bit = 0; bit < 64; bit++) + tmp &= bit2; + return tmp; +} + +unsigned long long +__attribute__((noipa)) +foo1 (unsigned long long tmp, unsigned long long bit2) { + for (int bit = 63; bit >= 0; bit -=3) + tmp &= bit2; + return tmp; +} + +unsigned long long +__attribute__((noipa)) +foo2 (unsigned long long tmp, unsigned long long bit2) { + for (int bit = 0; bit < 64; bit++) + tmp |= bit2; + return tmp; +} + +unsigned long long +__attribute__((noipa)) +foo3 (unsigned long long tmp, unsigned long long bit2) { + for (int bit = 63; bit >= 0; bit -=3) + tmp |= bit2; + return tmp; +} + +unsigned long long +__attribute__((noipa)) +foo4 (unsigned long long tmp, unsigned long long bit2) { + for (int bit = 0; bit < 64; bit++) + tmp ^= bit2; + return tmp; +} + +unsigned long long +__attribute__((noipa)) +foo5 (unsigned long long tmp, unsigned long long bit2) { + for (int bit = 0; bit < 63; bit++) + tmp ^= bit2; + return tmp; +} + +unsigned long long +__attribute__((noipa)) +f (unsigned long long tmp, long long bit, unsigned long long bit2) { + unsigned long long res = tmp; + for (long long i = 0; i < bit; i++) + res &= bit2; + return res; +} + +unsigned long long +__attribute__((noipa)) +f1 (unsigned long long tmp, long long bit, unsigned long long bit2) { + unsigned long long res = tmp; + for (long long i = 0; i < bit; i++) + res |= bit2; + return res; +} + +unsigned long long +__attribute__((noipa)) +f2 (unsigned long long tmp, long long bit, unsigned long long bit2) { + unsigned long long res = tmp; + for (long long i = 0; i < bit; i++) + res ^= bit2; + return res; +} + diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c b/gcc/testsuite/gcc.target/i386/pr105735-2.c new file mode 100644 index 00000000000..79c1d300b1b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c @@ -0,0 +1,28 @@ +/* { dg-do run } */ +/* { dg-options "-O1" } */ + +#include "pr105735-1.c" + +int main() +{ + unsigned long long tmp = 0x1101101ULL; + unsigned long long bit2 = 0x1111111011110111ULL; + if (foo (tmp, bit2) != 0x1100101ULL) + __builtin_abort (); + if (foo1 (tmp, bit2) != 0x1100101ULL) + __builtin_abort (); + if (foo2 (tmp, bit2) != 0x1111111011111111ULL) + __builtin_abort (); + if (foo3 (tmp, bit2) != 0x1111111011111111ULL) + __builtin_abort (); + if (foo4 (tmp, bit2) != 0x1101101ULL) + __builtin_abort (); + if (foo5 (tmp, bit2) != 0x1111111010011010ULL) + __builtin_abort (); + if (f (tmp, 64, bit2) != 0x1100101ULL) + __builtin_abort (); + if (f1 (tmp, 64, bit2) != 0x1111111011111111ULL) + __builtin_abort (); + if (f2 (tmp, 64, bit2) != 0x1101101ULL) + __builtin_abort (); +} diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc index fc59d035b19..81220f08d2e 100644 --- a/gcc/tree-scalar-evolution.cc +++ b/gcc/tree-scalar-evolution.cc @@ -3635,6 +3635,53 @@ enum bit_op_kind return fold_build2 (code1, type, inv, wide_int_to_tree (type, bits)); } +/* Match.pd function to match bitop with invariant expression + .i.e. + tmp_7 = _0 & _1; */ +extern bool gimple_bitop_with_inv_p (tree, tree *, tree (*)(tree)); + +/* Return the inductive expression of bitop with invariant if possible, + otherwise returns DEF. */ +static tree +analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef, + tree niter) +{ + tree match_op[2],inv; + tree type = TREE_TYPE (phidef); + gphi* header_phi = NULL; + /* match thing like op0 (match[0]), op1(match[1]), phidef(PHIDEF) + + op1 = PHI <phidef, inv> + phidef = op0 & op1 + if op0 is an invariant, it could change to + phidef = op0 & inv. */ + if (!gimple_bitop_with_inv_p (phidef, &match_op[0], NULL) + || TREE_CODE (match_op[1]) != SSA_NAME + || !expr_invariant_in_loop_p (loop, match_op[0]) + || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1]))) + || gimple_phi_num_args (header_phi) != 2) + return NULL_TREE; + + if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef) + return NULL_TREE; + + enum tree_code code1 + = gimple_assign_rhs_code (SSA_NAME_DEF_STMT (phidef)); + + if (code1 == BIT_XOR_EXPR) + { + if (!tree_fits_uhwi_p (niter)) + return NULL_TREE; + unsigned HOST_WIDE_INT niter_num; + niter_num = tree_to_uhwi (niter); + if (niter_num % 2 != 0) + match_op[0] = build_zero_cst (type); + } + + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge (loop)); + return fold_build2 (code1, type, inv, match_op[0]); } + /* Do final value replacement for LOOP, return true if we did anything. */ bool @@ -3687,6 +3734,18 @@ final_value_replacement_loop (class loop *loop) &folded_casts);