From patchwork Wed Jul 19 04:33:48 2023
X-Patchwork-Submitter: Hao Liu OS
X-Patchwork-Id: 122388
From: Hao Liu OS
To: "GCC-patches@gcc.gnu.org"
CC: "richard.sandiford@arm.com"
Subject: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]
Date: Wed, 19 Jul 2023 04:33:48 +0000

This only affects the new costs in the aarch64 backend.  Currently, the
reduction latency of the vector body is too large because it is multiplied
by the stmt count.  As the scalar reduction latency is small, the new cost
model may decide that "scalar code would issue more quickly" and increase
the vector body cost considerably, which causes missed vectorization
opportunities.

Tested by bootstrapping on aarch64-linux-gnu.

gcc/ChangeLog:

	PR target/110625
	* config/aarch64/aarch64.cc (count_ops): Remove the '* count'
	for reduction_latency.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/pr110625.c: New testcase.
---
 gcc/config/aarch64/aarch64.cc               |  5 +--
 gcc/testsuite/gcc.target/aarch64/pr110625.c | 46 +++++++++++++++++++++
 2 files changed, 47 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr110625.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 560e5431636..27afa64b7d5 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -16788,10 +16788,7 @@ aarch64_vector_costs::count_ops (unsigned int count, vect_cost_for_stmt kind,
     {
       unsigned int base
 	= aarch64_in_loop_reduction_latency (m_vinfo, stmt_info, m_vec_flags);
-
-      /* ??? Ideally we'd do COUNT reductions in parallel, but unfortunately
-	 that's not yet the case.  */
-      ops->reduction_latency = MAX (ops->reduction_latency, base * count);
+      ops->reduction_latency = MAX (ops->reduction_latency, base);
     }
 
   /* Assume that multiply-adds will become a single operation.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/pr110625.c b/gcc/testsuite/gcc.target/aarch64/pr110625.c
new file mode 100644
index 00000000000..0965cac33a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr110625.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -mcpu=neoverse-n2 -fdump-tree-vect-details -fno-tree-slp-vectorize" } */
+/* { dg-final { scan-tree-dump-not "reduction latency = 8" "vect" } } */
+
+/* Do not increase the vector body cost due to the incorrect reduction latency
+    Original vector body cost = 51
+    Scalar issue estimate:
+      ...
+      reduction latency = 2
+      estimated min cycles per iteration = 2.000000
+      estimated cycles per vector iteration (for VF 2) = 4.000000
+    Vector issue estimate:
+      ...
+      reduction latency = 8      <-- Too large
+      estimated min cycles per iteration = 8.000000
+    Increasing body cost to 102 because scalar code would issue more quickly
+      ...
+    missed:  cost model: the vector iteration cost = 102 divided by the scalar iteration cost = 44 is greater or equal to the vectorization factor = 2.
+    missed:  not vectorized: vectorization not profitable.  */
+
+typedef struct
+{
+  unsigned short m1, m2, m3, m4;
+} the_struct_t;
+typedef struct
+{
+  double m1, m2, m3, m4, m5;
+} the_struct2_t;
+
+double
+bar (the_struct2_t *);
+
+double
+foo (double *k, unsigned int n, the_struct_t *the_struct)
+{
+  unsigned int u;
+  the_struct2_t result;
+  for (u = 0; u < n; u++, k--)
+    {
+      result.m1 += (*k) * the_struct[u].m1;
+      result.m2 += (*k) * the_struct[u].m2;
+      result.m3 += (*k) * the_struct[u].m3;
+      result.m4 += (*k) * the_struct[u].m4;
+    }
+  return bar (&result);
+}