From patchwork Tue May 9 16:07:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oluwatamilore Adebayo
X-Patchwork-Id: 91642
From: Oluwatamilore Adebayo
To: "gcc-patches@gcc.gnu.org"
CC: Richard Sandiford, "richard.guenther@gmail.com"
Subject: [PATCH] vect: Missed opportunity to use [SU]ABD
Date: Tue, 9 May 2023 16:07:09 +0000
List-Id: Gcc-patches mailing list

From 0b5f469171c340ef61a48a31877d495bb77bd35f Mon Sep 17 00:00:00 2001
From: oluade01
Date: Fri, 14 Apr 2023 10:24:43 +0100
Subject: [PATCH 1/4] Missed opportunity to use [SU]ABD

This adds a recognition pattern for the non-widening
absolute difference (ABD).

gcc/ChangeLog:

	* doc/md.texi (sabd, uabd): Document them.
	* internal-fn.def (ABD): Use new optab.
	* optabs.def (sabd_optab, uabd_optab): New optabs.
	* tree-vect-patterns.cc (vect_recog_absolute_difference):
	Recognize the following idiom abs (a - b).
	(vect_recog_sad_pattern): Refactor to use
	vect_recog_absolute_difference.
	(vect_recog_abd_pattern): Use patterns found by
	vect_recog_absolute_difference to build a new ABD
	internal call.
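
For illustration only, a loop of roughly the following shape is the kind
of source the new pattern is meant to catch: the absolute value of a
difference, stored at the same width as the inputs.  This is a sketch
and not part of the patch or its testsuite; the name abd_loop is made up
for this note, and whether SABD/UABD is actually emitted depends on the
target providing the sabd/uabd optabs.

```c
#include <stdlib.h>

/* Hypothetical example (not from the patch): element-wise
   abs (x[i] - y[i]) where inputs and output have the same width,
   so no widening is involved.  */
void
abd_loop (int *restrict out, const int *restrict x,
          const int *restrict y, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = abs (x[i] - y[i]);
}
```

The subtraction here is a plain signed int operation, so inputs whose
difference overflows int are outside what this sketch covers.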
---
 gcc/doc/md.texi           |  10 ++
 gcc/internal-fn.def       |   3 +
 gcc/optabs.def            |   2 +
 gcc/tree-vect-patterns.cc | 250 +++++++++++++++++++++++++++++++++-----
 4 files changed, 234 insertions(+), 31 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 07bf8bdebffb2e523f25a41f2b57e43c0276b745..0ad546c63a8deebb4b6db894f437d1e21f0245a8 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5778,6 +5778,16 @@ Other shift and rotate instructions, analogous to the
 Vector shift and rotate instructions that take vectors as operand 2
 instead of a scalar type.
 
+@cindex @code{uabd@var{m}} instruction pattern
+@cindex @code{sabd@var{m}} instruction pattern
+@item @samp{uabd@var{m}}, @samp{sabd@var{m}}
+Signed and unsigned absolute difference instructions.  These
+instructions find the difference between operands 1 and 2
+then return the absolute value.  A C code equivalent would be:
+@smallexample
+op0 = abs (op1 - op2)
+@end smallexample
+
 @cindex @code{avg@var{m}3_floor} instruction pattern
 @cindex @code{uavg@var{m}3_floor} instruction pattern
 @item @samp{avg@var{m}3_floor}
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..0f1724ecf37a31c231572edf90b5577e2d82f468 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -167,6 +167,9 @@ DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary)
 
+DEF_INTERNAL_SIGNED_OPTAB_FN (ABD, ECF_CONST | ECF_NOTHROW, first,
+                              sabd, uabd, binary)
+
 DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_FLOOR, ECF_CONST | ECF_NOTHROW, first,
                               savg_floor, uavg_floor, binary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_CEIL, ECF_CONST | ECF_NOTHROW, first,
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..29bc92281a2175f898634cbe6af63c18021e5268 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -359,6 +359,8 @@ OPTAB_D (mask_fold_left_plus_optab, "mask_fold_left_plus_$a")
 OPTAB_D (extract_last_optab, "extract_last_$a")
 OPTAB_D (fold_extract_last_optab, "fold_extract_last_$a")
 
+OPTAB_D (uabd_optab, "uabd$a3")
+OPTAB_D (sabd_optab, "sabd$a3")
 OPTAB_D (savg_floor_optab, "avg$a3_floor")
 OPTAB_D (uavg_floor_optab, "uavg$a3_floor")
 OPTAB_D (savg_ceil_optab, "avg$a3_ceil")
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index a49b09539776c0056e77f99b10365d0a8747fbc5..91e1f9d4b610275dd833ec56dc77f76367ee7886 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -770,6 +770,89 @@ vect_split_statement (vec_info *vinfo, stmt_vec_info stmt2_info, tree new_rhs,
     }
 }
 
+/* Look for the following pattern
+        X = x[i]
+        Y = y[i]
+        DIFF = X - Y
+        DAD = ABS_EXPR<DIFF>
+ */
+static bool
+vect_recog_absolute_difference (vec_info *vinfo, gassign *abs_stmt,
+                                tree *half_type, bool reject_unsigned,
+                                vect_unpromoted_value unprom[2],
+                                tree diff_oprnds[2])
+{
+  if (!abs_stmt)
+    return false;
+
+  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
+     inside the loop (in case we are analyzing an outer-loop).  */
+  enum tree_code code = gimple_assign_rhs_code (abs_stmt);
+  if (code != ABS_EXPR && code != ABSU_EXPR)
+    return false;
+
+  tree abs_oprnd = gimple_assign_rhs1 (abs_stmt);
+  tree abs_type = TREE_TYPE (abs_oprnd);
+  if (!abs_oprnd)
+    return false;
+  if (reject_unsigned && TYPE_UNSIGNED (abs_type))
+    return false;
+  if (!ANY_INTEGRAL_TYPE_P (abs_type) || TYPE_OVERFLOW_WRAPS (abs_type))
+    return false;
+
+  /* Peel off conversions from the ABS input.  This can involve sign
+     changes (e.g. from an unsigned subtraction to a signed ABS input)
+     or signed promotion, but it can't include unsigned promotion.
+     (Note that ABS of an unsigned promotion should have been folded
+     away before now anyway.)  */
+  vect_unpromoted_value unprom_diff;
+  abs_oprnd = vect_look_through_possible_promotion (vinfo, abs_oprnd,
+                                                    &unprom_diff);
+  if (!abs_oprnd)
+    return false;
+  if (TYPE_PRECISION (unprom_diff.type) != TYPE_PRECISION (abs_type)
+      && TYPE_UNSIGNED (unprom_diff.type))
+    if (!reject_unsigned)
+      return false;
+
+  /* We then detect if the operand of abs_expr is defined by a minus_expr.  */
+  stmt_vec_info diff_stmt_vinfo = vect_get_internal_def (vinfo, abs_oprnd);
+  if (!diff_stmt_vinfo)
+    return false;
+
+  bool assigned_oprnds = false;
+  gassign *diff = dyn_cast <gassign *> (STMT_VINFO_STMT (diff_stmt_vinfo));
+  if (diff_oprnds && diff && gimple_assign_rhs_code (diff) == MINUS_EXPR)
+    {
+      assigned_oprnds = true;
+      diff_oprnds[0] = gimple_assign_rhs1 (diff);
+      diff_oprnds[1] = gimple_assign_rhs2 (diff);
+    }
+
+  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
+     inside the loop (in case we are analyzing an outer-loop).  */
+  if (vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+                            WIDEN_MINUS_EXPR,
+                            false, 2, unprom, half_type))
+    {
+      if (diff_oprnds && !assigned_oprnds)
+        {
+          diff_oprnds[0] = unprom[0].op;
+          diff_oprnds[1] = unprom[1].op;
+        }
+    }
+  else if (!assigned_oprnds)
+    {
+      return false;
+    }
+  else
+    {
+      *half_type = NULL_TREE;
+    }
+
+  return true;
+}
+
 /* Convert UNPROM to TYPE and return the result, adding new statements
    to STMT_INFO's pattern definition statements if no better way is
    available.  VECTYPE is the vector form of TYPE.
@@ -1308,40 +1391,13 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   gassign *abs_stmt = dyn_cast <gassign *> (abs_stmt_vinfo->stmt);
-  if (!abs_stmt
-      || (gimple_assign_rhs_code (abs_stmt) != ABS_EXPR
-          && gimple_assign_rhs_code (abs_stmt) != ABSU_EXPR))
-    return NULL;
-
-  tree abs_oprnd = gimple_assign_rhs1 (abs_stmt);
-  tree abs_type = TREE_TYPE (abs_oprnd);
-  if (TYPE_UNSIGNED (abs_type))
-    return NULL;
-
-  /* Peel off conversions from the ABS input.  This can involve sign
-     changes (e.g. from an unsigned subtraction to a signed ABS input)
-     or signed promotion, but it can't include unsigned promotion.
-     (Note that ABS of an unsigned promotion should have been folded
-     away before now anyway.)  */
-  vect_unpromoted_value unprom_diff;
-  abs_oprnd = vect_look_through_possible_promotion (vinfo, abs_oprnd,
-                                                    &unprom_diff);
-  if (!abs_oprnd)
-    return NULL;
-  if (TYPE_PRECISION (unprom_diff.type) != TYPE_PRECISION (abs_type)
-      && TYPE_UNSIGNED (unprom_diff.type))
-    return NULL;
 
-  /* We then detect if the operand of abs_expr is defined by a minus_expr.  */
-  stmt_vec_info diff_stmt_vinfo = vect_get_internal_def (vinfo, abs_oprnd);
-  if (!diff_stmt_vinfo)
+  vect_unpromoted_value unprom[2];
+  if (!vect_recog_absolute_difference (vinfo, abs_stmt, &half_type,
+                                       true, unprom, NULL))
     return NULL;
 
-  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
-     inside the loop (in case we are analyzing an outer-loop).  */
-  vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
-                             false, 2, unprom, &half_type))
+  if (!half_type)
     return NULL;
 
   vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
@@ -1363,6 +1419,137 @@ vect_recog_sad_pattern (vec_info *vinfo,
   return pattern_stmt;
 }
 
+/* Function vect_recog_abd_pattern
+
+   Try to find the following ABsolute Difference (ABD) pattern:
+
+   VTYPE x, y, out;
+   type diff;
+   loop i in range:
+   S1 diff = x[i] - y[i]
+   S2 out[i] = ABS_EXPR <diff>;
+
+   where 'type' is an integer and 'VTYPE' is a vector of integers
+   the same size as 'type'
+
+   Input:
+
+   * STMT_VINFO: The stmt from which the pattern search begins
+
+   Output:
+
+   * TYPE_out: The type of the output of this pattern
+
+   * Return value: A new stmt that will be used to replace the sequence of
+     stmts that constitute the pattern; either SABD or UABD:
+        SABD_EXPR
+        UABD_EXPR
+
+      UABD expressions are used when the input types are
+      narrower than the output types or the output type is narrower
+      than 32 bits
+ */
+
+static gimple *
+vect_recog_abd_pattern (vec_info *vinfo,
+                        stmt_vec_info stmt_vinfo, tree *type_out)
+{
+  /* Look for the following patterns
+        X = x[i]
+        Y = y[i]
+        DIFF = X - Y
+        DAD = ABS_EXPR<DIFF>
+        out[i] = DAD
+
+     In which
+      - X, Y, DIFF, DAD all have the same type
+      - x, y, out are all vectors of the same type
+  */
+  gassign *last_stmt = dyn_cast <gassign *> (STMT_VINFO_STMT (stmt_vinfo));
+  if (!last_stmt)
+    return NULL;
+
+  tree out_type = TREE_TYPE (gimple_assign_lhs (last_stmt));
+
+  gassign *abs_stmt = last_stmt;
+  if (gimple_assign_cast_p (last_stmt))
+    {
+      tree last_rhs = gimple_assign_rhs1 (last_stmt);
+      if (!SSA_VAR_P (last_rhs))
+        return NULL;
+
+      abs_stmt = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (last_rhs));
+      if (!abs_stmt)
+        return NULL;
+    }
+
+  vect_unpromoted_value unprom[2];
+  tree diff_oprnds[2];
+  tree half_type;
+  if (!vect_recog_absolute_difference (vinfo, abs_stmt, &half_type,
+                                       false, unprom, diff_oprnds))
+    return NULL;
+
+#define SAME_TYPE(A, B) (TYPE_PRECISION (A) == TYPE_PRECISION (B))
+
+  tree abd_oprnds[2];
+  if (half_type)
+    {
+      if (!SAME_TYPE (unprom[0].type, unprom[1].type))
+        return NULL;
+
+      tree diff_type = TREE_TYPE (diff_oprnds[0]);
+      if (TYPE_PRECISION (out_type) != TYPE_PRECISION (diff_type))
+        {
+          vect_convert_inputs (vinfo, stmt_vinfo, 2, abd_oprnds, half_type, unprom,
+                               get_vectype_for_scalar_type (vinfo, half_type));
+        }
+      else
+        {
+          abd_oprnds[0] = diff_oprnds[0];
+          abd_oprnds[1] = diff_oprnds[1];
+        }
+    }
+  else
+    {
+      if (unprom[0].op && unprom[1].op
+          && (!SAME_TYPE (unprom[0].type, unprom[1].type)
+              || !SAME_TYPE (unprom[0].type, out_type)))
+        return NULL;
+
+      unprom[0].op = diff_oprnds[0];
+      unprom[1].op = diff_oprnds[1];
+      tree signed_out = signed_type_for (out_type);
+      tree signed_out_vectype = get_vectype_for_scalar_type (vinfo, signed_out);
+      vect_convert_inputs (vinfo, stmt_vinfo, 2, abd_oprnds,
+                           signed_out, unprom, signed_out_vectype);
+
+      if (!SAME_TYPE (TREE_TYPE (diff_oprnds[0]), TREE_TYPE (abd_oprnds[0])))
+        return NULL;
+    }
+
+  if (!SAME_TYPE (TREE_TYPE (abd_oprnds[0]), TREE_TYPE (abd_oprnds[1]))
+      || !SAME_TYPE (TREE_TYPE (abd_oprnds[0]), out_type))
+    return NULL;
+
+  vect_pattern_detected ("vect_recog_abd_pattern", last_stmt);
+
+  tree vectype = get_vectype_for_scalar_type (vinfo, out_type);
+  if (!vectype
+      || !direct_internal_fn_supported_p (IFN_ABD, vectype,
+                                          OPTIMIZE_FOR_SPEED))
+    return NULL;
+
+  *type_out = STMT_VINFO_VECTYPE (stmt_vinfo);
+
+  tree var = vect_recog_temp_ssa_var (out_type, NULL);
+  gcall *abd_stmt = gimple_build_call_internal (IFN_ABD, 2,
+                                                abd_oprnds[0], abd_oprnds[1]);
+  gimple_call_set_lhs (abd_stmt, var);
+  gimple_set_location (abd_stmt, gimple_location (last_stmt));
+  return abd_stmt;
+}
+
 /* Recognize an operation that performs ORIG_CODE on widened inputs,
    so that it can be treated as though it had the form:
 
@@ -6439,6 +6626,7 @@ struct vect_recog_func
 static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_bitfield_ref_pattern, "bitfield_ref" },
   { vect_recog_bit_insert_pattern, "bit_insert" },
+  { vect_recog_abd_pattern, "abd" },
   { vect_recog_over_widening_pattern, "over_widening" },
   /* Must come after over_widening, which narrows the shift as much as
      possible beforehand.  */
-- 
2.25.1