From patchwork Mon Oct 31 11:56:42 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 13241
Date: Mon, 31 Oct 2022 11:56:42 +0000
From: Tamar Christina
Reply-To: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de
Subject: [PATCH 1/8]middle-end: Recognize scalar reductions from bitfields and array_refs
List-Id: Gcc-patches mailing list

Hi All,

This patch series adds recognition of pairwise operations (reductions)
in match.pd so that we can benefit from them even at -O1, when the
vectorizer isn't enabled.

The use of these allows for much simpler codegen in AArch64 and lets us
avoid quite a lot of codegen warts.

As an example a simple:

typedef float v4sf __attribute__((vector_size (16)));

float
foo3 (v4sf x)
{
  return x[1] + x[2];
}

currently generates:

foo3:
        dup     s1, v0.s[1]
        dup     s0, v0.s[2]
        fadd    s0, s1, s0
        ret

while with this patch series it now generates:

foo3:
        ext     v0.16b, v0.16b, v0.16b, #4
        faddp   s0, v0.2s
        ret

This patch will not perform the operation if the source is not a gimple
register and leaves memory sources to the vectorizer as it's able to deal
correctly with clobbers.

The use of these instructions makes a significant difference in codegen
quality for AArch64 and Arm.

NOTE: The last entry in the series contains tests for all of the previous
patches as it's a bit of an all or nothing thing.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* match.pd (adjacent_data_access_p): Import.
	Add new pattern for bitwise plus, min, max, fmax, fmin.
	* tree-cfg.cc (verify_gimple_call): Allow function arguments in IFNs.
	* tree.cc (adjacent_data_access_p): New.
	* tree.h (adjacent_data_access_p): New.

--- inline copy of patch --
diff --git a/gcc/match.pd b/gcc/match.pd
index 2617d56091dfbd41ae49f980ee0af3757f5ec1cf..aecaa3520b36e770d11ea9a10eb18db23c0cd9f7 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -39,7 +39,8 @@ along with GCC; see the file COPYING3. If not see
    HONOR_NANS
    uniform_vector_p
    expand_vec_cmp_expr_p
-   bitmask_inv_cst_vector_p)
+   bitmask_inv_cst_vector_p
+   adjacent_data_access_p)
 
 /* Operator lists. */
 (define_operator_list tcc_comparison
@@ -7195,6 +7196,47 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* Canonicalizations of BIT_FIELD_REFs. */
 
+/* Canonicalize BIT_FIELD_REFS to pairwise operations. */
+(for op (plus min max FMIN_ALL FMAX_ALL)
+     ifn (IFN_REDUC_PLUS IFN_REDUC_MIN IFN_REDUC_MAX
+          IFN_REDUC_FMIN IFN_REDUC_FMAX)
+ (simplify
+  (op @0 @1)
+   (if (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type))
+    (with { poly_uint64 nloc = 0;
+            tree src = adjacent_data_access_p (@0, @1, &nloc, true);
+            tree ntype = build_vector_type (type, 2);
+            tree size = TYPE_SIZE (ntype);
+            tree pos = build_int_cst (TREE_TYPE (size), nloc);
+            poly_uint64 _sz;
+            poly_uint64 _total; }
+     (if (src && is_gimple_reg (src) && ntype
+          && poly_int_tree_p (size, &_sz)
+          && poly_int_tree_p (TYPE_SIZE (TREE_TYPE (src)), &_total)
+          && known_ge (_total, _sz + nloc))
+      (ifn (BIT_FIELD_REF:ntype { src; } { size; } { pos; })))))))
+
+(for op (lt gt)
+     ifni (IFN_REDUC_MIN IFN_REDUC_MAX)
+     ifnf (IFN_REDUC_FMIN IFN_REDUC_FMAX)
+ (simplify
+  (cond (op @0 @1) @0 @1)
+   (if (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type))
+    (with { poly_uint64 nloc = 0;
+            tree src = adjacent_data_access_p (@0, @1, &nloc, false);
+            tree ntype = build_vector_type (type, 2);
+            tree size = TYPE_SIZE (ntype);
+            tree pos = build_int_cst (TREE_TYPE (size), nloc);
+            poly_uint64 _sz;
+            poly_uint64 _total; }
+     (if (src && is_gimple_reg (src) && ntype
+          && poly_int_tree_p (size, &_sz)
+          && poly_int_tree_p (TYPE_SIZE (TREE_TYPE (src)), &_total)
+          && known_ge (_total, _sz + nloc))
+      (if (SCALAR_FLOAT_MODE_P (TYPE_MODE (type)))
+       (ifnf (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))
+       (ifni (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))))))))
+
 (simplify
  (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
  (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); }))
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 91ec33c80a41e1e0cc6224e137dd42144724a168..b19710392940cf469de52d006603ae1e3deb6b76 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -3492,6 +3492,7 @@ verify_gimple_call (gcall *stmt)
         {
           tree arg = gimple_call_arg (stmt, i);
           if ((is_gimple_reg_type (TREE_TYPE (arg))
+               && !is_gimple_variable (arg)
                && !is_gimple_val (arg))
               || (!is_gimple_reg_type (TREE_TYPE (arg))
                   && !is_gimple_lvalue (arg)))
diff --git a/gcc/tree.h b/gcc/tree.h
index e6564aaccb7b69cd938ff60b6121aec41b7e8a59..8f8a9660c9e0605eb516de194640b8c1b531b798 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5006,6 +5006,11 @@ extern bool integer_pow2p (const_tree);
 
 extern tree bitmask_inv_cst_vector_p (tree);
 
+/* TRUE if the two operands represent adjacent access of data such that a
+   pairwise operation can be used. */
+
+extern tree adjacent_data_access_p (tree, tree, poly_uint64*, bool);
+
 /* integer_nonzerop (tree x) is nonzero if X is an integer constant
    with a nonzero value. */
 
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 007c9325b17076f474e6681c49966c59cf6b91c7..5315af38a1ead89ca5f75dc4b19de9841e29d311 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -10457,6 +10457,90 @@ bitmask_inv_cst_vector_p (tree t)
   return builder.build ();
 }
 
+/* Returns base address if the two operands represent adjacent access of data
+   such that a pairwise operation can be used. OP1 must be a lower subpart
+   than OP2. If POS is not NULL then on return if a value is returned POS
+   will indicate the position of the lower address. If COMMUTATIVE_P then
+   the operation is also tried by flipping op1 and op2. */
+
+tree adjacent_data_access_p (tree op1, tree op2, poly_uint64 *pos,
+                             bool commutative_p)
+{
+  gcc_assert (op1);
+  gcc_assert (op2);
+  if (TREE_CODE (op1) != TREE_CODE (op2)
+      || TREE_TYPE (op1) != TREE_TYPE (op2))
+    return NULL;
+
+  tree type = TREE_TYPE (op1);
+  gimple *stmt1 = NULL, *stmt2 = NULL;
+  unsigned int bits = GET_MODE_BITSIZE (GET_MODE_INNER (TYPE_MODE (type)));
+
+  if (TREE_CODE (op1) == BIT_FIELD_REF
+      && operand_equal_p (TREE_OPERAND (op1, 0), TREE_OPERAND (op2, 0), 0)
+      && operand_equal_p (TREE_OPERAND (op1, 1), TREE_OPERAND (op2, 1), 0)
+      && known_eq (bit_field_size (op1), bits))
+    {
+      poly_uint64 offset1 = bit_field_offset (op1);
+      poly_uint64 offset2 = bit_field_offset (op2);
+      if (known_eq (offset2 - offset1, bits))
+        {
+          if (pos)
+            *pos = offset1;
+          return TREE_OPERAND (op1, 0);
+        }
+      else if (commutative_p && known_eq (offset1 - offset2, bits))
+        {
+          if (pos)
+            *pos = offset2;
+          return TREE_OPERAND (op1, 0);
+        }
+    }
+  else if (TREE_CODE (op1) == ARRAY_REF
+           && operand_equal_p (get_base_address (op1), get_base_address (op2)))
+    {
+      wide_int size1 = wi::to_wide (array_ref_element_size (op1));
+      wide_int size2 = wi::to_wide (array_ref_element_size (op2));
+      if (wi::ne_p (size1, size2) || wi::ne_p (size1, bits / 8)
+          || !tree_fits_poly_uint64_p (TREE_OPERAND (op1, 1))
+          || !tree_fits_poly_uint64_p (TREE_OPERAND (op2, 1)))
+        return NULL;
+
+      poly_uint64 offset1 = tree_to_poly_uint64 (TREE_OPERAND (op1, 1));
+      poly_uint64 offset2 = tree_to_poly_uint64 (TREE_OPERAND (op2, 1));
+      if (known_eq (offset2 - offset1, 1UL))
+        {
+          if (pos)
+            *pos = offset1 * bits;
+          return TREE_OPERAND (op1, 0);
+        }
+      else if (commutative_p && known_eq (offset1 - offset2, 1UL))
+        {
+          if (pos)
+            *pos = offset2 * bits;
+          return TREE_OPERAND (op1, 0);
+        }
+    }
+  else if (TREE_CODE (op1) == SSA_NAME
+           && (stmt1 = SSA_NAME_DEF_STMT (op1)) != NULL
+           && (stmt2 = SSA_NAME_DEF_STMT (op2)) != NULL
+           && is_gimple_assign (stmt1)
+           && is_gimple_assign (stmt2))
+    {
+      if (gimple_assign_rhs_code (stmt1) != ARRAY_REF
+          && gimple_assign_rhs_code (stmt1) != BIT_FIELD_REF
+          && gimple_assign_rhs_code (stmt2) != ARRAY_REF
+          && gimple_assign_rhs_code (stmt2) != BIT_FIELD_REF)
+        return NULL;
+
+      return adjacent_data_access_p (gimple_assign_rhs1 (stmt1),
+                                     gimple_assign_rhs1 (stmt2), pos,
+                                     commutative_p);
+    }
+
+  return NULL;
+}
+
 /* If VECTOR_CST T has a single nonzero element, return the index of that
    element, otherwise return -1. */
 
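
For readers who want to experiment with the adjacency rule outside of GCC, the following is a minimal standalone sketch and is not part of the patch. It models only the BIT_FIELD_REF case of adjacent_data_access_p, together with the known_ge (_total, _sz + nloc) guard from match.pd. BitAccess, adjacent_pair_offset and pair_fits are hypothetical names invented for this illustration, and plain 64-bit integers stand in for poly_uint64 and for the tree operands.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a BIT_FIELD_REF: a read of `size` bits at bit
// offset `offset` from a base object identified by `base_id`.
struct BitAccess {
  int base_id;
  std::uint64_t offset;
  std::uint64_t size;
};

// Returns the bit offset of the lower of two adjacent, equally sized reads
// of the same base, or std::nullopt if they are not adjacent.  When
// `commutative` is false, `a` must be the lower read (as for lt/gt selects).
std::optional<std::uint64_t> adjacent_pair_offset(const BitAccess &a,
                                                  const BitAccess &b,
                                                  bool commutative) {
  if (a.base_id != b.base_id || a.size != b.size)
    return std::nullopt;
  if (b.offset >= a.offset && b.offset - a.offset == a.size)
    return a.offset;
  if (commutative && a.offset >= b.offset && a.offset - b.offset == a.size)
    return b.offset;
  return std::nullopt;
}

// Analogue of the known_ge (_total, _sz + nloc) guard: the two-element
// window starting at bit `pos` must lie inside a source of `total_bits`.
bool pair_fits(std::uint64_t total_bits, std::uint64_t elem_bits,
               std::uint64_t pos) {
  return total_bits >= 2 * elem_bits + pos;
}
```

In the foo3 example, x[1] and x[2] are 32-bit reads at bit offsets 32 and 64 of a 128-bit vector, so the sketch reports a pair starting at bit 32 that fits in the source. A pair starting at bit 96 would be rejected by pair_fits, since the two-element window would run past the end of the vector.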
From patchwork Mon Oct 31 11:57:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 13239
Date: Mon, 31 Oct 2022 11:57:12 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 2/8]middle-end: Recognize scalar widening reductions
7Gq7iRX5VFNjQeqE6OCoxD1xbxRtXtwmk57Q5gqJJcjNrklN0CNKwUlblYqmr5gwDqTopDgrFe/Dd4JmrQ3VzESHWI6dQw98cegHktFcR8IvE3JyAE2RC+3vlOxgeJT/4BDrna4Vj0IhGbTeuoqYE9qlq/VFELW0N4WF7FeN8dZP9GerYguPByYylNDh36sQcrAZhZknwZxG0bruuw7LBWtHLDvDKVravzvIQou07o3aALI1vNGmKa/kN/8MFZz/POiLyggM94qge1ZycuJOMehTkF3MbkymTV2quEJorJw8pB5qqo4RBqLIOrOleNKBn/0P/GAOa0zfZq+Edo4BIYx87F96HZvcwecSkDnZOfd0b/K8RIXVZWGnoD0lC/OP00vuSrTchxdc/QxSkjINw7+VEkCKqmWQmCkDYp3Fy2Rwl9wesU/oMy8hm6Bs6d40CLLFQh0c1VBHk4fnsGvjteMtjcqg8fyDeoR5EHhBMHp2YVGJW6HhzTAFrbskuPjgTn1mk5lKhA1mI4em8ZSsKqefFL4X7XvwWV8ErYYHs6f3eQabKTCoQvv3dMdRDKr/lJWoJej8shRm2bqktfzmV04Lqo8+8AlMudd9hIYGDlUvhhTY+2FjrhvVomXyIFmJousGxIQBffBlzSYqDkeqHj7Vvv3shcYG0CGSOjweowmiv3KP27YyOx1RrUMoDUArlJz8JhJlvOBT0O0CDL2C3StOsqFzmeZEBH/NhcD8GX9IqkPvRO+2qG1YKbz1eHvjb2JwfuCH5S/XhkEjbrtTyL/nq4vD3AxdrbysUuXXWUk= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230022)(4636009)(346002)(376002)(39860400002)(136003)(396003)(366004)(451199015)(478600001)(8676002)(38100700002)(6486002)(316002)(6666004)(86362001)(6512007)(26005)(4743002)(66476007)(4326008)(41300700001)(186003)(235185007)(2906002)(33964004)(66946007)(2616005)(8936002)(6916009)(66556008)(36756003)(5660300002)(44832011)(44144004)(6506007)(67856001)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR08MB8717 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT063.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 87578d1b-655c-4f7d-f979-08dabb371055 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
1RDKzcoUK+iNXv9aejZ0Wl/fgOQ6HKqumG+iTrrebVX2duuxqPyMw2j6Qqy9OYWEJ9uo7bYxVyZVwqJl1wkM/KC60A/2V8Ev9dn9OTxXhQzKnQN2R+tXNxYyGGE/mM65rC+hlQ0n9wDGL73KUonJuNlANYPBHf+bukZ/tuiPe0/4NFsAhvznatDWtd12eabE3SQb2c90Vd2aMLBQv9zYjBrAOlkBOvDZ7kmowqHhE6uwvWUD0DPKWCWR2Lz/ZVie1sbHh+57J+TAKrebd97jMn0gF9SBLkAnjFwv2B5in/lQCFKt/pj06htLPu0XEjJKEM52Fw1PePAOf+e0otJy3RDQeX2GogHtgH99C8kuwantr7BawqwjtzivxNS/Vu1VJTjeEElD5/10EEtMdIn4x5PNxm2rRG9uZiYCdUhMiR5iNrMOG8L1bFsJaueisJxOarsfx0shdUDpYs50R2DgsofmqS3Lgpb1FXOdZReRG2Rt9WwGf4bhWxqOXJGXlTQQ8AuQ5EbB75xo/EoUDc8gDPuWFY3+4RXkmuborqailB/1j6SObwuckXwUEw9KOTTsY8XVDT8WIRtAX4x+JoDnVoTN2TPF9QDGudbF2ggprMwFWsw1He+tAX0ELW//lda10Obm8x2dlXzlntNv73ZB7eNS8fZewI1H3f0P3KShPUNBRzhVvbbrd+NW7RliQfkXxoHxLhj9kTM4WgadCXTp66XmXS/q3FQcOxlEK5Vz+P7Fc7vNDrK+FVocK+23VTcj4EzQ059pLMlC8o05wtWWIi1Iz4ytJU8sI3N7fXXYm/Ky7I4Hsr5SieNE5No4NOnkFUiAUaxsKOYLo5cK2LTxGA== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230022)(4636009)(346002)(376002)(39860400002)(136003)(396003)(451199015)(46966006)(40470700004)(36840700001)(36860700001)(47076005)(86362001)(82310400005)(356005)(81166007)(82740400003)(41300700001)(2906002)(6506007)(235185007)(44832011)(8676002)(40480700001)(4326008)(70586007)(70206006)(5660300002)(8936002)(316002)(6916009)(4743002)(6512007)(40460700003)(2616005)(26005)(336012)(186003)(107886003)(6486002)(478600001)(33964004)(44144004)(6666004)(36756003)(2700100001)(67856001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 31 Oct 2022 11:57:30.4229 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 189bdc31-6b7c-4796-b8ef-08dabb3716aa X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; 
Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT063.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAXPR08MB6655 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar Christina Cc: nd@arm.com, rguenther@suse.de Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748204267686502981?= X-GMAIL-MSGID: =?utf-8?q?1748204267686502981?=

Hi All,

This adds new optabs and an IFN for REDUC_PLUS_WIDEN, where the resulting
scalar reduction has twice the precision of the input elements.

In a later patch I will also teach the vectorizer to recognize this builtin,
once I figure out how the various bits of reductions work.  For now it is
generated only by the match.pd pattern.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* internal-fn.def (REDUC_PLUS_WIDEN): New.
	* doc/md.texi: Document it.
	* match.pd: Recognize widening plus.
	* optabs.def (reduc_splus_widen_scal_optab,
	reduc_uplus_widen_scal_optab): New.
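As a rough illustration of the expression the match.pd pattern in this patch looks for, here is a minimal scalar sketch in C. It models `(convert (IFN_REDUC_PLUS @0))` where the converted type has twice the element precision and the same signedness. The names `reduc_plus_u8`, `convert_of_reduc_plus_u8` and `NELTS`, and the choice of 8-bit elements with a 16-bit result, are assumptions made for the example and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical vector length, chosen only for this sketch.  */
#define NELTS 8

/* Scalar model of a plain REDUC_PLUS over unsigned 8-bit elements:
   the sum is formed in the element type, so it wraps modulo 256.  */
static uint8_t
reduc_plus_u8 (const uint8_t v[NELTS])
{
  uint8_t sum = 0;
  for (int i = 0; i < NELTS; ++i)
    sum = (uint8_t) (sum + v[i]);
  return sum;
}

/* Scalar model of the matched expression: the REDUC_PLUS result is
   converted to an unsigned type with twice the element precision.  */
static uint16_t
convert_of_reduc_plus_u8 (const uint8_t v[NELTS])
{
  /* Mirrors the pattern's check that the result precision is exactly
     twice the element precision.  */
  assert (sizeof (uint16_t) == 2 * sizeof (uint8_t));
  return (uint16_t) reduc_plus_u8 (v);
}
```

Calling `convert_of_reduc_plus_u8` on the elements 1 through 8 yields 36. Note that in this model the sum is formed in the element type before the conversion, which is how the matched input expression reads.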
--- inline copy of patch --
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 34825549ed4e315b07d36dc3d63bae0cc0a3932d..c08691ab4c9a4bfe55ae81e5e228a414d6242d78 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -5284,6 +5284,20 @@ Compute the sum of the elements of a vector. The vector is operand 1, and operand 0 is the scalar result, with mode equal to the mode of the elements of the input vector. +@cindex @code{reduc_uplus_widen_scal_@var{m}} instruction pattern +@item @samp{reduc_uplus_widen_scal_@var{m}} +Compute the sum of the elements of a vector and zero-extend @var{m} to a mode +that has twice the precision of @var{m}. The vector is operand 1, and +operand 0 is the scalar result, with mode equal to twice the precision of the +mode of the elements of the input vector. + +@cindex @code{reduc_splus_widen_scal_@var{m}} instruction pattern +@item @samp{reduc_splus_widen_scal_@var{m}} +Compute the sum of the elements of a vector and sign-extend @var{m} to a mode +that has twice the precision of @var{m}. The vector is operand 1, and +operand 0 is the scalar result, with mode equal to twice the precision of the +mode of the elements of the input vector.
+ @cindex @code{reduc_and_scal_@var{m}} instruction pattern @item @samp{reduc_and_scal_@var{m}} @cindex @code{reduc_ior_scal_@var{m}} instruction pattern diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 5e672183f4def9d0cdc29cf12fe17e8cff928f9f..f64a8421b1087b6c0f3602dc556876b0fd15c7ad 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -215,6 +215,9 @@ DEF_INTERNAL_OPTAB_FN (RSQRT, ECF_CONST, rsqrt, unary) DEF_INTERNAL_OPTAB_FN (REDUC_PLUS, ECF_CONST | ECF_NOTHROW, reduc_plus_scal, unary) +DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_PLUS_WIDEN, ECF_CONST | ECF_NOTHROW, + first, reduc_splus_widen_scal, + reduc_uplus_widen_scal, unary) DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MAX, ECF_CONST | ECF_NOTHROW, first, reduc_smax_scal, reduc_umax_scal, unary) DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MIN, ECF_CONST | ECF_NOTHROW, first, diff --git a/gcc/match.pd b/gcc/match.pd index aecaa3520b36e770d11ea9a10eb18db23c0cd9f7..1d407414bee278c64c00d425d9f025c1c58d853d 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -7237,6 +7237,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (ifnf (BIT_FIELD_REF:ntype { src; } { size; } { pos; })) (ifni (BIT_FIELD_REF:ntype { src; } { size; } { pos; })))))))) +/* Widening reduction conversions. 
*/ +(simplify + (convert (IFN_REDUC_PLUS @0)) + (if (element_precision (TREE_TYPE (@0)) * 2 == element_precision (type) + && TYPE_UNSIGNED (type) == TYPE_UNSIGNED (TREE_TYPE (@0)) + && ANY_INTEGRAL_TYPE_P (type) && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@0))) + (IFN_REDUC_PLUS_WIDEN @0))) + (simplify (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4) (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); })) diff --git a/gcc/optabs.def b/gcc/optabs.def index a6db2342bed6baf13ecbd84112c8432c6972e6fe..9947aed67fb8a3b675cb0aab9aeb059f89644106 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -346,6 +346,8 @@ OPTAB_D (reduc_fmin_scal_optab, "reduc_fmin_scal_$a") OPTAB_D (reduc_smax_scal_optab, "reduc_smax_scal_$a") OPTAB_D (reduc_smin_scal_optab, "reduc_smin_scal_$a") OPTAB_D (reduc_plus_scal_optab, "reduc_plus_scal_$a") +OPTAB_D (reduc_splus_widen_scal_optab, "reduc_splus_widen_scal_$a") +OPTAB_D (reduc_uplus_widen_scal_optab, "reduc_uplus_widen_scal_$a") OPTAB_D (reduc_umax_scal_optab, "reduc_umax_scal_$a") OPTAB_D (reduc_umin_scal_optab, "reduc_umin_scal_$a") OPTAB_D (reduc_and_scal_optab, "reduc_and_scal_$a")
From patchwork Mon Oct 31 11:57:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 13240 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp2266944wru; Mon, 31 Oct 2022 04:59:00 -0700 (PDT) X-Google-Smtp-Source: AMsMyM52S9WrMex7oXCrxF5JK7IGTUJRPha4vJ9sRokcEKy2q3O4n/bqF+NxOyFHZgImw6ZmZtt6 X-Received: by 2002:a17:907:1de0:b0:7a7:6a8:1e61 with SMTP id og32-20020a1709071de000b007a706a81e61mr12417013ejc.468.1667217540339; Mon, 31 Oct 2022 04:59:00 -0700 (PDT) Received: from sourceware.org (server2.sourceware.org.
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id s4-20020a50d484000000b00447dfae6181si7330888edi.235.2022.10.31.04.59.00 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 31 Oct 2022 04:59:00 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="dX/X44Qj"; arc=fail (signature failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 305953853816 for ; Mon, 31 Oct 2022 11:58:47 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 305953853816 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1667217527; bh=xWzAevfcmxmIehbQ96I1Iiq22u8sQ8kFIWZiKXqtP3A=; h=Date:To:Subject:In-Reply-To:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=dX/X44QjXaPzC6cgn8xumoHmmCB4b8ArHyuuL2HjxzzZAtj6n/uf/JGydRMRL4yhj lsFWI/9tXwCy9fiOoMiEoYrugL80ak0DTZw1JItNIROOuNLRn2L2SYz4FHHIktwm0P 19o/P5Q+KjfSBE7yj+/xZJqU69+HbxA6++sBoweQ= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM7-obe.outbound.protection.outlook.com (mail-am7eur03on2089.outbound.protection.outlook.com [40.107.105.89]) by sourceware.org (Postfix) with ESMTPS id 5CBEF3854811 for ; Mon, 31 Oct 2022 11:57:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5CBEF3854811 ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; 
b=BBTv1a3v5Cj02IEqRATC1iDfyREAHp9E5qRbtM0lkbYcOD2drpvz+UH1RTJ7IcWQhxVHEcK9KOQrDgAKw6bfI61y1r47g6WcGAd+kcvlWFXBrCpNb7xEh4TBGysqssj+hvK5E+hyoIibcoQvp0gvlj3YtnMSIWbbMIexUeiQjxaLowIvi5ZpMJRxZ1dySVtuaDO5fFiuOtruCZeMKZypEADH8HenKU+iAuJt/jFfpQFS8D10VOq3YTzS8AGjUfZ8un4HmAazQkbOldsBDxFcDUc/iDsrEfC3mRFCYg+aecrWRzI9Hfn9O2QPmO4Ax8UW8VdbMGpsemtsPFuffSERig== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=xWzAevfcmxmIehbQ96I1Iiq22u8sQ8kFIWZiKXqtP3A=; b=KUo/EQeXu8Tdt1GsuxngCK5Y+T+g5L1cXtT6k/p/RvlXY+f+NQ/IZs2FGGv3Vcagjx1/dh2Nz5a6qJQPKjJ2LbnxvA0WF/+jYehSMfa6Jda324AJC9pnroW/e7VoQ7MSpEnc04RF4Ogo6n2RXLEo0TH4ZbJbaHz4TUMw2pVhOWZ8KQxL6C4WeXEi6oMfPAvZmZG+4rygt4xybdL9VUkqlTbomNpcv84P0Mh2jrUFlw9twi+NmPIyz+nNYBe1JstA64tpSpo0ufoVs9vEGAUuvdK2suCd8+zQx0bBAA6AoqZUneeKr7/e6jOG6nQ0/ThxdnrGnO5XuJXks7nWddkbqQ== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) Received: from DB6PR0802CA0037.eurprd08.prod.outlook.com (2603:10a6:4:a3::23) by GVXPR08MB7703.eurprd08.prod.outlook.com (2603:10a6:150:6b::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.19; Mon, 31 Oct 2022 11:57:56 +0000 Received: from DBAEUR03FT021.eop-EUR03.prod.protection.outlook.com (2603:10a6:4:a3:cafe::d5) by DB6PR0802CA0037.outlook.office365.com (2603:10a6:4:a3::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.19 via Frontend Transport; Mon, 31 Oct 
2022 11:57:55 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DBAEUR03FT021.mail.protection.outlook.com (100.127.142.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14 via Frontend Transport; Mon, 31 Oct 2022 11:57:55 +0000 Received: ("Tessian outbound 58faf9791229:v130"); Mon, 31 Oct 2022 11:57:55 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 6dcb4d70b20071fd X-CR-MTA-TID: 64aa7808 Received: from 3f5959128ad8.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 3895C469-ADEB-42BB-A8D0-B57348B8FEDD.1; Mon, 31 Oct 2022 11:57:48 +0000 Received: from EUR05-DB8-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 3f5959128ad8.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Mon, 31 Oct 2022 11:57:48 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=XjwVwANzkYrX++qJaUSKGhKhcPzoOl6kIVq+NGOBF7+w+K3FcI8rp/oW3NrnPD/tUrSUa2iAauzndC/tkx3Zei3tH6BH/EF62IpyaH+6jENPlkEDUAqphD3yqUlnRycKFLlOXIxO+WjatzzLjar3X0papyUny0pJVj58eAPt45BnWAu0sk+Lb1j5OyxwBhgEOpSeOlcpwLJrElqPGXioo6N1X7C8JeSFBoRFBx4Qi8guvqQdc1eoyQiHGmxi8Bggor9+jQVBJ91vy+9XWtnpMwwzz+x0t1K2QE4rH0vx08gDmzvn/VMkFX67NI0JAHB4BjZv5y5TS3IVYUTAstTnOQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; 
bh=xWzAevfcmxmIehbQ96I1Iiq22u8sQ8kFIWZiKXqtP3A=; b=RbGsS4UQZD4jyFrqVzQMUY4UU15vLH0oEfhpXyp7NjnWkwmVasHa7pevf+bCnefb9Uc4VfCs5Qve7/mfOAGexkLJrzJI23fEYbRIaEfxPzS20Td4LnAsZCykNSwLpOjxkbsYPw8BXXwH1Y3/lygxUYyv7GD4uDgKL6v0bkyoXm6v+xU2UH4beAxDMhtZ6RgIzSN8o7fr58XemPUIjBErECznetTncJ4qmP6yuQ+WfJXlIwEpOCgOisxlACnxBs3vBPVWLxp/8nThfxEipmJhpf7+7pqIMB1eDnsS8ASnRrNDvq5RzhyWd7Qbr/l25lRGAZbPFXnClD2GJzULhpgetQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by DB9PR08MB8385.eurprd08.prod.outlook.com (2603:10a6:10:3da::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.16; Mon, 31 Oct 2022 11:57:47 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52%4]) with mapi id 15.20.5769.019; Mon, 31 Oct 2022 11:57:47 +0000 Date: Mon, 31 Oct 2022 11:57:42 +0000 To: gcc-patches@gcc.gnu.org Subject: [PATCH 3/8]middle-end: Support extractions of subvectors from arbitrary element position inside a vector Message-ID: Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO2P265CA0516.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:13b::23) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|DB9PR08MB8385:EE_|DBAEUR03FT021:EE_|GVXPR08MB7703:EE_ X-MS-Office365-Filtering-Correlation-Id: 94b9cbf5-2d93-4b28-4014-08dabb3725ce x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: 
UDhc20mtfcYfmTiTnUYBx5uH+d3T5eV14AwjAKzMv7zhZNccdUy7YX05PMh7CO+zpO3yv5xrngwGmmXi9xh75LJ5scWQRrOgSw8DwXr4Gl/iQhb8OJNeTh7EqpIr8eHWERwp9FXXr15uimni2374sK5Z8ic3E9bWiVPHl8voRjE4DA15NpJpY9w/ID5TfziR7veGhH7DCdes5JPdj0PXAtcqpe9AjhuLGP3LCew9VqVcoCGK4ebsdquKJUTwrgw1H5H7Pc+NiNEQW7dg3x06S0JHfVHXoC0afzHTWI4NYdDyo+Jd869d1nomT/bM59gfvO4NnsFin5OCBZfcDeY2KB/BZ8dDn7KDbhd5hIJxqjiwhyPlkiuszCYe0Pol4lx/MFUdNKQWcqttti162aIo3Y4UdMJXz2cBwbwMh6OKryHgdbaCCDH2bDCqrPOoM3R7UH2wtRdfLgwHi2zGQn15WpCjpNQmR3HDnPXspLQQYN9hV3Z+rU1vs/9rw7HsWXxyM99mjiSjptDbFD795OGnHNtlnMz2DCB5vixDzWT9kJ+6im+hEGTi3PN+oF7gHdK11NQB0BGXv4GWvNE4dEZgr6e0AtuHfCybFWxgzqNIo2fM0wjnX26DW0nwryDSyY+yY+ub/rrMaJ2BeWTAEKdMidOXRDLfJNRgA+afntZkP5bghyevYBFcomIVHEmsY08J3flZtfZNEQ9vQk077pQCFQbgifwwdDeRICO97GIeRUJMUFuOPBUinPTQVFklAe4I/8IYxEnEFY763VPa3sUAdJ90AVDP192ewh7DmXiEwdmytxoOgwux4Hh2SOZ9ECf9 X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230022)(4636009)(346002)(366004)(136003)(396003)(39860400002)(376002)(451199015)(66899015)(36756003)(84970400001)(38100700002)(5660300002)(44832011)(2906002)(235185007)(186003)(86362001)(4326008)(6916009)(4743002)(2616005)(478600001)(6666004)(8676002)(316002)(6486002)(66476007)(66556008)(66946007)(41300700001)(8936002)(26005)(33964004)(6512007)(44144004)(6506007)(67856001)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB9PR08MB8385 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT021.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: ca3693d2-cdd0-446c-b5a3-08dabb37207b X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
aQ+SqH0fOzIP+0WM6SWa09zcWnQGBgKxZl2F7AC+9jcD7yjEtUwOybMdEscrLOrjYRDqwOPxkcFiba/AteCOVnzTNnU0IialYbubpYhwX3IbjQR/UzdF7UR4uhKE/KTh2DEXOPTeOMNbeEIIUYimuv2RQqQ68at7J8I1alIb14XlvFVDgYAfO1ODiqHkrYQseX9aXGj8jyQqWCX3u/McofFST4HmWxGCqMcKNnFouZ378MUtTWhUxe7D+Dn4+2LRKEKF/sHTsP/HUBBOxm7XQWhDTTXFZ+aCGW04ydbm3W7jZoq/ahro7i7qDmHQFAY1y8NN1dJjGo45EPCcxQBxXLhBxW44JBINJK90Vu0MdC+bYRRqaP5vIjUzfK5XxmoGlM2rynlXZ349gU0BlrG06avhunTTOk1sjCN8mvfSJHTZ1Daa02xSCi0tS4lzb93VbesNZeKPzNnNrSUr3xfsWblH7yhOhVsn4kWWMNlg70/HhUQeZ51rByMxqMMt+pp69c16oXXQpJh5jdgMZOGj/gOAJwdJ0S4UbJqWainVBF5++99/5mOe5hxDzrAZRq7QxU2KjzEe29EJ1HjrL5rhaK5ivwIm+DUUbkJH53Kpq2x7OKZDUTLbQUjK4FDawXkl/z20HuhFJbxyREw2I0S49UyCGcw+K+D0qp3mIDA9mTcPnl/dA30FyXCzfvF0N0TQOXb17j1OUzHAzXwS25xreNWG3XocBj9FUHhIN8c3wo81hkkswfsKe93CE03LujCIIEz2SaQdKIu2+agob8kTMZQCasts7fqxpcei3QEDsARsSwagCBk2HaeF/SUiYO8zMcH5Y1CYaMVbcr8yid8BQ464BT/zKtQhbkZkXjGgKGg= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230022)(4636009)(136003)(39850400004)(396003)(376002)(346002)(451199015)(36840700001)(40470700004)(46966006)(8936002)(84970400001)(44832011)(235185007)(41300700001)(478600001)(82740400003)(40460700003)(5660300002)(316002)(2906002)(66899015)(70206006)(70586007)(4326008)(8676002)(6916009)(47076005)(81166007)(4743002)(6512007)(26005)(6486002)(36860700001)(6506007)(82310400005)(33964004)(44144004)(186003)(40480700001)(356005)(2616005)(336012)(107886003)(86362001)(36756003)(6666004)(67856001)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 31 Oct 2022 11:57:55.8059 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 94b9cbf5-2d93-4b28-4014-08dabb3725ce X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; 
Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT021.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: GVXPR08MB7703 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar Christina Cc: nd@arm.com, rguenther@suse.de Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748204299672086550?= X-GMAIL-MSGID: =?utf-8?q?1748204299672086550?=

Hi All,

The current vector extract pattern can only extract from a vector when the
position to extract is a multiple of the vector bitsize as a whole.

That means extracting something like a V2SI from a V4SI vector at bit
position 32 isn't possible, as 32 is not a multiple of 64.  Ideally this optab
should have worked on multiples of the element size, but too many targets rely
on the current semantics now.

So instead add a new case which allows any extraction as long as the bit
position is a multiple of the element size.  We use a VEC_PERM to shuffle the
elements into the bottom parts of the vector and then use a subreg to extract
the values out.
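The shuffle-then-subreg idea described above can be sketched on plain arrays. This is only a scalar model in C, not the RTL expansion: it assumes a V4SI source, a V2SI result, and that the low-part subreg corresponds to the lowest-numbered elements (as on little-endian lane numbering). The helper names `build_rotate_sel`, `vec_perm` and `extract_v2si` are hypothetical. The selector mirrors the loop in the patch, where element `i` of the permuted vector is taken from element `(i + pos) % nelts` of the input.

```c
#include <assert.h>
#include <stdint.h>

/* V4SI in this sketch: four 32-bit elements.  */
#define NELTS 4
#define ELT_BITS 32

/* Build the rotating selector: result element I comes from input
   element (I + POS) % NELTS.  */
static void
build_rotate_sel (unsigned pos, unsigned sel[NELTS])
{
  assert (pos < NELTS);
  for (unsigned i = 0; i < NELTS; ++i)
    sel[i] = (i + pos) % NELTS;
}

/* Model of a VEC_PERM whose two inputs are the same vector SRC.  */
static void
vec_perm (const uint32_t src[NELTS], const unsigned sel[NELTS],
          uint32_t dst[NELTS])
{
  for (unsigned i = 0; i < NELTS; ++i)
    dst[i] = src[sel[i]];
}

/* Extract two elements starting at bit position BITNUM: rotate the
   wanted elements to the bottom, then take the low half, which stands
   in for the low-part subreg.  */
static void
extract_v2si (const uint32_t src[NELTS], unsigned bitnum, uint32_t out[2])
{
  unsigned sel[NELTS];
  uint32_t tmp[NELTS];

  /* The new case only applies to element-aligned bit positions.  */
  assert (bitnum % ELT_BITS == 0);
  build_rotate_sel (bitnum / ELT_BITS, sel);
  vec_perm (src, sel, tmp);
  out[0] = tmp[0];
  out[1] = tmp[1];
}
```

With the input `{10, 20, 30, 40}` and bit position 32, the selector is `{1, 2, 3, 0}` and the extracted pair is `{20, 30}`, matching the `{x[1], x[2]}` form used in the `extract` testcase.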
The new case now allows various vector operations that were previously
decomposed into very inefficient scalar operations.

NOTE: I added 3 testcases but only fixed the 3rd one.

The 1st one is missed because we don't optimize VEC_PERM expressions into
bitfields.  The 2nd one is missed because extract_bit_field only works on
vector modes; in this case the intermediate extract is DImode.

On targets where the scalar mode is tieable to vector modes the extract should
work fine.

However, I ran out of time to fix the first two and so will do so in GCC 14.
For now this catches the case that my pattern now introduces more easily.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* expmed.cc (extract_bit_field_1): Add support for vector element
	extracts.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ext_1.c: New.

--- inline copy of patch --
diff --git a/gcc/expmed.cc b/gcc/expmed.cc index bab020c07222afa38305ef8d7333f271b1965b78..ffdf65210d17580a216477cfe4ac1598941ac9e4 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -1718,6 +1718,45 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, return target; } } + else if (!known_eq (bitnum, 0U) + && multiple_p (GET_MODE_UNIT_BITSIZE (tmode), bitnum, &pos)) + { + /* The encoding has a single stepped pattern.
*/ + poly_uint64 nunits = GET_MODE_NUNITS (new_mode); + int nelts = nunits.to_constant (); + vec_perm_builder sel (nunits, nelts, 1); + int delta = -pos.to_constant (); + for (int i = 0; i < nelts; ++i) + sel.quick_push ((i - delta) % nelts); + vec_perm_indices indices (sel, 1, nunits); + + if (can_vec_perm_const_p (new_mode, new_mode, indices, false)) + { + class expand_operand ops[4]; + machine_mode outermode = new_mode; + machine_mode innermode = tmode; + enum insn_code icode + = direct_optab_handler (vec_perm_optab, outermode); + target = gen_reg_rtx (outermode); + if (icode != CODE_FOR_nothing) + { + rtx sel = vec_perm_indices_to_rtx (outermode, indices); + create_output_operand (&ops[0], target, outermode); + ops[0].target = 1; + create_input_operand (&ops[1], op0, outermode); + create_input_operand (&ops[2], op0, outermode); + create_input_operand (&ops[3], sel, outermode); + if (maybe_expand_insn (icode, 4, ops)) + return simplify_gen_subreg (innermode, target, outermode, 0); + } + else if (targetm.vectorize.vec_perm_const != NULL) + { + if (targetm.vectorize.vec_perm_const (outermode, outermode, + target, op0, op0, indices)) + return simplify_gen_subreg (innermode, target, outermode, 0); + } + } + } } /* See if we can get a better vector mode before extracting. 
*/ diff --git a/gcc/testsuite/gcc.target/aarch64/ext_1.c b/gcc/testsuite/gcc.target/aarch64/ext_1.c new file mode 100644 index 0000000000000000000000000000000000000000..18a10a14f1161584267a8472e571b3bc2ddf887a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/ext_1.c @@ -0,0 +1,54 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include + +typedef unsigned int v4si __attribute__((vector_size (16))); +typedef unsigned int v2si __attribute__((vector_size (8))); + +/* +** extract: { xfail *-*-* } +** ext v0.16b, v0.16b, v0.16b, #4 +** ret +*/ +v2si extract (v4si x) +{ + v2si res = {x[1], x[2]}; + return res; +} + +/* +** extract1: { xfail *-*-* } +** ext v0.16b, v0.16b, v0.16b, #4 +** ret +*/ +v2si extract1 (v4si x) +{ + v2si res; + memcpy (&res, ((int*)&x)+1, sizeof(res)); + return res; +} + +typedef struct cast { + int a; + v2si b __attribute__((packed)); +} cast_t; + +typedef union Data { + v4si x; + cast_t y; +} data; + +/* +** extract2: +** ext v0.16b, v0.16b, v0.16b, #4 +** ret +*/ +v2si extract2 (v4si x) +{ + data d; + d.x = x; + return d.y.b; +} + --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -1718,6 +1718,45 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, return target; } } + else if (!known_eq (bitnum, 0U) + && multiple_p (GET_MODE_UNIT_BITSIZE (tmode), bitnum, &pos)) + { + /* The encoding has a single stepped pattern. 
*/ + poly_uint64 nunits = GET_MODE_NUNITS (new_mode); + int nelts = nunits.to_constant (); + vec_perm_builder sel (nunits, nelts, 1); + int delta = -pos.to_constant (); + for (int i = 0; i < nelts; ++i) + sel.quick_push ((i - delta) % nelts); + vec_perm_indices indices (sel, 1, nunits); + + if (can_vec_perm_const_p (new_mode, new_mode, indices, false)) + { + class expand_operand ops[4]; + machine_mode outermode = new_mode; + machine_mode innermode = tmode; + enum insn_code icode + = direct_optab_handler (vec_perm_optab, outermode); + target = gen_reg_rtx (outermode); + if (icode != CODE_FOR_nothing) + { + rtx sel = vec_perm_indices_to_rtx (outermode, indices); + create_output_operand (&ops[0], target, outermode); + ops[0].target = 1; + create_input_operand (&ops[1], op0, outermode); + create_input_operand (&ops[2], op0, outermode); + create_input_operand (&ops[3], sel, outermode); + if (maybe_expand_insn (icode, 4, ops)) + return simplify_gen_subreg (innermode, target, outermode, 0); + } + else if (targetm.vectorize.vec_perm_const != NULL) + { + if (targetm.vectorize.vec_perm_const (outermode, outermode, + target, op0, op0, indices)) + return simplify_gen_subreg (innermode, target, outermode, 0); + } + } + } } /* See if we can get a better vector mode before extracting. 
*/ diff --git a/gcc/testsuite/gcc.target/aarch64/ext_1.c b/gcc/testsuite/gcc.target/aarch64/ext_1.c new file mode 100644 index 0000000000000000000000000000000000000000..18a10a14f1161584267a8472e571b3bc2ddf887a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/ext_1.c @@ -0,0 +1,54 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include + +typedef unsigned int v4si __attribute__((vector_size (16))); +typedef unsigned int v2si __attribute__((vector_size (8))); + +/* +** extract: { xfail *-*-* } +** ext v0.16b, v0.16b, v0.16b, #4 +** ret +*/ +v2si extract (v4si x) +{ + v2si res = {x[1], x[2]}; + return res; +} + +/* +** extract1: { xfail *-*-* } +** ext v0.16b, v0.16b, v0.16b, #4 +** ret +*/ +v2si extract1 (v4si x) +{ + v2si res; + memcpy (&res, ((int*)&x)+1, sizeof(res)); + return res; +} + +typedef struct cast { + int a; + v2si b __attribute__((packed)); +} cast_t; + +typedef union Data { + v4si x; + cast_t y; +} data; + +/* +** extract2: +** ext v0.16b, v0.16b, v0.16b, #4 +** ret +*/ +v2si extract2 (v4si x) +{ + data d; + d.x = x; + return d.y.b; +} + From patchwork Mon Oct 31 11:58:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 13242 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp2268188wru; Mon, 31 Oct 2022 05:01:31 -0700 (PDT) X-Google-Smtp-Source: AMsMyM67NneMKkBogQd7NlTdBEm1Ze8arrzpMkME2FprCwBvEPnKcoZFUHmUbu5wq3qnlrpvRwmu X-Received: by 2002:a50:c302:0:b0:463:26d6:25fb with SMTP id a2-20020a50c302000000b0046326d625fbmr8450415edb.204.1667217691007; Mon, 31 Oct 2022 05:01:31 -0700 (PDT) Received: from sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id g18-20020a1709065d1200b0078e27ef9501si8300769ejt.750.2022.10.31.05.01.30 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 31 Oct 2022 05:01:31 -0700 (PDT)
Date: Mon, 31 Oct 2022 11:58:09 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 4/8]AArch64 aarch64: Implement widening reduction patterns
List-Id: Gcc-patches mailing list
From: Tamar Christina
Reply-To: Tamar Christina
Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com

Hi All,

This implements the new widening reduction optab in the backend.
Instead of introducing a duplicate definition for the same thing, I have
renamed the intrinsics definitions to use the same optab.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        * config/aarch64/aarch64-simd-builtins.def (saddlv, uaddlv): Rename to
        reduc_splus_widen_scal_ and reduc_uplus_widen_scal_ respectively.
        * config/aarch64/aarch64-simd.md (aarch64_addlv): Renamed to ...
        (reduc_plus_widen_scal_): ... This.
        * config/aarch64/arm_neon.h (vaddlv_s8, vaddlv_s16, vaddlv_u8,
        vaddlv_u16, vaddlvq_s8, vaddlvq_s16, vaddlvq_s32, vaddlvq_u8,
        vaddlvq_u16, vaddlvq_u32, vaddlv_s32, vaddlv_u32): Use it.
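
For readers unfamiliar with the term, a widening add reduction sums all lanes
of a vector into a scalar that is wider than the lane type, which is why
vaddlv_s8 returns int16_t for an int8x8_t input.  The following is only a
scalar model of that behaviour, under the assumption that the optab has the
same meaning as the vaddlv intrinsics; widen_add_reduce_s8 and
widen_add_reduce_u8 are hypothetical names and nothing here is GCC or
arm_neon.h code.

```c
#include <stddef.h>
#include <stdint.h>

/* Signed model: widen each 8-bit lane to 16 bits, then accumulate.
   With at most 16 lanes the sum always fits in int16_t.  */
static int16_t widen_add_reduce_s8(const int8_t *lanes, size_t n)
{
  int16_t sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum = (int16_t) (sum + lanes[i]);
  return sum;
}

/* Unsigned model: same idea with zero extension instead of sign
   extension.  */
static uint16_t widen_add_reduce_u8(const uint8_t *lanes, size_t n)
{
  uint16_t sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum = (uint16_t) (sum + lanes[i]);
  return sum;
}
```

Eight signed lanes of 127 reduce to 1016 and sixteen unsigned lanes of 255
reduce to 4080; neither value is representable in the 8-bit lane type, which
is the reason the result is widened.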

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index cf46b31627b84476a25762ffc708fd84a4086e43..a4b21e1495c5699d8557a4bcb9e73ef98ae60b35 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -190,9 +190,9 @@
   BUILTIN_VDQV_L (UNOP, saddlp, 0, NONE)
   BUILTIN_VDQV_L (UNOPU, uaddlp, 0, NONE)
 
-  /* Implemented by aarch64_addlv.  */
-  BUILTIN_VDQV_L (UNOP, saddlv, 0, NONE)
-  BUILTIN_VDQV_L (UNOPU, uaddlv, 0, NONE)
+  /* Implemented by reduc_plus_widen_scal_.  */
+  BUILTIN_VDQV_L (UNOP, reduc_splus_widen_scal_, 10, NONE)
+  BUILTIN_VDQV_L (UNOPU, reduc_uplus_widen_scal_, 10, NONE)
 
   /* Implemented by aarch64_abd.  */
   BUILTIN_VDQ_BHSI (BINOP, sabd, 0, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index cf8c094bd4b76981cef2dd5dd7b8e6be0d56101f..25aed74f8cf939562ed65a578fe32ca76605b58a 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3455,7 +3455,7 @@ (define_expand "reduc_plus_scal_v4sf"
   DONE;
 })
 
-(define_insn "aarch64_addlv"
+(define_insn "reduc_plus_widen_scal_"
  [(set (match_operand: 0 "register_operand" "=w")
        (unspec: [(match_operand:VDQV_L 1 "register_operand" "w")]
                     USADDLV))]
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index cf6af728ca99dae1cb6ab647466cfec32f7e913e..7b2c4c016191bcd6c3e075d27810faedb23854b7 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -3664,70 +3664,70 @@ __extension__ extern __inline int16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlv_s8 (int8x8_t __a)
 {
-  return __builtin_aarch64_saddlvv8qi (__a);
+  return __builtin_aarch64_reduc_splus_widen_scal_v8qi (__a);
 }
 
 __extension__ extern __inline int32_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlv_s16 (int16x4_t __a)
 {
-  return __builtin_aarch64_saddlvv4hi (__a);
+  return __builtin_aarch64_reduc_splus_widen_scal_v4hi (__a);
 }
 
 __extension__ extern __inline uint16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlv_u8 (uint8x8_t __a)
 {
-  return __builtin_aarch64_uaddlvv8qi_uu (__a);
+  return __builtin_aarch64_reduc_uplus_widen_scal_v8qi_uu (__a);
 }
 
 __extension__ extern __inline uint32_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlv_u16 (uint16x4_t __a)
 {
-  return __builtin_aarch64_uaddlvv4hi_uu (__a);
+  return __builtin_aarch64_reduc_uplus_widen_scal_v4hi_uu (__a);
 }
 
 __extension__ extern __inline int16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlvq_s8 (int8x16_t __a)
 {
-  return __builtin_aarch64_saddlvv16qi (__a);
+  return __builtin_aarch64_reduc_splus_widen_scal_v16qi (__a);
 }
 
 __extension__ extern __inline int32_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlvq_s16 (int16x8_t __a)
 {
-  return __builtin_aarch64_saddlvv8hi (__a);
+  return __builtin_aarch64_reduc_splus_widen_scal_v8hi (__a);
 }
 
 __extension__ extern __inline int64_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlvq_s32 (int32x4_t __a)
 {
-  return __builtin_aarch64_saddlvv4si (__a);
+  return __builtin_aarch64_reduc_splus_widen_scal_v4si (__a);
 }
 
 __extension__ extern __inline uint16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlvq_u8 (uint8x16_t __a)
 {
-  return __builtin_aarch64_uaddlvv16qi_uu (__a);
+  return __builtin_aarch64_reduc_uplus_widen_scal_v16qi_uu (__a);
 }
 
 __extension__ extern __inline uint32_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlvq_u16 (uint16x8_t __a)
 {
-  return __builtin_aarch64_uaddlvv8hi_uu (__a);
+  return __builtin_aarch64_reduc_uplus_widen_scal_v8hi_uu (__a);
 }
 
 __extension__ extern __inline uint64_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlvq_u32 (uint32x4_t __a)
 {
-  return __builtin_aarch64_uaddlvv4si_uu (__a);
+  return __builtin_aarch64_reduc_uplus_widen_scal_v4si_uu (__a);
 }
 
 __extension__ extern __inline float32x2_t
@@ -6461,14 +6461,14 @@ __extension__ extern __inline int64_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlv_s32 (int32x2_t __a)
 {
-  return __builtin_aarch64_saddlvv2si (__a);
+  return __builtin_aarch64_reduc_splus_widen_scal_v2si (__a);
 }
 
 __extension__ extern __inline uint64_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vaddlv_u32 (uint32x2_t __a)
 {
-  return __builtin_aarch64_uaddlvv2si_uu (__a);
+  return __builtin_aarch64_reduc_uplus_widen_scal_v2si_uu (__a);
 }
 
 __extension__ extern __inline int16x4_t

From patchwork Mon Oct 31 11:58:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 13244
Date: Mon, 31 Oct 2022 11:58:50 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable.
Message-ID: Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO4P123CA0489.GBRP123.PROD.OUTLOOK.COM (2603:10a6:600:1ab::8) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|AS2PR08MB8717:EE_|DBAEUR03FT023:EE_|PA4PR08MB7385:EE_ X-MS-Office365-Filtering-Correlation-Id: 73d59646-5faa-4bc8-2867-08dabb374d90 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: gPgn4EpSlKRuG+Em7xxvjqpDQ7gDGplOxqy3tGFed9MQSOFL9aZkbwWz4d+9j+EdWhhUJ3BZm8puHbIUORi6VYheHCBkB0acGQdWhznJCeX/ltthor2Jl3IA6dn3Gd3K/OpJkBDp8VYFFtmnTizQU+2Wi5RDolPZnxp8JeBXG/k/RD2AL6r/rm4BE+3Z8IiSjj2SOXJpXRcYQPtcPD390Ksru42ZrYXCvJivpj1VdmGSCARvkGCmuJfyUHc1qzEpt3APuMLYTbdeB0RomZoiVASUr2A/a32dMpu/Rl21WiNSCBW9h+JfCDRMnc4tRoYiNEdhyfejIsjAOPmzDDJZpsP90RcX5IPyH0aDP9V8BszW0nFzwzzjaTVyjDz/BAgcGw4y0oQHaleYiFr0zZw7UPLt+9arrtuNb1lZEy2/5t77YNjq9eGyC4TzpMiMjJWdErvBs5VfZo9RJQUprh4a74R/Iqq6DHqYT/BuN0WcMhw8EEcUGauXcf7g0YWJLwa1xwhVOH/IR24d6LjDfv0KMEkeLq/PapwKON4E/U/g1ZRoz4/Aj0QsouYtIWST31frtxAOROyyP2VV2qktj0eV8M3aQcNnWJI26jNKaIv5GKPgAlhnvtNz9R8ckB7ZfEV9DGbDXL703ZLP3VJp4NGRMZgcJCXB41hxGUkQ7MoV5MyiypxY0f+jMeD00cnFOan+it/SGoOJdRbyL4pQUIi9svPo9IbUdGVPFWixXI1LigNaqeEr4pTdljbTRxl74BmbcUD5TKEAttHqykVxnTVcOPFG8tUFrWL0SDOGb6CajyRrt9IcG39k8bhzeoAu98j1aO3af75kg+WMflqzYVk59zlNCoMeOjKWOoj5sB5Bm+4= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; 
SFS:(13230022)(4636009)(346002)(376002)(39860400002)(136003)(396003)(366004)(451199015)(478600001)(8676002)(38100700002)(84970400001)(6486002)(316002)(86362001)(6512007)(26005)(4743002)(66476007)(4326008)(41300700001)(186003)(83380400001)(235185007)(2906002)(33964004)(66946007)(2616005)(8936002)(30864003)(6916009)(66556008)(36756003)(5660300002)(44832011)(44144004)(6506007)(67856001)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR08MB8717 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT023.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: e9b592e1-fc7a-4b6a-1cca-08dabb3747bc X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: yS9pZjM7O72lD66W8R85RvOUmHNAkCYY2ta8L6rOdeR5zdUoLJ5WHfO8ltCudKrpEd31QjuC+RmTnzx2J5YerhAZ6lcOwVx3CHlC01MusWbZhc9skB0dgp9PDlaNxu2OpXTBc1l0PvkN5VhtgRPUQlpHwKe/CYP8p6oreS4Eqj0j9CnHB8aauiCJM33wM5eh9/3HAEwbo3qGNIAQdoAfjt7s6RU/aahDYMF03jCBx1JXjCYcr/hAUq5BnLbnHcEGzmX9blPKjavIp871gIpjTB5QyxHYfouhloeUgACdtYpeVVr7WxErJKsmubuEbPXG93/refFewi6kq+t7WPH3EMQZka1jN7CT2j0zqdAq48IXECDWPOvJrzcdmZSivRiZgzfv/0i4wCCXXc6p45PQSupG1oCTRA94Era17bYdKI3KoogfEchAlfWyQrYOcdOggWm3KEHIhxRsYHjoGVr1tEIIzfzDMzfZVPt3+Iev292BEI2UsIpmxxQ+lShzQIyUDOEVTtuqXV3i/gM72qjzbZxUSjV9F+/hYZFOXR4gvnydX9WXcrJ7u2EuUoilxouvidZcKbBP4R9DG//nNVtacveuKnYGKTqzc78wo0AEe6V37VVRGN1/U5B0FHaggSG8i6iQik6UWdddQbPvWIZ41Lv3ziL/QuNW/eRbWzoIOkBVg+wkEnppN18yW72/sfiwkzbsCgJEsRa9GKJPO4vQ3Zm8PAL67scFf/7DlNS/Su6Gynx4ZIpX0Vy9X+nLhM8gXLIZF6YPbqoyCnaXoviU9+nVpvO52rbKssv12PpqAw7xlO69RYeFdU0CY/e7ByX1yPBiqdvqKlvAAPSExR9JyqqmTa/sMb10/AWUep8mveWdKLTgKzqsp27TXdtUDniMgTthZ5K9CDU7nr6d2q/q6A== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; 
PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230022)(4636009)(396003)(39860400002)(376002)(136003)(346002)(451199015)(46966006)(40470700004)(36840700001)(47076005)(81166007)(36860700001)(40460700003)(356005)(82740400003)(86362001)(44832011)(2906002)(30864003)(5660300002)(235185007)(8676002)(6512007)(41300700001)(70206006)(2616005)(4326008)(82310400005)(6506007)(70586007)(33964004)(6486002)(186003)(336012)(478600001)(4743002)(8936002)(316002)(6916009)(44144004)(26005)(40480700001)(84970400001)(83380400001)(36756003)(2700100001)(67856001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 31 Oct 2022 11:59:02.5078 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 73d59646-5faa-4bc8-2867-08dabb374d90 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT023.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PA4PR08MB7385 X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SCC_10_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar 
Christina Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748204618108820505?= X-GMAIL-MSGID: =?utf-8?q?1748204618108820505?= Hi All, The backend has an existing V2HFmode that is used by pairwise operations. This mode was however never made fully functional. Amongst other things it was never declared as a vector type which made it unusable from the mid-end. It's also lacking an implementation for load/stores so reload ICEs if this mode is every used. This finishes the implementation by providing the above. Note that I have created a new iterator VHSDF_P instead of extending VHSDF because the previous iterator is used in far more things than just load/stores. It's also used for instance in intrinsics and extending this would force me to provide support for mangling the type while we never expose it through intrinsics. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): New. (mov, movmisalign, aarch64_dup_lane, aarch64_store_lane0, aarch64_simd_vec_set, @aarch64_simd_vec_copy_lane, vec_set, reduc__scal_, reduc__scal_, aarch64_reduc__internal, aarch64_get_lane, vec_init, vec_extract): Support V2HF. * config/aarch64/aarch64.cc (aarch64_classify_vector_mode): Add E_V2HFmode. * config/aarch64/iterators.md (VHSDF_P): New. (V2F, VALL_F16_FULL, nunits, Vtype, Vmtype, Vetype, stype, VEL, Vel, q, vp): Add V2HF. * config/arm/types.md (neon_fp_reduc_add_h): New. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/slp_1.c: Update testcase. 
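For orientation, the shape of source code that a two-element half-precision mode describes can be sketched as below. The function follows the `(TYPE *restrict a, TYPE b, TYPE c, int n)` signature visible in the slp_1.c hunk of this patch. The loop body, the names `half_t` and `vec_slp_pairs`, and the `USE_FLOAT16` switch are assumptions made for illustration and are not part of the patch. Whether the compiler actually selects V2HF for such a loop depends on the target and options and is not claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: define USE_FLOAT16 only on a toolchain that supports
   _Float16 (for example an AArch64 GCC).  Otherwise fall back to float
   so the sketch builds anywhere.  */
#ifdef USE_FLOAT16
typedef _Float16 half_t;
#else
typedef float half_t;
#endif

/* Hypothetical helper: add B to every even element and C to every odd
   element of A, which holds 2 * N elements.  Each iteration updates
   one adjacent pair, i.e. a two-lane group of half_t values.  */
void
vec_slp_pairs (half_t *restrict a, half_t b, half_t c, int n)
{
  assert (a != NULL && n >= 0);
  for (int i = 0; i < n; ++i)
    {
      a[i * 2] += b;
      a[i * 2 + 1] += c;
    }
}
```

Calling `vec_slp_pairs (a, 1, 2, 4)` on an eight-element array adds 1 to elements 0, 2, 4, 6 and 2 to elements 1, 3, 5, 7.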
--- inline copy of patch -- diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 25aed74f8cf939562ed65a578fe32ca76605b58a..93a2888f567460ad10ec050ea7d4f701df4729d1 100644 --- diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 25aed74f8cf939562ed65a578fe32ca76605b58a..93a2888f567460ad10ec050ea7d4f701df4729d1 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -19,10 +19,10 @@ ;; . (define_expand "mov" - [(set (match_operand:VALL_F16 0 "nonimmediate_operand") - (match_operand:VALL_F16 1 "general_operand"))] + [(set (match_operand:VALL_F16_FULL 0 "nonimmediate_operand") + (match_operand:VALL_F16_FULL 1 "general_operand"))] "TARGET_SIMD" - " +{ /* Force the operand into a register if it is not an immediate whose use can be replaced with xzr. If the mode is 16 bytes wide, then we will be doing @@ -46,12 +46,11 @@ (define_expand "mov" aarch64_expand_vector_init (operands[0], operands[1]); DONE; } - " -) +}) (define_expand "movmisalign" - [(set (match_operand:VALL_F16 0 "nonimmediate_operand") - (match_operand:VALL_F16 1 "general_operand"))] + [(set (match_operand:VALL_F16_FULL 0 "nonimmediate_operand") + (match_operand:VALL_F16_FULL 1 "general_operand"))] "TARGET_SIMD && !STRICT_ALIGNMENT" { /* This pattern is not permitted to fail during expansion: if both arguments @@ -85,10 +84,10 @@ (define_insn "aarch64_simd_dup" ) (define_insn "aarch64_dup_lane" - [(set (match_operand:VALL_F16 0 "register_operand" "=w") - (vec_duplicate:VALL_F16 + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w") + (vec_duplicate:VALL_F16_FULL (vec_select: - (match_operand:VALL_F16 1 "register_operand" "w") + (match_operand:VALL_F16_FULL 1 "register_operand" "w") (parallel [(match_operand:SI 2 "immediate_operand" "i")]) )))] "TARGET_SIMD" @@ -142,6 +141,29 @@ (define_insn "*aarch64_simd_mov" mov_reg, neon_move")] ) +(define_insn "*aarch64_simd_movv2hf" + [(set 
(match_operand:V2HF 0 "nonimmediate_operand" + "=w, m, m, w, ?r, ?w, ?r, w, w") + (match_operand:V2HF 1 "general_operand" + "m, Dz, w, w, w, r, r, Dz, Dn"))] + "TARGET_SIMD_F16INST + && (register_operand (operands[0], V2HFmode) + || aarch64_simd_reg_or_zero (operands[1], V2HFmode))" + "@ + ldr\\t%s0, %1 + str\\twzr, %0 + str\\t%s1, %0 + mov\\t%0.2s[0], %1.2s[0] + umov\\t%w0, %1.s[0] + fmov\\t%s0, %1 + mov\\t%0, %1 + movi\\t%d0, 0 + * return aarch64_output_simd_mov_immediate (operands[1], 32);" + [(set_attr "type" "neon_load1_1reg, store_8, neon_store1_1reg,\ + neon_logic, neon_to_gp, f_mcr,\ + mov_reg, neon_move, neon_move")] +) + (define_insn "*aarch64_simd_mov" [(set (match_operand:VQMOV 0 "nonimmediate_operand" "=w, Umn, m, w, ?r, ?w, ?r, w") @@ -182,7 +204,7 @@ (define_insn "*aarch64_simd_mov" (define_insn "aarch64_store_lane0" [(set (match_operand: 0 "memory_operand" "=m") - (vec_select: (match_operand:VALL_F16 1 "register_operand" "w") + (vec_select: (match_operand:VALL_F16_FULL 1 "register_operand" "w") (parallel [(match_operand 2 "const_int_operand" "n")])))] "TARGET_SIMD && ENDIAN_LANE_N (, INTVAL (operands[2])) == 0" @@ -1035,11 +1057,11 @@ (define_insn "one_cmpl2" ) (define_insn "aarch64_simd_vec_set" - [(set (match_operand:VALL_F16 0 "register_operand" "=w,w,w") - (vec_merge:VALL_F16 - (vec_duplicate:VALL_F16 + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w,w,w") + (vec_merge:VALL_F16_FULL + (vec_duplicate:VALL_F16_FULL (match_operand: 1 "aarch64_simd_nonimmediate_operand" "w,?r,Utv")) - (match_operand:VALL_F16 3 "register_operand" "0,0,0") + (match_operand:VALL_F16_FULL 3 "register_operand" "0,0,0") (match_operand:SI 2 "immediate_operand" "i,i,i")))] "TARGET_SIMD" { @@ -1061,14 +1083,14 @@ (define_insn "aarch64_simd_vec_set" ) (define_insn "@aarch64_simd_vec_copy_lane" - [(set (match_operand:VALL_F16 0 "register_operand" "=w") - (vec_merge:VALL_F16 - (vec_duplicate:VALL_F16 + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w") + 
(vec_merge:VALL_F16_FULL + (vec_duplicate:VALL_F16_FULL (vec_select: - (match_operand:VALL_F16 3 "register_operand" "w") + (match_operand:VALL_F16_FULL 3 "register_operand" "w") (parallel [(match_operand:SI 4 "immediate_operand" "i")]))) - (match_operand:VALL_F16 1 "register_operand" "0") + (match_operand:VALL_F16_FULL 1 "register_operand" "0") (match_operand:SI 2 "immediate_operand" "i")))] "TARGET_SIMD" { @@ -1376,7 +1398,7 @@ (define_insn "vec_shr_" ) (define_expand "vec_set" - [(match_operand:VALL_F16 0 "register_operand") + [(match_operand:VALL_F16_FULL 0 "register_operand") (match_operand: 1 "aarch64_simd_nonimmediate_operand") (match_operand:SI 2 "immediate_operand")] "TARGET_SIMD" @@ -3503,7 +3525,7 @@ (define_insn "popcount2" ;; gimple_fold'd to the IFN_REDUC_(MAX|MIN) function. (This is FP smax/smin). (define_expand "reduc__scal_" [(match_operand: 0 "register_operand") - (unspec: [(match_operand:VHSDF 1 "register_operand")] + (unspec: [(match_operand:VHSDF_P 1 "register_operand")] FMAXMINV)] "TARGET_SIMD" { @@ -3518,7 +3540,7 @@ (define_expand "reduc__scal_" (define_expand "reduc__scal_" [(match_operand: 0 "register_operand") - (unspec: [(match_operand:VHSDF 1 "register_operand")] + (unspec: [(match_operand:VHSDF_P 1 "register_operand")] FMAXMINNMV)] "TARGET_SIMD" { @@ -3562,8 +3584,8 @@ (define_insn "aarch64_reduc__internalv2si" ) (define_insn "aarch64_reduc__internal" - [(set (match_operand:VHSDF 0 "register_operand" "=w") - (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")] + [(set (match_operand:VHSDF_P 0 "register_operand" "=w") + (unspec:VHSDF_P [(match_operand:VHSDF_P 1 "register_operand" "w")] FMAXMINV))] "TARGET_SIMD" "\\t%0, %1." 
@@ -4208,7 +4230,7 @@ (define_insn "*aarch64_get_lane_zero_extend" (define_insn_and_split "aarch64_get_lane" [(set (match_operand: 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv") (vec_select: - (match_operand:VALL_F16 1 "register_operand" "w, w, w") + (match_operand:VALL_F16_FULL 1 "register_operand" "w, w, w") (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))] "TARGET_SIMD" { @@ -7989,7 +8011,7 @@ (define_expand "aarch64_st1" ;; Standard pattern name vec_init. (define_expand "vec_init" - [(match_operand:VALL_F16 0 "register_operand") + [(match_operand:VALL_F16_FULL 0 "register_operand") (match_operand 1 "" "")] "TARGET_SIMD" { @@ -8068,7 +8090,7 @@ (define_insn "aarch64_urecpe" (define_expand "vec_extract" [(match_operand: 0 "aarch64_simd_nonimmediate_operand") - (match_operand:VALL_F16 1 "register_operand") + (match_operand:VALL_F16_FULL 1 "register_operand") (match_operand:SI 2 "immediate_operand")] "TARGET_SIMD" { diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index f05bac713e88ea8c7feaa2367d55bd523ca66f57..1e08f8453688210afe1566092b19b59c9bdd0c97 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -3566,6 +3566,7 @@ aarch64_classify_vector_mode (machine_mode mode) case E_V8BFmode: case E_V4SFmode: case E_V2DFmode: + case E_V2HFmode: return TARGET_SIMD ? VEC_ADVSIMD : 0; default: diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 37d8161a33b1c399d80be82afa67613a087389d4..1df09f7fe2eb35aed96113476541e0faa5393551 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -160,6 +160,10 @@ (define_mode_iterator VDQF [V2SF V4SF V2DF]) (define_mode_iterator VHSDF [(V4HF "TARGET_SIMD_F16INST") (V8HF "TARGET_SIMD_F16INST") V2SF V4SF V2DF]) +;; Advanced SIMD Float modes suitable for pairwise operations. 
+(define_mode_iterator VHSDF_P [(V4HF "TARGET_SIMD_F16INST") + (V8HF "TARGET_SIMD_F16INST") + V2SF V4SF V2DF (V2HF "TARGET_SIMD_F16INST")]) ;; Advanced SIMD Float modes, and DF. (define_mode_iterator VDQF_DF [V2SF V4SF V2DF DF]) @@ -188,15 +192,23 @@ (define_mode_iterator VDQF_COND [V2SF V2SI V4SF V4SI V2DF V2DI]) (define_mode_iterator VALLF [V2SF V4SF V2DF SF DF]) ;; Advanced SIMD Float modes with 2 elements. -(define_mode_iterator V2F [V2SF V2DF]) +(define_mode_iterator V2F [V2SF V2DF V2HF]) ;; All Advanced SIMD modes on which we support any arithmetic operations. (define_mode_iterator VALL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF V4SF V2DF]) -;; All Advanced SIMD modes suitable for moving, loading, and storing. +;; All Advanced SIMD modes suitable for moving, loading, and storing +;; except V2HF. (define_mode_iterator VALL_F16 [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V4HF V8HF V4BF V8BF V2SF V4SF V2DF]) +;; All Advanced SIMD modes suitable for moving, loading, and storing +;; including V2HF +(define_mode_iterator VALL_F16_FULL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI + V4HF V8HF V4BF V8BF V2SF V4SF V2DF + (V2HF "TARGET_SIMD_F16INST")]) + + ;; The VALL_F16 modes except the 128-bit 2-element ones. (define_mode_iterator VALL_F16_NO_V2Q [V8QI V16QI V4HI V8HI V2SI V4SI V4HF V8HF V2SF V4SF]) @@ -1076,7 +1088,7 @@ (define_mode_attr nunits [(V8QI "8") (V16QI "16") (V2SF "2") (V4SF "4") (V1DF "1") (V2DF "2") (DI "1") (DF "1") - (V8DI "8")]) + (V8DI "8") (V2HF "2")]) ;; Map a mode to the number of bits in it, if the size of the mode ;; is constant. @@ -1090,6 +1102,7 @@ (define_mode_attr s [(HF "h") (SF "s") (DF "d") (SI "s") (DI "d")]) ;; Give the length suffix letter for a sign- or zero-extension. 
(define_mode_attr size [(QI "b") (HI "h") (SI "w")]) +(define_mode_attr sizel [(QI "b") (HI "h") (SI "")]) ;; Give the number of bits in the mode (define_mode_attr sizen [(QI "8") (HI "16") (SI "32") (DI "64")]) @@ -1134,8 +1147,9 @@ (define_mode_attr Vtype [(V8QI "8b") (V16QI "16b") (V2SI "2s") (V4SI "4s") (DI "1d") (DF "1d") (V2DI "2d") (V2SF "2s") - (V4SF "4s") (V2DF "2d") - (V4HF "4h") (V8HF "8h") + (V2HF "2h") (V4SF "4s") + (V2DF "2d") (V4HF "4h") + (V8HF "8h") (V2x8QI "8b") (V2x4HI "4h") (V2x2SI "2s") (V2x1DI "1d") (V2x4HF "4h") (V2x2SF "2s") @@ -1175,9 +1189,10 @@ (define_mode_attr Vmtype [(V8QI ".8b") (V16QI ".16b") (V4HI ".4h") (V8HI ".8h") (V2SI ".2s") (V4SI ".4s") (V2DI ".2d") (V4HF ".4h") - (V8HF ".8h") (V4BF ".4h") - (V8BF ".8h") (V2SF ".2s") - (V4SF ".4s") (V2DF ".2d") + (V8HF ".8h") (V2HF ".2h") + (V4BF ".4h") (V8BF ".8h") + (V2SF ".2s") (V4SF ".4s") + (V2DF ".2d") (DI "") (SI "") (HI "") (QI "") (TI "") (HF "") @@ -1193,7 +1208,7 @@ (define_mode_attr Vmntype [(V8HI ".8b") (V4SI ".4h") (define_mode_attr Vetype [(V8QI "b") (V16QI "b") (V4HI "h") (V8HI "h") (V2SI "s") (V4SI "s") - (V2DI "d") + (V2DI "d") (V2HF "h") (V4HF "h") (V8HF "h") (V2SF "s") (V4SF "s") (V2DF "d") @@ -1285,7 +1300,7 @@ (define_mode_attr Vcwtype [(VNx16QI "b") (VNx8QI "h") (VNx4QI "w") (VNx2QI "d") ;; more accurately. 
(define_mode_attr stype [(V8QI "b") (V16QI "b") (V4HI "s") (V8HI "s") (V2SI "s") (V4SI "s") (V2DI "d") (V4HF "s") - (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d") + (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d") (V2HF "s") (HF "s") (SF "s") (DF "d") (QI "b") (HI "s") (SI "s") (DI "d")]) @@ -1360,8 +1375,8 @@ (define_mode_attr VEL [(V8QI "QI") (V16QI "QI") (V4HF "HF") (V8HF "HF") (V2SF "SF") (V4SF "SF") (DF "DF") (V2DF "DF") - (SI "SI") (HI "HI") - (QI "QI") + (SI "SI") (V2HF "HF") + (QI "QI") (HI "HI") (V4BF "BF") (V8BF "BF") (VNx16QI "QI") (VNx8QI "QI") (VNx4QI "QI") (VNx2QI "QI") (VNx8HI "HI") (VNx4HI "HI") (VNx2HI "HI") @@ -1381,7 +1396,7 @@ (define_mode_attr Vel [(V8QI "qi") (V16QI "qi") (V2SF "sf") (V4SF "sf") (V2DF "df") (DF "df") (SI "si") (HI "hi") - (QI "qi") + (QI "qi") (V2HF "hf") (V4BF "bf") (V8BF "bf") (VNx16QI "qi") (VNx8QI "qi") (VNx4QI "qi") (VNx2QI "qi") (VNx8HI "hi") (VNx4HI "hi") (VNx2HI "hi") @@ -1866,7 +1881,7 @@ (define_mode_attr q [(V8QI "") (V16QI "_q") (V4HF "") (V8HF "_q") (V4BF "") (V8BF "_q") (V2SF "") (V4SF "_q") - (V2DF "_q") + (V2HF "") (V2DF "_q") (QI "") (HI "") (SI "") (DI "") (HF "") (SF "") (DF "") (V2x8QI "") (V2x16QI "_q") (V2x4HI "") (V2x8HI "_q") @@ -1905,6 +1920,7 @@ (define_mode_attr vp [(V8QI "v") (V16QI "v") (V2SI "p") (V4SI "v") (V2DI "p") (V2DF "p") (V2SF "p") (V4SF "v") + (V2HF "p") (V4HF "v") (V8HF "v")]) (define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi") diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md index 7d0504bdd944e9c0d1b545b0b66a9a1adc808714..3cfbc7a93cca1bea4925853e51d0a147c5722247 100644 --- a/gcc/config/arm/types.md +++ b/gcc/config/arm/types.md @@ -483,6 +483,7 @@ (define_attr "autodetect_type" ; neon_fp_minmax_s_q ; neon_fp_minmax_d ; neon_fp_minmax_d_q +; neon_fp_reduc_add_h ; neon_fp_reduc_add_s ; neon_fp_reduc_add_s_q ; neon_fp_reduc_add_d @@ -1033,6 +1034,7 @@ (define_attr "type" neon_fp_minmax_d,\ neon_fp_minmax_d_q,\ \ + neon_fp_reduc_add_h,\ neon_fp_reduc_add_s,\ neon_fp_reduc_add_s_q,\ 
neon_fp_reduc_add_d,\ @@ -1257,8 +1259,8 @@ (define_attr "is_neon_type" "yes,no" neon_fp_compare_d, neon_fp_compare_d_q, neon_fp_minmax_s,\ neon_fp_minmax_s_q, neon_fp_minmax_d, neon_fp_minmax_d_q,\ neon_fp_neg_s, neon_fp_neg_s_q, neon_fp_neg_d, neon_fp_neg_d_q,\ - neon_fp_reduc_add_s, neon_fp_reduc_add_s_q, neon_fp_reduc_add_d,\ - neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s, + neon_fp_reduc_add_h, neon_fp_reduc_add_s, neon_fp_reduc_add_s_q,\ + neon_fp_reduc_add_d, neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s,\ neon_fp_reduc_minmax_s_q, neon_fp_reduc_minmax_d,\ neon_fp_reduc_minmax_d_q,\ neon_fp_cvt_narrow_s_q, neon_fp_cvt_narrow_d_q,\ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c index 07d71a63414b1066ea431e287286ad048515711a..8e35e0b574d49913b43c7d8d4f4ba75f127f42e9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c @@ -30,11 +30,9 @@ vec_slp_##TYPE (TYPE *restrict a, TYPE b, TYPE c, int n) \ TEST_ALL (VEC_PERM) /* We should use one DUP for each of the 8-, 16- and 32-bit types, - although we currently use LD1RW for _Float16. We should use two - DUPs for each of the three 64-bit types. */ + We should use two DUPs for each of the three 64-bit types. */ /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, [hw]} 2 } } */ -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 2 } } */ -/* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 1 } } */ +/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 3 } } */ /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, [dx]} 9 } } */ /* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */ /* { dg-final { scan-assembler-not {\tzip2\t} } } */ --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -19,10 +19,10 @@ ;; . 
(define_expand "mov" - [(set (match_operand:VALL_F16 0 "nonimmediate_operand") - (match_operand:VALL_F16 1 "general_operand"))] + [(set (match_operand:VALL_F16_FULL 0 "nonimmediate_operand") + (match_operand:VALL_F16_FULL 1 "general_operand"))] "TARGET_SIMD" - " +{ /* Force the operand into a register if it is not an immediate whose use can be replaced with xzr. If the mode is 16 bytes wide, then we will be doing @@ -46,12 +46,11 @@ (define_expand "mov" aarch64_expand_vector_init (operands[0], operands[1]); DONE; } - " -) +}) (define_expand "movmisalign" - [(set (match_operand:VALL_F16 0 "nonimmediate_operand") - (match_operand:VALL_F16 1 "general_operand"))] + [(set (match_operand:VALL_F16_FULL 0 "nonimmediate_operand") + (match_operand:VALL_F16_FULL 1 "general_operand"))] "TARGET_SIMD && !STRICT_ALIGNMENT" { /* This pattern is not permitted to fail during expansion: if both arguments @@ -85,10 +84,10 @@ (define_insn "aarch64_simd_dup" ) (define_insn "aarch64_dup_lane" - [(set (match_operand:VALL_F16 0 "register_operand" "=w") - (vec_duplicate:VALL_F16 + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w") + (vec_duplicate:VALL_F16_FULL (vec_select: - (match_operand:VALL_F16 1 "register_operand" "w") + (match_operand:VALL_F16_FULL 1 "register_operand" "w") (parallel [(match_operand:SI 2 "immediate_operand" "i")]) )))] "TARGET_SIMD" @@ -142,6 +141,29 @@ (define_insn "*aarch64_simd_mov" mov_reg, neon_move")] ) +(define_insn "*aarch64_simd_movv2hf" + [(set (match_operand:V2HF 0 "nonimmediate_operand" + "=w, m, m, w, ?r, ?w, ?r, w, w") + (match_operand:V2HF 1 "general_operand" + "m, Dz, w, w, w, r, r, Dz, Dn"))] + "TARGET_SIMD_F16INST + && (register_operand (operands[0], V2HFmode) + || aarch64_simd_reg_or_zero (operands[1], V2HFmode))" + "@ + ldr\\t%s0, %1 + str\\twzr, %0 + str\\t%s1, %0 + mov\\t%0.2s[0], %1.2s[0] + umov\\t%w0, %1.s[0] + fmov\\t%s0, %1 + mov\\t%0, %1 + movi\\t%d0, 0 + * return aarch64_output_simd_mov_immediate (operands[1], 32);" + [(set_attr 
"type" "neon_load1_1reg, store_8, neon_store1_1reg,\ + neon_logic, neon_to_gp, f_mcr,\ + mov_reg, neon_move, neon_move")] +) + (define_insn "*aarch64_simd_mov" [(set (match_operand:VQMOV 0 "nonimmediate_operand" "=w, Umn, m, w, ?r, ?w, ?r, w") @@ -182,7 +204,7 @@ (define_insn "*aarch64_simd_mov" (define_insn "aarch64_store_lane0" [(set (match_operand: 0 "memory_operand" "=m") - (vec_select: (match_operand:VALL_F16 1 "register_operand" "w") + (vec_select: (match_operand:VALL_F16_FULL 1 "register_operand" "w") (parallel [(match_operand 2 "const_int_operand" "n")])))] "TARGET_SIMD && ENDIAN_LANE_N (, INTVAL (operands[2])) == 0" @@ -1035,11 +1057,11 @@ (define_insn "one_cmpl2" ) (define_insn "aarch64_simd_vec_set" - [(set (match_operand:VALL_F16 0 "register_operand" "=w,w,w") - (vec_merge:VALL_F16 - (vec_duplicate:VALL_F16 + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w,w,w") + (vec_merge:VALL_F16_FULL + (vec_duplicate:VALL_F16_FULL (match_operand: 1 "aarch64_simd_nonimmediate_operand" "w,?r,Utv")) - (match_operand:VALL_F16 3 "register_operand" "0,0,0") + (match_operand:VALL_F16_FULL 3 "register_operand" "0,0,0") (match_operand:SI 2 "immediate_operand" "i,i,i")))] "TARGET_SIMD" { @@ -1061,14 +1083,14 @@ (define_insn "aarch64_simd_vec_set" ) (define_insn "@aarch64_simd_vec_copy_lane" - [(set (match_operand:VALL_F16 0 "register_operand" "=w") - (vec_merge:VALL_F16 - (vec_duplicate:VALL_F16 + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w") + (vec_merge:VALL_F16_FULL + (vec_duplicate:VALL_F16_FULL (vec_select: - (match_operand:VALL_F16 3 "register_operand" "w") + (match_operand:VALL_F16_FULL 3 "register_operand" "w") (parallel [(match_operand:SI 4 "immediate_operand" "i")]))) - (match_operand:VALL_F16 1 "register_operand" "0") + (match_operand:VALL_F16_FULL 1 "register_operand" "0") (match_operand:SI 2 "immediate_operand" "i")))] "TARGET_SIMD" { @@ -1376,7 +1398,7 @@ (define_insn "vec_shr_" ) (define_expand "vec_set" - [(match_operand:VALL_F16 0 
"register_operand") + [(match_operand:VALL_F16_FULL 0 "register_operand") (match_operand: 1 "aarch64_simd_nonimmediate_operand") (match_operand:SI 2 "immediate_operand")] "TARGET_SIMD" @@ -3503,7 +3525,7 @@ (define_insn "popcount2" ;; gimple_fold'd to the IFN_REDUC_(MAX|MIN) function. (This is FP smax/smin). (define_expand "reduc__scal_" [(match_operand: 0 "register_operand") - (unspec: [(match_operand:VHSDF 1 "register_operand")] + (unspec: [(match_operand:VHSDF_P 1 "register_operand")] FMAXMINV)] "TARGET_SIMD" { @@ -3518,7 +3540,7 @@ (define_expand "reduc__scal_" (define_expand "reduc__scal_" [(match_operand: 0 "register_operand") - (unspec: [(match_operand:VHSDF 1 "register_operand")] + (unspec: [(match_operand:VHSDF_P 1 "register_operand")] FMAXMINNMV)] "TARGET_SIMD" { @@ -3562,8 +3584,8 @@ (define_insn "aarch64_reduc__internalv2si" ) (define_insn "aarch64_reduc__internal" - [(set (match_operand:VHSDF 0 "register_operand" "=w") - (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")] + [(set (match_operand:VHSDF_P 0 "register_operand" "=w") + (unspec:VHSDF_P [(match_operand:VHSDF_P 1 "register_operand" "w")] FMAXMINV))] "TARGET_SIMD" "\\t%0, %1." @@ -4208,7 +4230,7 @@ (define_insn "*aarch64_get_lane_zero_extend" (define_insn_and_split "aarch64_get_lane" [(set (match_operand: 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv") (vec_select: - (match_operand:VALL_F16 1 "register_operand" "w, w, w") + (match_operand:VALL_F16_FULL 1 "register_operand" "w, w, w") (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))] "TARGET_SIMD" { @@ -7989,7 +8011,7 @@ (define_expand "aarch64_st1" ;; Standard pattern name vec_init. 
(define_expand "vec_init" - [(match_operand:VALL_F16 0 "register_operand") + [(match_operand:VALL_F16_FULL 0 "register_operand") (match_operand 1 "" "")] "TARGET_SIMD" { @@ -8068,7 +8090,7 @@ (define_insn "aarch64_urecpe" (define_expand "vec_extract" [(match_operand: 0 "aarch64_simd_nonimmediate_operand") - (match_operand:VALL_F16 1 "register_operand") + (match_operand:VALL_F16_FULL 1 "register_operand") (match_operand:SI 2 "immediate_operand")] "TARGET_SIMD" { diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index f05bac713e88ea8c7feaa2367d55bd523ca66f57..1e08f8453688210afe1566092b19b59c9bdd0c97 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -3566,6 +3566,7 @@ aarch64_classify_vector_mode (machine_mode mode) case E_V8BFmode: case E_V4SFmode: case E_V2DFmode: + case E_V2HFmode: return TARGET_SIMD ? VEC_ADVSIMD : 0; default: diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 37d8161a33b1c399d80be82afa67613a087389d4..1df09f7fe2eb35aed96113476541e0faa5393551 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -160,6 +160,10 @@ (define_mode_iterator VDQF [V2SF V4SF V2DF]) (define_mode_iterator VHSDF [(V4HF "TARGET_SIMD_F16INST") (V8HF "TARGET_SIMD_F16INST") V2SF V4SF V2DF]) +;; Advanced SIMD Float modes suitable for pairwise operations. +(define_mode_iterator VHSDF_P [(V4HF "TARGET_SIMD_F16INST") + (V8HF "TARGET_SIMD_F16INST") + V2SF V4SF V2DF (V2HF "TARGET_SIMD_F16INST")]) ;; Advanced SIMD Float modes, and DF. (define_mode_iterator VDQF_DF [V2SF V4SF V2DF DF]) @@ -188,15 +192,23 @@ (define_mode_iterator VDQF_COND [V2SF V2SI V4SF V4SI V2DF V2DI]) (define_mode_iterator VALLF [V2SF V4SF V2DF SF DF]) ;; Advanced SIMD Float modes with 2 elements. -(define_mode_iterator V2F [V2SF V2DF]) +(define_mode_iterator V2F [V2SF V2DF V2HF]) ;; All Advanced SIMD modes on which we support any arithmetic operations. 
 (define_mode_iterator VALL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF V4SF V2DF])

-;; All Advanced SIMD modes suitable for moving, loading, and storing.
+;; All Advanced SIMD modes suitable for moving, loading, and storing
+;; except V2HF.
 (define_mode_iterator VALL_F16 [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
                                 V4HF V8HF V4BF V8BF V2SF V4SF V2DF])

+;; All Advanced SIMD modes suitable for moving, loading, and storing
+;; including V2HF
+(define_mode_iterator VALL_F16_FULL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
+                                     V4HF V8HF V4BF V8BF V2SF V4SF V2DF
+                                     (V2HF "TARGET_SIMD_F16INST")])
+
+
 ;; The VALL_F16 modes except the 128-bit 2-element ones.
 (define_mode_iterator VALL_F16_NO_V2Q [V8QI V16QI V4HI V8HI V2SI V4SI
                                        V4HF V8HF V2SF V4SF])
@@ -1076,7 +1088,7 @@ (define_mode_attr nunits [(V8QI "8") (V16QI "16")
                           (V2SF "2") (V4SF "4")
                           (V1DF "1") (V2DF "2")
                           (DI "1") (DF "1")
-                          (V8DI "8")])
+                          (V8DI "8") (V2HF "2")])

 ;; Map a mode to the number of bits in it, if the size of the mode
 ;; is constant.
@@ -1090,6 +1102,7 @@ (define_mode_attr s [(HF "h") (SF "s") (DF "d") (SI "s") (DI "d")])

 ;; Give the length suffix letter for a sign- or zero-extension.
 (define_mode_attr size [(QI "b") (HI "h") (SI "w")])
+(define_mode_attr sizel [(QI "b") (HI "h") (SI "")])

 ;; Give the number of bits in the mode
 (define_mode_attr sizen [(QI "8") (HI "16") (SI "32") (DI "64")])
@@ -1134,8 +1147,9 @@ (define_mode_attr Vtype [(V8QI "8b") (V16QI "16b")
                          (V2SI "2s") (V4SI "4s")
                          (DI "1d") (DF "1d")
                          (V2DI "2d") (V2SF "2s")
-                         (V4SF "4s") (V2DF "2d")
-                         (V4HF "4h") (V8HF "8h")
+                         (V2HF "2h") (V4SF "4s")
+                         (V2DF "2d") (V4HF "4h")
+                         (V8HF "8h")
                          (V2x8QI "8b") (V2x4HI "4h")
                          (V2x2SI "2s") (V2x1DI "1d")
                          (V2x4HF "4h") (V2x2SF "2s")
@@ -1175,9 +1189,10 @@ (define_mode_attr Vmtype [(V8QI ".8b") (V16QI ".16b")
                          (V4HI ".4h") (V8HI ".8h")
                          (V2SI ".2s") (V4SI ".4s")
                          (V2DI ".2d") (V4HF ".4h")
-                         (V8HF ".8h") (V4BF ".4h")
-                         (V8BF ".8h") (V2SF ".2s")
-                         (V4SF ".4s") (V2DF ".2d")
+                         (V8HF ".8h") (V2HF ".2h")
+                         (V4BF ".4h") (V8BF ".8h")
+                         (V2SF ".2s") (V4SF ".4s")
+                         (V2DF ".2d")
                          (DI "") (SI "")
                          (HI "") (QI "")
                          (TI "") (HF "")
@@ -1193,7 +1208,7 @@ (define_mode_attr Vmntype [(V8HI ".8b") (V4SI ".4h")
 (define_mode_attr Vetype [(V8QI "b") (V16QI "b")
                           (V4HI "h") (V8HI "h")
                           (V2SI "s") (V4SI "s")
-                          (V2DI "d")
+                          (V2DI "d") (V2HF "h")
                           (V4HF "h") (V8HF "h")
                           (V2SF "s") (V4SF "s")
                           (V2DF "d")
@@ -1285,7 +1300,7 @@ (define_mode_attr Vcwtype [(VNx16QI "b") (VNx8QI "h") (VNx4QI "w") (VNx2QI "d")
 ;; more accurately.
 (define_mode_attr stype [(V8QI "b") (V16QI "b") (V4HI "s") (V8HI "s")
                          (V2SI "s") (V4SI "s") (V2DI "d") (V4HF "s")
-                         (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d")
+                         (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d") (V2HF "s")
                          (HF "s") (SF "s") (DF "d") (QI "b") (HI "s")
                          (SI "s") (DI "d")])

@@ -1360,8 +1375,8 @@ (define_mode_attr VEL [(V8QI "QI") (V16QI "QI")
                        (V4HF "HF") (V8HF "HF")
                        (V2SF "SF") (V4SF "SF")
                        (DF "DF") (V2DF "DF")
-                       (SI "SI") (HI "HI")
-                       (QI "QI")
+                       (SI "SI") (V2HF "HF")
+                       (QI "QI") (HI "HI")
                        (V4BF "BF") (V8BF "BF")
                        (VNx16QI "QI") (VNx8QI "QI") (VNx4QI "QI") (VNx2QI "QI")
                        (VNx8HI "HI") (VNx4HI "HI") (VNx2HI "HI")
@@ -1381,7 +1396,7 @@ (define_mode_attr Vel [(V8QI "qi") (V16QI "qi")
                        (V2SF "sf") (V4SF "sf")
                        (V2DF "df") (DF "df")
                        (SI "si") (HI "hi")
-                       (QI "qi")
+                       (QI "qi") (V2HF "hf")
                        (V4BF "bf") (V8BF "bf")
                        (VNx16QI "qi") (VNx8QI "qi") (VNx4QI "qi") (VNx2QI "qi")
                        (VNx8HI "hi") (VNx4HI "hi") (VNx2HI "hi")
@@ -1866,7 +1881,7 @@ (define_mode_attr q [(V8QI "") (V16QI "_q")
                      (V4HF "") (V8HF "_q")
                      (V4BF "") (V8BF "_q")
                      (V2SF "") (V4SF "_q")
-                     (V2DF "_q")
+                     (V2HF "") (V2DF "_q")
                      (QI "") (HI "") (SI "") (DI "") (HF "") (SF "") (DF "")
                      (V2x8QI "") (V2x16QI "_q")
                      (V2x4HI "") (V2x8HI "_q")
@@ -1905,6 +1920,7 @@ (define_mode_attr vp [(V8QI "v") (V16QI "v")
                       (V2SI "p") (V4SI "v")
                       (V2DI "p") (V2DF "p")
                       (V2SF "p") (V4SF "v")
+                      (V2HF "p")
                       (V4HF "v") (V8HF "v")])

 (define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi")
diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
index 7d0504bdd944e9c0d1b545b0b66a9a1adc808714..3cfbc7a93cca1bea4925853e51d0a147c5722247 100644
--- a/gcc/config/arm/types.md
+++ b/gcc/config/arm/types.md
@@ -483,6 +483,7 @@ (define_attr "autodetect_type"
 ; neon_fp_minmax_s_q
 ; neon_fp_minmax_d
 ; neon_fp_minmax_d_q
+; neon_fp_reduc_add_h
 ; neon_fp_reduc_add_s
 ; neon_fp_reduc_add_s_q
 ; neon_fp_reduc_add_d
@@ -1033,6 +1034,7 @@ (define_attr "type"
   neon_fp_minmax_d,\
   neon_fp_minmax_d_q,\
 \
+  neon_fp_reduc_add_h,\
   neon_fp_reduc_add_s,\
   neon_fp_reduc_add_s_q,\
   neon_fp_reduc_add_d,\
@@ -1257,8 +1259,8 @@ (define_attr "is_neon_type" "yes,no"
           neon_fp_compare_d, neon_fp_compare_d_q, neon_fp_minmax_s,\
           neon_fp_minmax_s_q, neon_fp_minmax_d, neon_fp_minmax_d_q,\
           neon_fp_neg_s, neon_fp_neg_s_q, neon_fp_neg_d, neon_fp_neg_d_q,\
-          neon_fp_reduc_add_s, neon_fp_reduc_add_s_q, neon_fp_reduc_add_d,\
-          neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s,
+          neon_fp_reduc_add_h, neon_fp_reduc_add_s, neon_fp_reduc_add_s_q,\
+          neon_fp_reduc_add_d, neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s,\
           neon_fp_reduc_minmax_s_q, neon_fp_reduc_minmax_d,\
           neon_fp_reduc_minmax_d_q,\
           neon_fp_cvt_narrow_s_q, neon_fp_cvt_narrow_d_q,\
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c
index 07d71a63414b1066ea431e287286ad048515711a..8e35e0b574d49913b43c7d8d4f4ba75f127f42e9 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c
@@ -30,11 +30,9 @@ vec_slp_##TYPE (TYPE *restrict a, TYPE b, TYPE c, int n) \
 TEST_ALL (VEC_PERM)

 /* We should use one DUP for each of the 8-, 16- and 32-bit types,
-   although we currently use LD1RW for _Float16.  We should use two
-   DUPs for each of the three 64-bit types.  */
+   We should use two DUPs for each of the three 64-bit types.  */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, [hw]} 2 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 2 } } */
-/* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 3 } } */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, [dx]} 9 } } */
 /* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
 /* { dg-final { scan-assembler-not {\tzip2\t} } } */

From patchwork Mon Oct 31 11:59:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 13243
Date: Mon, 31 Oct 2022 11:59:14 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 6/8]AArch64: Add peephole and scheduling logic for pairwise operations that appear late in RTL.
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Tamar Christina via Gcc-patches
From: Tamar Christina
Reply-To: Tamar Christina
Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com

Hi All,

This does what it says on the tin.  If pairwise operations only form in RTL,
because of a split, combine or any other RTL pass, still try to recognize them.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md: Add new peepholes.
	* config/aarch64/aarch64.cc (aarch_macro_fusion_pair_p): Schedule
	sequential PLUS operations next to each other to increase the chance of
	forming pairwise operations.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 93a2888f567460ad10ec050ea7d4f701df4729d1..20e9adbf7b9b484f9a19f0c62770930dc3941eb2 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3425,6 +3425,22 @@ (define_insn "aarch64_faddp"
   [(set_attr "type" "neon_fp_reduc_add_")]
 )

+(define_peephole2
+  [(set (match_operand: 0 "register_operand")
+        (vec_select:
+          (match_operand:VHSDF 1 "register_operand")
+          (parallel [(match_operand 2 "const_int_operand")])))
+   (set (match_operand: 3 "register_operand")
+        (plus:
+          (match_dup 0)
+          (match_operand: 5 "register_operand")))]
+  "TARGET_SIMD
+   && ENDIAN_LANE_N (, INTVAL (operands[2])) == 1
+   && REGNO (operands[5]) == REGNO (operands[1])
+   && peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 3) (unspec: [(match_dup 1)] UNSPEC_FADDV))]
+)
+
 (define_insn "reduc_plus_scal_"
  [(set (match_operand: 0 "register_operand" "=w")
        (unspec: [(match_operand:VDQV 1 "register_operand" "w")]
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index f3bd71c9f10868f9e6ab50d8e36ed3ee3d48ac22..4023b1729d92bf37f5a2fc8fc8cd3a5194532079 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25372,6 +25372,29 @@ aarch_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
         }
     }

+  /* Try to schedule vec_select and add together so the peephole works.  */
+  if (simple_sets_p && REG_P (SET_DEST (prev_set)) && REG_P (SET_DEST (curr_set))
+      && GET_CODE (SET_SRC (prev_set)) == VEC_SELECT && GET_CODE (SET_SRC (curr_set)) == PLUS)
+  {
+    /* We're trying to match:
+       prev (vec_select) == (set (reg r0)
+                                 (vec_select (reg r1) n)
+       curr (plus) == (set (reg r2)
+                           (plus (reg r0) (reg r1)))  */
+    rtx prev_src = SET_SRC (prev_set);
+    rtx curr_src = SET_SRC (curr_set);
+    rtx parallel = XEXP (prev_src, 1);
+    auto idx
+      = ENDIAN_LANE_N (GET_MODE_NUNITS (GET_MODE (XEXP (prev_src, 0))), 1);
+    if (GET_CODE (parallel) == PARALLEL
+        && XVECLEN (parallel, 0) == 1
+        && known_eq (INTVAL (XVECEXP (parallel, 0, 0)), idx)
+        && GET_MODE (SET_DEST (prev_set)) == GET_MODE (curr_src)
+        && GET_MODE_INNER (GET_MODE (XEXP (prev_src, 0)))
+           == GET_MODE (XEXP (curr_src, 1)))
+      return true;
+  }
+
   /* Fuse compare (CMP/CMN/TST/BICS) and conditional branch.  */
   if (aarch64_fusion_enabled_p (AARCH64_FUSE_CMP_BRANCH)
       && prev_set && curr_set && any_condjump_p (curr)

From patchwork Mon Oct 31 11:59:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 13245
Date: Mon, 31 Oct 2022 11:59:36 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 7/8]AArch64: Consolidate zero and sign extension patterns and add missing ones.
PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230022)(4636009)(136003)(39860400002)(376002)(346002)(396003)(451199015)(40470700004)(36840700001)(46966006)(6916009)(4326008)(41300700001)(8676002)(70206006)(70586007)(8936002)(235185007)(84970400001)(30864003)(44832011)(5660300002)(6666004)(478600001)(2906002)(356005)(81166007)(6486002)(336012)(6512007)(44144004)(2616005)(26005)(36756003)(4743002)(6506007)(186003)(33964004)(47076005)(83380400001)(316002)(82310400005)(36860700001)(86362001)(40480700001)(82740400003)(40460700003)(2700100001)(67856001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 31 Oct 2022 11:59:52.5721 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: b4f64721-9547-4f9d-f816-08dabb376b64 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT059.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS8PR08MB6408 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_NONE, KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar Christina Cc: 
Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748204697288274771?= X-GMAIL-MSGID: =?utf-8?q?1748204697288274771?=

Hi All,

The target has various zero and sign extension patterns. These, however, live in various locations around the MD file, and almost all of them are split differently. Due to the various patterns we also ended up missing valid extensions. For instance, smov is almost never generated.

This change tries to make this more manageable by consolidating the patterns as much as possible and in doing so fix the missing alternatives.

There were also some duplicate patterns. Note that the zero_extend<*_ONLY:mode>2 patterns are nearly identical; however, QImode lacks an alternative that the others have, so I have left them as 3 different patterns next to each other.

In a lot of cases the wrong iterator was used, leaving out cases that should exist.

I've also changed the masks used for zero extensions to hex instead of decimal, as it's clearer what they do that way and it aligns better with the output of other compilers.

This leaves the bulk of the extensions in just 3 patterns.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (*aarch64_get_lane_zero_extend): Changed to ...
	(*aarch64_get_lane_zero_extend): ... This.
	(*aarch64_get_lane_extenddi): New.
	* config/aarch64/aarch64.md (sidi2, *extendsidi2_aarch64, qihi2, *extendqihi2_aarch64, *zero_extendsidi2_aarch64): Remove duplicate patterns.
	(2, *extend2_aarch64): Remove, consolidate into ...
	(extend2): ... This.
	(*zero_extendqihi2_aarch64, *zero_extend2_aarch64): Remove, consolidate into ...
	(zero_extend2, zero_extend2, zero_extend2):
	(*ands_compare0): Renamed to ...
	(*ands_compare0): ... This.
	* config/aarch64/iterators.md (HI_ONLY, QI_ONLY): New.
	(short_mask): Use hex rather than dec and add SI.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ands_3.c: Update codegen.
	* gcc.target/aarch64/sve/slp_1.c: Likewise.
	* gcc.target/aarch64/tst_5.c: Likewise.
	* gcc.target/aarch64/tst_6.c: Likewise.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 8a84a8560e982b8155b18541f5504801b3330124..d0b37c4dd48aeafd3d87c90dc3270e71af5a72b9 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -4237,19 +4237,34 @@ (define_insn "*aarch64_get_lane_extend" [(set_attr "type" "neon_to_gp")] ) -(define_insn "*aarch64_get_lane_zero_extend" +(define_insn "*aarch64_get_lane_extenddi" + [(set (match_operand:DI 0 "register_operand" "=r") + (sign_extend:DI + (vec_select: + (match_operand:VS 1 "register_operand" "w") + (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))] + "TARGET_SIMD" + { + operands[2] = aarch64_endian_lane_rtx (mode, + INTVAL (operands[2])); + return "smov\\t%x0, %1.[%2]"; + } + [(set_attr "type" "neon_to_gp")] +) + +(define_insn "*aarch64_get_lane_zero_extend" [(set (match_operand:GPI 0 "register_operand" "=r") (zero_extend:GPI - (vec_select: - (match_operand:VDQQH 1 "register_operand" "w") + (vec_select: + (match_operand:VDQV_L 1 "register_operand" "w") (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))] "TARGET_SIMD" { - operands[2] = aarch64_endian_lane_rtx (mode, + operands[2] = aarch64_endian_lane_rtx (mode, INTVAL (operands[2])); - return "umov\\t%w0, %1.[%2]"; + return "umov\\t%w0, %1.[%2]"; } - [(set_attr "type" "neon_to_gp")] + [(set_attr "type" "neon_to_gp")] ) ;; Lane extraction of a value, neither sign nor zero extension diff --git
a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 3ea16dbc2557c6a4f37104d44a49f77f768eb53d..09ae1118371f82ca63146fceb953eb9e820d05a4 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1911,22 +1911,6 @@ (define_insn "storewb_pair_" ;; Sign/Zero extension ;; ------------------------------------------------------------------- -(define_expand "sidi2" - [(set (match_operand:DI 0 "register_operand") - (ANY_EXTEND:DI (match_operand:SI 1 "nonimmediate_operand")))] - "" -) - -(define_insn "*extendsidi2_aarch64" - [(set (match_operand:DI 0 "register_operand" "=r,r") - (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m")))] - "" - "@ - sxtw\t%0, %w1 - ldrsw\t%0, %1" - [(set_attr "type" "extend,load_4")] -) - (define_insn "*load_pair_extendsidi2_aarch64" [(set (match_operand:DI 0 "register_operand" "=r") (sign_extend:DI (match_operand:SI 1 "aarch64_mem_pair_operand" "Ump"))) @@ -1940,21 +1924,6 @@ (define_insn "*load_pair_extendsidi2_aarch64" [(set_attr "type" "load_8")] ) -(define_insn "*zero_extendsidi2_aarch64" - [(set (match_operand:DI 0 "register_operand" "=r,r,w,w,r,w") - (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m,r,m,w,w")))] - "" - "@ - uxtw\t%0, %w1 - ldr\t%w0, %1 - fmov\t%s0, %w1 - ldr\t%s0, %1 - fmov\t%w0, %s1 - fmov\t%s0, %s1" - [(set_attr "type" "mov_reg,load_4,f_mcr,f_loads,f_mrc,fmov") - (set_attr "arch" "*,*,fp,fp,fp,fp")] -) - (define_insn "*load_pair_zero_extendsidi2_aarch64" [(set (match_operand:DI 0 "register_operand" "=r,w") (zero_extend:DI (match_operand:SI 1 "aarch64_mem_pair_operand" "Ump,Ump"))) @@ -1971,61 +1940,64 @@ (define_insn "*load_pair_zero_extendsidi2_aarch64" (set_attr "arch" "*,fp")] ) -(define_expand "2" - [(set (match_operand:GPI 0 "register_operand") - (ANY_EXTEND:GPI (match_operand:SHORT 1 "nonimmediate_operand")))] - "" -) - -(define_insn "*extend2_aarch64" - [(set (match_operand:GPI 0 "register_operand" "=r,r,r") - (sign_extend:GPI 
(match_operand:SHORT 1 "nonimmediate_operand" "r,m,w")))] +(define_insn "extend2" + [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,r") + (sign_extend:SD_HSDI + (match_operand:ALLX 1 "nonimmediate_operand" "r,m,w")))] "" "@ - sxt\t%0, %w1 - ldrs\t%0, %1 - smov\t%0, %1.[0]" + sxt\t%0, %w1 + ldrs\t%0, %1 + smov\t%0, %1.[0]" [(set_attr "type" "extend,load_4,neon_to_gp") (set_attr "arch" "*,*,fp")] ) -(define_insn "*zero_extend2_aarch64" - [(set (match_operand:GPI 0 "register_operand" "=r,r,w,r") - (zero_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" "r,m,m,w")))] +(define_insn "zero_extend2" + [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,w,r,w") + (zero_extend:SD_HSDI + (match_operand:SI_ONLY 1 "nonimmediate_operand" "r,m,r,m,w,w")))] "" "@ - and\t%0, %1, - ldr\t%w0, %1 - ldr\t%0, %1 - umov\t%w0, %1.[0]" - [(set_attr "type" "logic_imm,load_4,f_loads,neon_to_gp") - (set_attr "arch" "*,*,fp,fp")] -) - -(define_expand "qihi2" - [(set (match_operand:HI 0 "register_operand") - (ANY_EXTEND:HI (match_operand:QI 1 "nonimmediate_operand")))] - "" + uxt\t%0, %w1 + ldr\t%w0, %1 + fmov\t%0, %w1 + ldr\t%0, %1 + fmov\t%w0, %1 + fmov\t%0, %1" + [(set_attr "type" "mov_reg,load_4,f_mcr,f_loads,f_mrc,fmov") + (set_attr "arch" "*,*,fp,fp,fp,fp")] ) -(define_insn "*extendqihi2_aarch64" - [(set (match_operand:HI 0 "register_operand" "=r,r") - (sign_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))] +(define_insn "zero_extend2" + [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,w,r,w") + (zero_extend:SD_HSDI + (match_operand:HI_ONLY 1 "nonimmediate_operand" "r,m,r,m,w,w")))] "" "@ - sxtb\t%w0, %w1 - ldrsb\t%w0, %1" - [(set_attr "type" "extend,load_4")] + uxt\t%0, %w1 + ldr\t%w0, %1 + fmov\t%0, %w1 + ldr\t%0, %1 + umov\t%w0, %1.[0] + fmov\t%0, %1" + [(set_attr "type" "mov_reg,load_4,f_mcr,f_loads,f_mrc,fmov") + (set_attr "arch" "*,*,fp16,fp,fp,fp16")] ) -(define_insn "*zero_extendqihi2_aarch64" - [(set (match_operand:HI 0 
"register_operand" "=r,r") - (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))] +(define_insn "zero_extend2" + [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,r,w") + (zero_extend:SD_HSDI + (match_operand:QI_ONLY 1 "nonimmediate_operand" "r,m,m,w,w")))] "" "@ - and\t%w0, %w1, 255 - ldrb\t%w0, %1" - [(set_attr "type" "logic_imm,load_4")] + uxt\t%0, %w1 + ldr\t%w0, %1 + ldr\t%0, %1 + umov\t%w0, %1.[0] + dup\t%0, %1.[0]" + [(set_attr "type" "mov_reg,load_4,f_loads,f_mrc,fmov") + (set_attr "arch" "*,*,fp,fp,fp")] ) ;; ------------------------------------------------------------------- @@ -5029,15 +5001,15 @@ (define_insn "*and_compare0" [(set_attr "type" "alus_imm")] ) -(define_insn "*ands_compare0" +(define_insn "*ands_compare0" [(set (reg:CC_NZ CC_REGNUM) (compare:CC_NZ - (zero_extend:GPI (match_operand:SHORT 1 "register_operand" "r")) + (zero_extend:SD_HSDI (match_operand:ALLX 1 "register_operand" "r")) (const_int 0))) - (set (match_operand:GPI 0 "register_operand" "=r") - (zero_extend:GPI (match_dup 1)))] + (set (match_operand:SD_HSDI 0 "register_operand" "=r") + (zero_extend:SD_HSDI (match_dup 1)))] "" - "ands\\t%0, %1, " + "ands\\t%0, %1, " [(set_attr "type" "alus_imm")] ) diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 1df09f7fe2eb35aed96113476541e0faa5393551..e904407b2169e589b7007ff966b2d9347a6d0fd2 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -41,6 +41,8 @@ (define_mode_iterator SHORT [QI HI]) ;; Iterators for single modes, for "@" patterns. 
(define_mode_iterator SI_ONLY [SI]) (define_mode_iterator DI_ONLY [DI]) +(define_mode_iterator HI_ONLY [HI]) +(define_mode_iterator QI_ONLY [QI]) ;; Iterator for all integer modes (up to 64-bit) (define_mode_iterator ALLI [QI HI SI DI]) @@ -1033,7 +1035,7 @@ (define_mode_attr w2 [(HF "x") (SF "x") (DF "w")]) ;; For width of fp registers in fcvt instruction (define_mode_attr fpw [(DI "s") (SI "d")]) -(define_mode_attr short_mask [(HI "65535") (QI "255")]) +(define_mode_attr short_mask [(SI "0xffffffff") (HI "0xffff") (QI "0xff")]) ;; For constraints used in scalar immediate vector moves (define_mode_attr hq [(HI "h") (QI "q")]) diff --git a/gcc/testsuite/gcc.target/aarch64/ands_3.c b/gcc/testsuite/gcc.target/aarch64/ands_3.c index 42cb7f0f0bc86a4aceb09851c31eb2e888d93403..421aa5cea7a51ad810cc9c5653a149cb21bb871c 100644 --- a/gcc/testsuite/gcc.target/aarch64/ands_3.c +++ b/gcc/testsuite/gcc.target/aarch64/ands_3.c @@ -9,4 +9,4 @@ f9 (unsigned char x, int y) return x; } -/* { dg-final { scan-assembler "ands\t(x|w)\[0-9\]+,\[ \t\]*(x|w)\[0-9\]+,\[ \t\]*255" } } */ +/* { dg-final { scan-assembler "ands\t(x|w)\[0-9\]+,\[ \t\]*(x|w)\[0-9\]+,\[ \t\]*0xff" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c index 8e35e0b574d49913b43c7d8d4f4ba75f127f42e9..03288976b3397cdbe0e822f94f2a6448d9fa9a52 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c @@ -51,7 +51,6 @@ TEST_ALL (VEC_PERM) /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s} 6 } } */ /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d} 6 } } */ /* { dg-final { scan-assembler-not {\tldr} } } */ -/* { dg-final { scan-assembler-times {\tstr} 2 } } */ -/* { dg-final { scan-assembler-times {\tstr\th[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {\tins\tv[0-9]+\.h\[1\], v[0-9]+\.h\[0\]} 1 } } */ /* { dg-final { scan-assembler-not {\tuqdec} } } */ diff --git 
a/gcc/testsuite/gcc.target/aarch64/tst_5.c b/gcc/testsuite/gcc.target/aarch64/tst_5.c index 0de40a6c47a7d63c1b7a81aeba438a096c0041b8..19034cd74ed07ea4d670c25d9ab3d1cff805a483 100644 --- a/gcc/testsuite/gcc.target/aarch64/tst_5.c +++ b/gcc/testsuite/gcc.target/aarch64/tst_5.c @@ -4,7 +4,7 @@ int f255 (int x) { - if (x & 255) + if (x & 0xff) return 1; return x; } @@ -12,10 +12,10 @@ f255 (int x) int f65535 (int x) { - if (x & 65535) + if (x & 0xffff) return 1; return x; } -/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*255" } } */ -/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*65535" } } */ +/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*0xff" } } */ +/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*0xffff" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/tst_6.c b/gcc/testsuite/gcc.target/aarch64/tst_6.c index f15ec114c391fed79cc43b7740fde83fb3d4ea53..1c047cfae214b60e5bf003e6781a277202fcc588 100644 --- a/gcc/testsuite/gcc.target/aarch64/tst_6.c +++ b/gcc/testsuite/gcc.target/aarch64/tst_6.c @@ -7,4 +7,4 @@ foo (long x) return ((short) x != 0) ? 
x : 1; } -/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*65535" } } */ +/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*0xffff" } } */
From patchwork Mon Oct 31 12:00:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 13246 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp2270682wru; Mon, 31 Oct 2022 05:05:36 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6TlgaaNNvygwVuKIXBBcr7ZZVGdesMiCRQZm7lKFp+eythEJk3FIjtmDWLpA7Qw33iE6fc X-Received: by 2002:a17:906:9b93:b0:78d:eb36:1ce7 with SMTP id dd19-20020a1709069b9300b0078deb361ce7mr12947865ejc.621.1667217936080; Mon, 31 Oct 2022 05:05:36 -0700 (PDT) Received: from sourceware.org (server2.sourceware.org.
Date: Mon, 31 Oct 2022 12:00:22 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 8/8]AArch64: Have reload not choose to do add on the scalar side if both values exist on the SIMD side.
From: Tamar Christina
Reply-To: Tamar Christina
Cc: Richard.Earnshaw@arm.com, nd@arm.com,
richard.sandiford@arm.com, Marcus.Shawcroft@arm.com

Hi All,

Currently we often generate an r -> r add even if it means we need two
reloads to perform it, i.e. in the case that the values are on the SIMD side.

The pairwise operations expose these more now and so we get suboptimal codegen.

Normally I would have liked to use ^ or $ here, but while this works for the
simple examples, reload inexplicably falls apart on examples that should have
been trivial. It forces a move r -> w to use the w ADD, which is counter to
what ^ and $ should do.

However ! seems to fix all the regressions and still maintains the good codegen.

I have tried looking into whether it's our costings that are off, but I can't
see anything logical here. So I'd like to push this change instead, along with
tests that augment the other testcases that guard the r -> r variants.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64.md (*add<mode>3_aarch64): Add ! to the r -> r
	alternative.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/simd/scalar_addp.c: New test.
	* gcc.target/aarch64/simd/scalar_faddp.c: New test.
	* gcc.target/aarch64/simd/scalar_faddp2.c: New test.
	* gcc.target/aarch64/simd/scalar_fmaxp.c: New test.
	* gcc.target/aarch64/simd/scalar_fminp.c: New test.
	* gcc.target/aarch64/simd/scalar_maxp.c: New test.
	* gcc.target/aarch64/simd/scalar_minp.c: New test.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 09ae1118371f82ca63146fceb953eb9e820d05a4..c333fb1f72725992bb304c560f1245a242d5192d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2043,7 +2043,7 @@ (define_expand "add<mode>3"
 
 (define_insn "*add<mode>3_aarch64"
   [(set
-    (match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r,r,rk")
+    (match_operand:GPI 0 "register_operand" "=rk,!rk,w,rk,r,r,rk")
     (plus:GPI
      (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk,0,rk")
      (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Uaa,Uai,Uav")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c
new file mode 100644
index 0000000000000000000000000000000000000000..5b8d40f19884fc7b4e7decd80758bc36fa76d058
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c
@@ -0,0 +1,70 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -std=c99" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef unsigned long long v2udi __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (16)));
+typedef unsigned int v2usi __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	addp	d0, v0.2d
+**	fmov	x0, d0
+**	ret
+*/
+long long
+foo (v2di x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo1:
+**	saddlp	v0.1d, v0.2s
+**	fmov	x0, d0
+**	ret
+*/
+long long
+foo1 (v2si x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo2:
+**	uaddlp	v0.1d, v0.2s
+**	fmov	x0, d0
+**	ret
+*/
+unsigned long long
+foo2 (v2usi x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo3:
+**	uaddlp	v0.1d, v0.2s
+**	add	d0, d0, d1
+**	fmov	x0, d0
+**	ret
+*/
+unsigned long long
+foo3 (v2usi x, v2udi y)
+{
+  return (x[1] + x[0]) + y[0];
+}
+
+/*
+** foo4:
+**	saddlp	v0.1d, v0.2s
+**	add	d0, d0, d1
+**	fmov	x0, d0
+**	ret
+*/
+long long
+foo4 (v2si x, v2di y)
+{
+  return (x[1] + x[0]) + y[0];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff455e060fc833b2f63e89c467b91a76fbe31aff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c
@@ -0,0 +1,66 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_2a_fp16_scalar } */
+/* { dg-additional-options "-save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef double v2df __attribute__((vector_size (16)));
+typedef float v4sf __attribute__((vector_size (16)));
+typedef __fp16 v8hf __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	faddp	d0, v0.2d
+**	ret
+*/
+double
+foo (v2df x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo1:
+**	faddp	s0, v0.2s
+**	ret
+*/
+float
+foo1 (v4sf x)
+{
+  return x[0] + x[1];
+}
+
+/*
+** foo2:
+**	faddp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo2 (v8hf x)
+{
+  return x[0] + x[1];
+}
+
+/*
+** foo3:
+**	ext	v0.16b, v0.16b, v0.16b, #4
+**	faddp	s0, v0.2s
+**	ret
+*/
+float
+foo3 (v4sf x)
+{
+  return x[1] + x[2];
+}
+
+/*
+** foo4:
+**	dup	s0, v0.s\[3\]
+**	faddp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo4 (v8hf x)
+{
+  return x[6] + x[7];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c
new file mode 100644
index 0000000000000000000000000000000000000000..04412c3b45c51648e46ff20f730b1213e940391a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c
@@ -0,0 +1,14 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -w" } */
+
+typedef __m128i __attribute__((__vector_size__(2 * sizeof(long))));
+double a[];
+*b;
+fn1() {
+  __m128i c;
+  *(__m128i *)a = c;
+  *b = a[0] + a[1];
+}
+
+/* { dg-final { scan-assembler-times {faddp\td0, v0\.2d} 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c
new file mode 100644
index 0000000000000000000000000000000000000000..aa1d2bf17cd707b74d8f7c574506610ab4fd7299
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c
@@ -0,0 +1,56 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_2a_fp16_scalar } */
+/* { dg-additional-options "-save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef double v2df __attribute__((vector_size (16)));
+typedef float v4sf __attribute__((vector_size (16)));
+typedef __fp16 v8hf __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	fmaxnmp	d0, v0.2d
+**	ret
+*/
+double
+foo (v2df x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	fmaxnmp	s0, v0.2s
+**	ret
+*/
+float
+foo1 (v4sf x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	fmaxnmp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo2 (v8hf x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	fmaxnmp	s0, v0.2s
+**	fcvt	d0, s0
+**	fadd	d0, d0, d1
+**	ret
+*/
+double
+foo3 (v4sf x, v2df y)
+{
+  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c
new file mode 100644
index 0000000000000000000000000000000000000000..6136c5272069c4d86f09951cdff25f1494e839f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c
@@ -0,0 +1,55 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_2a_fp16_scalar } */
+/* { dg-additional-options "-save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef double v2df __attribute__((vector_size (16)));
+typedef float v4sf __attribute__((vector_size (16)));
+typedef __fp16 v8hf __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	fminnmp	d0, v0.2d
+**	ret
+*/
+double
+foo (v2df x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	fminnmp	s0, v0.2s
+**	ret
+*/
+float
+foo1 (v4sf x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	fminnmp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo2 (v8hf x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	fminnmp	s0, v0.2s
+**	fcvt	d0, s0
+**	fadd	d0, d0, d1
+**	ret
+*/
+double
+foo3 (v4sf x, v2df y)
+{
+  return (x[0] < x[1] ? x[0] : x[1]) + y[0];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c
new file mode 100644
index 0000000000000000000000000000000000000000..e219a13abc745b83dca58633fd2d812e276d6b2d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c
@@ -0,0 +1,74 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -std=c99" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef unsigned long long v2udi __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (16)));
+typedef unsigned int v2usi __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	umov	x0, v0.d\[1\]
+**	fmov	x1, d0
+**	cmp	x0, x1
+**	csel	x0, x0, x1, ge
+**	ret
+*/
+long long
+foo (v2di x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	smaxp	v0.2s, v0.2s, v0.2s
+**	smov	x0, v0.s\[0\]
+**	ret
+*/
+long long
+foo1 (v2si x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	umaxp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	ret
+*/
+unsigned long long
+foo2 (v2usi x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	umaxp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	fmov	x1, d1
+**	add	x0, x1, w0, uxtw
+**	ret
+*/
+unsigned long long
+foo3 (v2usi x, v2udi y)
+{
+  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
+}
+
+/*
+** foo4:
+**	smaxp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	fmov	x1, d1
+**	add	x0, x1, w0, sxtw
+**	ret
+*/
+long long
+foo4 (v2si x, v2di y)
+{
+  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c
new file mode 100644
index 0000000000000000000000000000000000000000..2a32fb4ea3edaa4c547a7a481c3ddca6b477430e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c
@@ -0,0 +1,74 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -std=c99" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef unsigned long long v2udi __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (16)));
+typedef unsigned int v2usi __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	umov	x0, v0.d\[1\]
+**	fmov	x1, d0
+**	cmp	x0, x1
+**	csel	x0, x0, x1, le
+**	ret
+*/
+long long
+foo (v2di x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	sminp	v0.2s, v0.2s, v0.2s
+**	smov	x0, v0.s\[0\]
+**	ret
+*/
+long long
+foo1 (v2si x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	uminp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	ret
+*/
+unsigned long long
+foo2 (v2usi x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	uminp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	fmov	x1, d1
+**	add	x0, x1, w0, uxtw
+**	ret
+*/
+unsigned long long
+foo3 (v2usi x, v2udi y)
+{
+  return (x[0] < x[1] ? x[0] : x[1]) + y[0];
+}
+
+/*
+** foo4:
+**	sminp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	fmov	x1, d1
+**	add	x0, x1, w0, sxtw
+**	ret
+*/
+long long
+foo4 (v2si x, v2di y)
+{
+  return (x[0] < x[1] ? x[0] : x[1]) + y[0];
+}