From patchwork Fri Sep 23 09:34:09 2022
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1410
Date: Fri, 23 Sep 2022 10:34:09 +0100
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 4/4]AArch64 sve2: rewrite pack + NARROWB + NARROWB to NARROWB + NARROWT
From: Tamar Christina
Reply-To: Tamar Christina
Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com
List-Id: Gcc-patches mailing list

Hi All,

This adds an RTL pattern for the case where two NARROWB instructions are
combined with a PACK.  The second NARROWB is then transformed into a NARROWT.

For the example:

void draw_bitmap1(uint8_t* restrict pixel, uint8_t level, int n)
{
  for (int i = 0; i < (n & -16); i+=1)
    pixel[i] += (pixel[i] * level) / 0xff;
}

we generate:

	addhnb	z6.b, z0.h, z4.h
	addhnb	z5.b, z1.h, z4.h
	addhnb	z0.b, z0.h, z6.h
	addhnt	z0.b, z1.h, z5.h
	add	z0.b, z0.b, z2.b

instead of:

	addhnb	z6.b, z1.h, z4.h
	addhnb	z5.b, z0.h, z4.h
	addhnb	z1.b, z1.h, z6.h
	addhnb	z0.b, z0.h, z5.h
	uzp1	z0.b, z0.b, z1.b
	add	z0.b, z0.b, z2.b

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-sve2.md (*aarch64_sve_pack_): New.
	* config/aarch64/iterators.md (binary_top): New.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-div-bitmask-4.c: New test.
	* gcc.target/aarch64/sve2/div-by-bitmask_2.c: New test.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index ab5dcc369481311e5bd68a1581265e1ce99b4b0f..0ee46c8b0d43467da4a6b98ad3c41e5d05d8cf38 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -1600,6 +1600,25 @@ (define_insn "@aarch64_sve_"
   "\t%0., %2., %3."
 )
 
+(define_insn_and_split "*aarch64_sve_pack_"
+  [(set (match_operand: 0 "register_operand" "=w")
+        (unspec:
+          [(match_operand:SVE_FULL_HSDI 1 "register_operand" "w")
+           (subreg:SVE_FULL_HSDI (unspec:
+             [(match_operand:SVE_FULL_HSDI 2 "register_operand" "w")
+              (match_operand:SVE_FULL_HSDI 3 "register_operand" "w")]
+             SVE2_INT_BINARY_NARROWB) 0)]
+          UNSPEC_PACK))]
+  "TARGET_SVE2"
+  "#"
+  "&& true"
+  [(const_int 0)]
+{
+  rtx tmp = lowpart_subreg (mode, operands[1], mode);
+  emit_insn (gen_aarch64_sve (, mode,
+                              operands[0], tmp, operands[2], operands[3]));
+})
+
 ;; -------------------------------------------------------------------------
 ;; ---- [INT] Narrowing right shifts
 ;; -------------------------------------------------------------------------
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 0dd9dc66f7ccd78acacb759662d0cd561cd5b4ef..37d8161a33b1c399d80be82afa67613a087389d4 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -3589,6 +3589,11 @@ (define_int_attr brk_op [(UNSPEC_BRKA "a") (UNSPEC_BRKB "b")
 (define_int_attr sve_pred_op [(UNSPEC_PFIRST "pfirst")
                               (UNSPEC_PNEXT "pnext")])
 
+(define_int_attr binary_top [(UNSPEC_ADDHNB "UNSPEC_ADDHNT")
+                             (UNSPEC_RADDHNB "UNSPEC_RADDHNT")
+                             (UNSPEC_RSUBHNB "UNSPEC_RSUBHNT")
+                             (UNSPEC_SUBHNB "UNSPEC_SUBHNT")])
+
 (define_int_attr sve_int_op [(UNSPEC_ADCLB "adclb")
                              (UNSPEC_ADCLT "adclt")
                              (UNSPEC_ADDHNB "addhnb")
diff --git a/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..0df08bda6fd3e33280307ea15c82dd9726897cfd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c
@@ -0,0 +1,26 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-fno-vect-cost-model" { target aarch64*-*-* } } */
+
+#include <stdint.h>
+#include "tree-vect.h"
+
+#define N 50
+#define TYPE uint32_t
+
+__attribute__((noipa, noinline, optimize("O1")))
+void fun1(TYPE* restrict pixel, TYPE level, int n)
+{
+  for (int i = 0; i < n; i+=1)
+    pixel[i] += (pixel[i] * (uint64_t)level) / 0xffffffffUL;
+}
+
+__attribute__((noipa, noinline, optimize("O3")))
+void fun2(TYPE* restrict pixel, TYPE level, int n)
+{
+  for (int i = 0; i < n; i+=1)
+    pixel[i] += (pixel[i] * (uint64_t)level) / 0xffffffffUL;
+}
+
+#include "vect-div-bitmask.h"
+
+/* { dg-final { scan-tree-dump-not "vect_recog_divmod_pattern: detected" "vect" { target aarch64*-*-* } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/div-by-bitmask_2.c b/gcc/testsuite/gcc.target/aarch64/sve2/div-by-bitmask_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..cddcebdf15ecaa9dc515f58cdbced36c8038db1b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/div-by-bitmask_2.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -std=c99" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+#include <stdint.h>
+
+/*
+** draw_bitmap1:
+** ...
+**	addhnb	z6.b, z0.h, z4.h
+**	addhnb	z5.b, z1.h, z4.h
+**	addhnb	z0.b, z0.h, z6.h
+**	addhnt	z0.b, z1.h, z5.h
+** ...
+*/
+void draw_bitmap1(uint8_t* restrict pixel, uint8_t level, int n)
+{
+  for (int i = 0; i < (n & -16); i+=1)
+    pixel[i] += (pixel[i] * level) / 0xff;
+}
+
+void draw_bitmap2(uint8_t* restrict pixel, uint8_t level, int n)
+{
+  for (int i = 0; i < (n & -16); i+=1)
+    pixel[i] += (pixel[i] * level) / 0xfe;
+}
+
+/*
+** draw_bitmap3:
+** ...
+**	addhnb	z6.h, z0.s, z4.s
+**	addhnb	z5.h, z1.s, z4.s
+**	addhnb	z0.h, z0.s, z6.s
+**	addhnt	z0.h, z1.s, z5.s
+** ...
+*/
+void draw_bitmap3(uint16_t* restrict pixel, uint16_t level, int n)
+{
+  for (int i = 0; i < (n & -16); i+=1)
+    pixel[i] += (pixel[i] * level) / 0xffffU;
+}
+
+/*
+** draw_bitmap4:
+** ...
+**	addhnb	z6.s, z0.d, z4.d
+**	addhnb	z5.s, z1.d, z4.d
+**	addhnb	z0.s, z0.d, z6.d
+**	addhnt	z0.s, z1.d, z5.d
+** ...
+*/
+void draw_bitmap4(uint32_t* restrict pixel, uint32_t level, int n)
+{
+  for (int i = 0; i < (n & -16); i+=1)
+    pixel[i] += (pixel[i] * (uint64_t)level) / 0xffffffffUL;
+}
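
For readers less familiar with the bottom/top narrowing forms, the lane
placement involved in the two sequences in the cover letter can be sketched
with a scalar C model.  This is an illustrative sketch only and not part of
the patch: the names (addhn_lane, model_addhnb, model_addhnt, model_uzp1,
check_lane_layouts), the fixed vector of eight 16-bit lanes and the input
values are all assumptions made for the example.  The per-lane behaviour
follows the architectural descriptions of ADDHNB, ADDHNT and UZP1 as I
understand them; the model says nothing about how the surrounding vectorized
code arranges its inputs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed vector length for the model: 8 wide (16-bit) lanes,
   viewed as 16 narrow (8-bit) lanes.  Real SVE vectors are scalable.  */
enum { WIDE_LANES = 8, NARROW_LANES = 16 };

/* High half of a 16-bit modular sum: the value one ADDHN lane produces.  */
static uint8_t addhn_lane(uint16_t a, uint16_t b)
{
    return (uint8_t)((uint16_t)(a + b) >> 8);
}

/* ADDHNB model: results go to even narrow lanes, odd lanes are zeroed.  */
static void model_addhnb(uint8_t dst[NARROW_LANES],
                         const uint16_t a[WIDE_LANES],
                         const uint16_t b[WIDE_LANES])
{
    for (size_t i = 0; i < WIDE_LANES; i++) {
        dst[2 * i] = addhn_lane(a[i], b[i]);
        dst[2 * i + 1] = 0;
    }
}

/* ADDHNT model: results go to odd narrow lanes, even lanes are kept.  */
static void model_addhnt(uint8_t dst[NARROW_LANES],
                         const uint16_t a[WIDE_LANES],
                         const uint16_t b[WIDE_LANES])
{
    for (size_t i = 0; i < WIDE_LANES; i++)
        dst[2 * i + 1] = addhn_lane(a[i], b[i]);
}

/* UZP1 model: even lanes of LO, followed by even lanes of HI.
   DST must not alias LO or HI in this sketch.  */
static void model_uzp1(uint8_t dst[NARROW_LANES],
                       const uint8_t lo[NARROW_LANES],
                       const uint8_t hi[NARROW_LANES])
{
    for (size_t i = 0; i < WIDE_LANES; i++) {
        dst[i] = lo[2 * i];
        dst[WIDE_LANES + i] = hi[2 * i];
    }
}

/* Run both sequences on made-up inputs and compare where the results land.
   Returns 1 if lane 2*i of the NARROWB + NARROWT result equals lane i of
   the packed result and lane 2*i+1 equals lane WIDE_LANES+i, else 0.  */
int check_lane_layouts(void)
{
    uint16_t x0[WIDE_LANES], x1[WIDE_LANES], c[WIDE_LANES];
    for (size_t i = 0; i < WIDE_LANES; i++) {
        x0[i] = (uint16_t)(1000u * i + 7u);
        x1[i] = (uint16_t)(60000u - 900u * i);
        c[i] = 257; /* arbitrary addend chosen for the illustration */
    }

    /* NARROWB + NARROWB + UZP1.  */
    uint8_t b0[NARROW_LANES], b1[NARROW_LANES], packed[NARROW_LANES];
    model_addhnb(b0, x0, c);
    model_addhnb(b1, x1, c);
    model_uzp1(packed, b0, b1);

    /* NARROWB + NARROWT.  */
    uint8_t bt[NARROW_LANES];
    model_addhnb(bt, x0, c);
    model_addhnt(bt, x1, c);

    for (size_t i = 0; i < WIDE_LANES; i++) {
        if (bt[2 * i] != packed[i])
            return 0;
        if (bt[2 * i + 1] != packed[WIDE_LANES + i])
            return 0;
    }
    return 1;
}
```

In this model both sequences produce the same narrowed values.  NARROWB +
NARROWT leaves them interleaved (first input in even lanes, second input in
odd lanes), while NARROWB + NARROWB + UZP1 places them in the low and high
halves of the result.  Calling check_lane_layouts() from a small main() is
enough to exercise the sketch.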