From patchwork Fri Dec 29 14:42:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 183916
Date: Fri, 29 Dec 2023 14:42:52 +0000
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Ramana.Radhakrishnan@arm.com, Richard.Earnshaw@arm.com, nickc@redhat.com, Kyrylo.Tkachov@arm.com
Subject: [PATCH 20/21]Arm: Add Advanced SIMD cbranch implementation

Hi All,

This adds an implementation for the conditional branch optab for AArch32.

The previous version only allowed operand 0, but it looks like cbranch
expansion does not check with the target, so we have to implement all cases.
I therefore did not commit it.  This is a larger version.

For example, given

void f1 ()
{
  for (int i = 0; i < N; i++)
    {
      b[i] += a[i];
      if (a[i] > 0)
	break;
    }
}

for 128-bit vectors we generate:

	vcgt.s32	q8, q9, #0
	vpmax.u32	d7, d16, d17
	vpmax.u32	d7, d7, d7
	vmov	r3, s14	@ int
	cmp	r3, #0

and for 64-bit vectors we can omit one vpmax; the remaining one is still
needed to compress the result to 32 bits.

Bootstrapped and regtested on arm-none-linux-gnueabihf with no issues.
Ok for master? Thanks, Tamar gcc/ChangeLog: * config/arm/neon.md (cbranch4): New. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-early-break_2.c: Skip Arm. * gcc.dg/vect/vect-early-break_7.c: Likewise. * gcc.dg/vect/vect-early-break_75.c: Likewise. * gcc.dg/vect/vect-early-break_77.c: Likewise. * gcc.dg/vect/vect-early-break_82.c: Likewise. * gcc.dg/vect/vect-early-break_88.c: Likewise. * lib/target-supports.exp (add_options_for_vect_early_break, check_effective_target_vect_early_break_hw, check_effective_target_vect_early_break): Support AArch32. * gcc.target/arm/vect-early-break-cbranch.c: New test. --- inline copy of patch -- diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index d213369ffc38fb88ad0357d848cc7da5af73bab7..0f088a51d31e6882bc0fabbad99862b8b465dd22 100644 --- diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index d213369ffc38fb88ad0357d848cc7da5af73bab7..0f088a51d31e6882bc0fabbad99862b8b465dd22 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -408,6 +408,54 @@ (define_insn "vec_extract" [(set_attr "type" "neon_store1_one_lane,neon_to_gp")] ) +;; Patterns comparing two vectors and conditionally jump. +;; Avdanced SIMD lacks a vector != comparison, but this is a quite common +;; operation. To not pay the penalty for inverting == we can map our any +;; comparisons to all i.e. any(~x) => all(x). +;; +;; However unlike the AArch64 version, we can't optimize this further as the +;; chain is too long for combine due to these being unspecs so it doesn't fold +;; the operation to something simpler. +(define_expand "cbranch4" + [(set (pc) (if_then_else + (match_operator 0 "expandable_comparison_operator" + [(match_operand:VDQI 1 "register_operand") + (match_operand:VDQI 2 "reg_or_zero_operand")]) + (label_ref (match_operand 3 "" "")) + (pc)))] + "TARGET_NEON" +{ + rtx mask = operands[1]; + + /* If comparing against a non-zero vector we have to do a comparison first + so we can have a != 0 comparison with the result. 
*/ + if (operands[2] != CONST0_RTX (mode)) + { + mask = gen_reg_rtx (mode); + emit_insn (gen_xor3 (mask, operands[1], operands[2])); + } + + /* For 128-bit vectors we need an additional reductions. */ + if (known_eq (128, GET_MODE_BITSIZE (mode))) + { + /* Always reduce using a V4SI. */ + mask = gen_reg_rtx (V2SImode); + rtx low = gen_reg_rtx (V2SImode); + rtx high = gen_reg_rtx (V2SImode); + rtx op1 = simplify_gen_subreg (V4SImode, operands[1], mode, 0); + emit_insn (gen_neon_vget_lowv4si (low, op1)); + emit_insn (gen_neon_vget_highv4si (high, op1)); + emit_insn (gen_neon_vpumaxv2si (mask, low, high)); + } + + emit_insn (gen_neon_vpumaxv2si (mask, mask, mask)); + + rtx val = gen_reg_rtx (SImode); + emit_move_insn (val, gen_lowpart (SImode, mask)); + emit_jump_insn (gen_cbranch_cc (operands[0], val, const0_rtx, operands[3])); + DONE; +}) + ;; This pattern is renamed from "vec_extract" to ;; "neon_vec_extract" and this pattern is called ;; by define_expand in vec-common.md file. diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_2.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_2.c index 5c32bf94409e9743e72429985ab3bf13aab8f2c1..dec0b492ab883de6e02944a95fd554a109a68a39 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_2.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_2.c @@ -5,7 +5,7 @@ /* { dg-additional-options "-Ofast" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! 
"arm*-*-*" } } } } */ #include diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_7.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_7.c index 8c86c5034d7522b3733543fb384a23c5d6ed0fcf..d218a0686719fee4c167684dcf26402851b53260 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_7.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_7.c @@ -5,7 +5,7 @@ /* { dg-additional-options "-Ofast" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "arm*-*-*" } } } } */ #include diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_75.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_75.c index ed27f8635730ff0d8803517c72693625a2feddef..9dcc3372acd657458df8d94ce36c4bd96f02fd52 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_75.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_75.c @@ -3,7 +3,7 @@ /* { dg-require-effective-target vect_int } */ /* { dg-additional-options "-O3" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "x86_64-*-* i?86-*-*" } } } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "x86_64-*-* i?86-*-* arm*-*-*" } } } } */ #include #include diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_77.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_77.c index 225106aab0a3efc7536de6f6e45bc6ff16210ea8..9fa7e6948ebfb5f1723833653fd6ad1fc65f4e8e 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_77.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_77.c @@ -3,7 +3,7 @@ /* { dg-require-effective-target vect_int } */ /* { dg-additional-options "-O3" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! 
"arm*-*-*" } } } } */ #include "tree-vect.h" diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_82.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_82.c index 0e9b2d8d385c556063a3c6fcb14383317b056a79..7cd21d33485f3abb823e1943c87e9481c41fd2c3 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_82.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_82.c @@ -5,7 +5,7 @@ /* { dg-additional-options "-Ofast" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "arm*-*-*" } } } } */ #include diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c index b392dd46553994d813761da41c42989a79b90119..59ed57c5fb5f3e8197fc20058eeb0a81a55815cc 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c @@ -3,7 +3,7 @@ /* { dg-require-effective-target vect_int } */ /* { dg-additional-options "-Ofast --param vect-partial-vector-usage=2" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "arm*-*-*" } } } } */ #include "tree-vect.h" diff --git a/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c b/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c new file mode 100644 index 0000000000000000000000000000000000000000..0e9a39d231fdf4cb56590945e7cedfabd11d39b5 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c @@ -0,0 +1,138 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target vect_early_break } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-require-effective-target arm32 } */ +/* { dg-options "-O3 -march=armv8-a+simd -mfpu=auto -mfloat-abi=hard -fno-schedule-insns -fno-reorder-blocks -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#define N 640 +int a[N] = {0}; +int b[N] = {0}; + +/* +** f1: +** ... 
+** vcgt.s32 q[0-9]+, q[0-9]+, #0 +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vmov r[0-9]+, s[0-9]+ @ int +** cmp r[0-9]+, #0 +** bne \.L[0-9]+ +** ... +*/ +void f1 () +{ + for (int i = 0; i < N; i++) + { + b[i] += a[i]; + if (a[i] > 0) + break; + } +} + +/* +** f2: +** ... +** vcge.s32 q[0-9]+, q[0-9]+, #0 +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vmov r[0-9]+, s[0-9]+ @ int +** cmp r[0-9]+, #0 +** bne \.L[0-9]+ +** ... +*/ +void f2 () +{ + for (int i = 0; i < N; i++) + { + b[i] += a[i]; + if (a[i] >= 0) + break; + } +} + +/* +** f3: +** ... +** vceq.i32 q[0-9]+, q[0-9]+, #0 +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vmov r[0-9]+, s[0-9]+ @ int +** cmp r[0-9]+, #0 +** bne \.L[0-9]+ +** ... +*/ +void f3 () +{ + for (int i = 0; i < N; i++) + { + b[i] += a[i]; + if (a[i] == 0) + break; + } +} + +/* +** f4: +** ... +** vceq.i32 q[0-9]+, q[0-9]+, #0 +** vmvn q[0-9]+, q[0-9]+ +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vmov r[0-9]+, s[0-9]+ @ int +** cmp r[0-9]+, #0 +** bne \.L[0-9]+ +** ... +*/ +void f4 () +{ + for (int i = 0; i < N; i++) + { + b[i] += a[i]; + if (a[i] != 0) + break; + } +} + +/* +** f5: +** ... +** vclt.s32 q[0-9]+, q[0-9]+, #0 +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vmov r[0-9]+, s[0-9]+ @ int +** cmp r[0-9]+, #0 +** bne \.L[0-9]+ +** ... +*/ +void f5 () +{ + for (int i = 0; i < N; i++) + { + b[i] += a[i]; + if (a[i] < 0) + break; + } +} + +/* +** f6: +** ... +** vcle.s32 q[0-9]+, q[0-9]+, #0 +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+ +** vmov r[0-9]+, s[0-9]+ @ int +** cmp r[0-9]+, #0 +** bne \.L[0-9]+ +** ... 
+*/ +void f6 () +{ + for (int i = 0; i < N; i++) + { + b[i] += a[i]; + if (a[i] <= 0) + break; + } +} + diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 05fc417877bcd658931061b7245eb8ba5abd2e09..24a937dbb59b5723af038bd9e0b89369595fcf87 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -4059,6 +4059,7 @@ proc check_effective_target_vect_early_break { } { return [check_cached_effective_target_indexed vect_early_break { expr { [istarget aarch64*-*-*] + || [check_effective_target_arm_v8_neon_ok] || [check_effective_target_sse4] }}] } @@ -4072,6 +4073,7 @@ proc check_effective_target_vect_early_break_hw { } { return [check_cached_effective_target_indexed vect_early_break_hw { expr { [istarget aarch64*-*-*] + || [check_effective_target_arm_v8_neon_hw] || [check_sse4_hw_available] }}] } @@ -4081,6 +4083,11 @@ proc add_options_for_vect_early_break { flags } { return "$flags" } + if { [check_effective_target_arm_v8_neon_ok] } { + global et_arm_v8_neon_flags + return "$flags $et_arm_v8_neon_flags -march=armv8-a" + } + if { [check_effective_target_sse4] } { return "$flags -msse4.1" } --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -408,6 +408,54 @@ (define_insn "vec_extract" [(set_attr "type" "neon_store1_one_lane,neon_to_gp")] ) +;; Patterns comparing two vectors and conditionally jump. +;; Avdanced SIMD lacks a vector != comparison, but this is a quite common +;; operation. To not pay the penalty for inverting == we can map our any +;; comparisons to all i.e. any(~x) => all(x). +;; +;; However unlike the AArch64 version, we can't optimize this further as the +;; chain is too long for combine due to these being unspecs so it doesn't fold +;; the operation to something simpler. 
+(define_expand "cbranch4" + [(set (pc) (if_then_else + (match_operator 0 "expandable_comparison_operator" + [(match_operand:VDQI 1 "register_operand") + (match_operand:VDQI 2 "reg_or_zero_operand")]) + (label_ref (match_operand 3 "" "")) + (pc)))] + "TARGET_NEON" +{ + rtx mask = operands[1]; + + /* If comparing against a non-zero vector we have to do a comparison first + so we can have a != 0 comparison with the result. */ + if (operands[2] != CONST0_RTX (mode)) + { + mask = gen_reg_rtx (mode); + emit_insn (gen_xor3 (mask, operands[1], operands[2])); + } + + /* For 128-bit vectors we need an additional reductions. */ + if (known_eq (128, GET_MODE_BITSIZE (mode))) + { + /* Always reduce using a V4SI. */ + mask = gen_reg_rtx (V2SImode); + rtx low = gen_reg_rtx (V2SImode); + rtx high = gen_reg_rtx (V2SImode); + rtx op1 = simplify_gen_subreg (V4SImode, operands[1], mode, 0); + emit_insn (gen_neon_vget_lowv4si (low, op1)); + emit_insn (gen_neon_vget_highv4si (high, op1)); + emit_insn (gen_neon_vpumaxv2si (mask, low, high)); + } + + emit_insn (gen_neon_vpumaxv2si (mask, mask, mask)); + + rtx val = gen_reg_rtx (SImode); + emit_move_insn (val, gen_lowpart (SImode, mask)); + emit_jump_insn (gen_cbranch_cc (operands[0], val, const0_rtx, operands[3])); + DONE; +}) + ;; This pattern is renamed from "vec_extract" to ;; "neon_vec_extract" and this pattern is called ;; by define_expand in vec-common.md file. diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_2.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_2.c index 5c32bf94409e9743e72429985ab3bf13aab8f2c1..dec0b492ab883de6e02944a95fd554a109a68a39 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_2.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_2.c @@ -5,7 +5,7 @@ /* { dg-additional-options "-Ofast" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! 
"arm*-*-*" } } } } */ #include diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_7.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_7.c index 8c86c5034d7522b3733543fb384a23c5d6ed0fcf..d218a0686719fee4c167684dcf26402851b53260 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_7.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_7.c @@ -5,7 +5,7 @@ /* { dg-additional-options "-Ofast" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "arm*-*-*" } } } } */ #include diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_75.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_75.c index ed27f8635730ff0d8803517c72693625a2feddef..9dcc3372acd657458df8d94ce36c4bd96f02fd52 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_75.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_75.c @@ -3,7 +3,7 @@ /* { dg-require-effective-target vect_int } */ /* { dg-additional-options "-O3" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "x86_64-*-* i?86-*-*" } } } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "x86_64-*-* i?86-*-* arm*-*-*" } } } } */ #include #include diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_77.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_77.c index 225106aab0a3efc7536de6f6e45bc6ff16210ea8..9fa7e6948ebfb5f1723833653fd6ad1fc65f4e8e 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_77.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_77.c @@ -3,7 +3,7 @@ /* { dg-require-effective-target vect_int } */ /* { dg-additional-options "-O3" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! 
"arm*-*-*" } } } } */ #include "tree-vect.h" diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_82.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_82.c index 0e9b2d8d385c556063a3c6fcb14383317b056a79..7cd21d33485f3abb823e1943c87e9481c41fd2c3 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_82.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_82.c @@ -5,7 +5,7 @@ /* { dg-additional-options "-Ofast" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "arm*-*-*" } } } } */ #include diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c index b392dd46553994d813761da41c42989a79b90119..59ed57c5fb5f3e8197fc20058eeb0a81a55815cc 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c @@ -3,7 +3,7 @@ /* { dg-require-effective-target vect_int } */ /* { dg-additional-options "-Ofast --param vect-partial-vector-usage=2" } */ -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! "arm*-*-*" } } } } */ #include "tree-vect.h" diff --git a/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c b/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c new file mode 100644 index 0000000000000000000000000000000000000000..0e9a39d231fdf4cb56590945e7cedfabd11d39b5 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c @@ -0,0 +1,138 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target vect_early_break } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-require-effective-target arm32 } */ +/* { dg-options "-O3 -march=armv8-a+simd -mfpu=auto -mfloat-abi=hard -fno-schedule-insns -fno-reorder-blocks -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#define N 640 +int a[N] = {0}; +int b[N] = {0}; + +/* +** f1: +** ... 
+** vcgt.s32 q[0-9]+, q[0-9]+, #0
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vmov r[0-9]+, s[0-9]+ @ int
+** cmp r[0-9]+, #0
+** bne \.L[0-9]+
+** ...
+*/
+void f1 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] > 0)
+        break;
+    }
+}
+
+/*
+** f2:
+** ...
+** vcge.s32 q[0-9]+, q[0-9]+, #0
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vmov r[0-9]+, s[0-9]+ @ int
+** cmp r[0-9]+, #0
+** bne \.L[0-9]+
+** ...
+*/
+void f2 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] >= 0)
+        break;
+    }
+}
+
+/*
+** f3:
+** ...
+** vceq.i32 q[0-9]+, q[0-9]+, #0
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vmov r[0-9]+, s[0-9]+ @ int
+** cmp r[0-9]+, #0
+** bne \.L[0-9]+
+** ...
+*/
+void f3 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] == 0)
+        break;
+    }
+}
+
+/*
+** f4:
+** ...
+** vceq.i32 q[0-9]+, q[0-9]+, #0
+** vmvn q[0-9]+, q[0-9]+
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vmov r[0-9]+, s[0-9]+ @ int
+** cmp r[0-9]+, #0
+** bne \.L[0-9]+
+** ...
+*/
+void f4 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] != 0)
+        break;
+    }
+}
+
+/*
+** f5:
+** ...
+** vclt.s32 q[0-9]+, q[0-9]+, #0
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vmov r[0-9]+, s[0-9]+ @ int
+** cmp r[0-9]+, #0
+** bne \.L[0-9]+
+** ...
+*/
+void f5 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] < 0)
+        break;
+    }
+}
+
+/*
+** f6:
+** ...
+** vcle.s32 q[0-9]+, q[0-9]+, #0
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vpmax.u32 d[0-9]+, d[0-9]+, d[0-9]+
+** vmov r[0-9]+, s[0-9]+ @ int
+** cmp r[0-9]+, #0
+** bne \.L[0-9]+
+** ...
+*/
+void f6 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] <= 0)
+        break;
+    }
+}
+
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 05fc417877bcd658931061b7245eb8ba5abd2e09..24a937dbb59b5723af038bd9e0b89369595fcf87 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4059,6 +4059,7 @@ proc check_effective_target_vect_early_break { } {
     return [check_cached_effective_target_indexed vect_early_break {
       expr {
         [istarget aarch64*-*-*]
+        || [check_effective_target_arm_v8_neon_ok]
         || [check_effective_target_sse4]
         }}]
 }
@@ -4072,6 +4073,7 @@ proc check_effective_target_vect_early_break_hw { } {
     return [check_cached_effective_target_indexed vect_early_break_hw {
       expr {
         [istarget aarch64*-*-*]
+        || [check_effective_target_arm_v8_neon_hw]
         || [check_sse4_hw_available]
         }}]
 }
@@ -4081,6 +4083,11 @@ proc add_options_for_vect_early_break { flags } {
         return "$flags"
     }
 
+    if { [check_effective_target_arm_v8_neon_ok] } {
+        global et_arm_v8_neon_flags
+        return "$flags $et_arm_v8_neon_flags -march=armv8-a"
+    }
+
     if { [check_effective_target_sse4] } {
         return "$flags -msse4.1"
     }