From patchwork Mon Nov 6 07:42:45 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 161911
Date: Mon, 6 Nov 2023 07:42:45 +0000
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Ramana.Radhakrishnan@arm.com, Richard.Earnshaw@arm.com, nickc@redhat.com, Kyrylo.Tkachov@arm.com
Subject: [PATCH 20/21]Arm: Add Advanced SIMD cbranch implementation

Hi All,

This adds an implementation of the conditional branch optab for AArch32.

For example, for

void f1 ()
{
  for (int i = 0; i < N; i++)
    {
      b[i] += a[i];
      if (a[i] > 0)
        break;
    }
}

with 128-bit vectors we generate:

        vcgt.s32        q8, q9, #0
        vpmax.u32       d7, d16, d17
        vpmax.u32       d7, d7, d7
        vmov    r3, s14 @ int
        cmp     r3, #0

and for 64-bit vectors we can omit one vpmax; the remaining one is still
needed to compress the mask to 32 bits.

Bootstrapped and regtested on arm-none-linux-gnueabihf with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/arm/neon.md (cbranch<mode>4): New.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp (vect_early_break): Add AArch32.
	* gcc.target/arm/vect-early-break-cbranch.c: New test.
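
For readers who want to follow the 128-bit sequence above lane by lane, here is a minimal scalar sketch in portable C. It is only a model written for illustration, not code from the patch: the helper names (vcgt_s32_zero, vpmax_u32, any_lane_set) are hypothetical, and it assumes each mask lane is either zero or all-ones, as a NEON comparison produces.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "vcgt.s32 qd, qn, #0": each lane becomes all-ones when the
   input lane is greater than zero, and zero otherwise.  */
void
vcgt_s32_zero (uint32_t mask[4], const int32_t v[4])
{
  for (int i = 0; i < 4; i++)
    mask[i] = (v[i] > 0) ? UINT32_MAX : 0;
}

/* Model of "vpmax.u32 dd, dn, dm": lane 0 is the unsigned maximum of the
   two lanes of N, lane 1 the unsigned maximum of the two lanes of M.
   Temporaries are used so that OUT may alias N or M.  */
static void
vpmax_u32 (uint32_t out[2], const uint32_t n[2], const uint32_t m[2])
{
  uint32_t r0 = n[0] > n[1] ? n[0] : n[1];
  uint32_t r1 = m[0] > m[1] ? m[0] : m[1];
  out[0] = r0;
  out[1] = r1;
}

/* Reduce a 128-bit mask to one 32-bit value and test it against zero,
   following the instruction sequence quoted in the mail.  */
int
any_lane_set (const uint32_t mask[4])
{
  uint32_t d[2];

  /* Assumed precondition: lanes are comparison results.  */
  for (int i = 0; i < 4; i++)
    assert (mask[i] == 0 || mask[i] == UINT32_MAX);

  vpmax_u32 (d, &mask[0], &mask[2]);	/* vpmax.u32 d7, d16, d17 */
  vpmax_u32 (d, d, d);			/* vpmax.u32 d7, d7, d7 */
  return d[0] != 0;			/* vmov r3, s14; cmp r3, #0 */
}
```

After the first pairwise maximum the four lanes are folded into two, and after the second the low lane holds the maximum of all four, so a single 32-bit compare decides the branch. For a 64-bit input the first step has nothing to fold, which is why only one vpmax remains.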
--- inline copy of patch --
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index d213369ffc38fb88ad0357d848cc7da5af73bab7..130efbc37cfe3128533599dfadc344d2243dcb63 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -408,6 +408,45 @@ (define_insn "vec_extract"
   [(set_attr "type" "neon_store1_one_lane,neon_to_gp")]
 )
 
+;; Patterns comparing two vectors and conditionally jumping.
+;; Advanced SIMD lacks a vector != comparison, but this is a quite common
+;; operation.  To not pay the penalty for inverting == we can map our any
+;; comparisons to all i.e. any(~x) => all(x).
+;;
+;; However unlike the AArch64 version, we can't optimize this further as the
+;; chain is too long for combine due to these being unspecs so it doesn't fold
+;; the operation to something simpler.
+(define_expand "cbranch<mode>4"
+  [(set (pc) (if_then_else
+              (match_operator 0 "expandable_comparison_operator"
+               [(match_operand:VDQI 1 "register_operand")
+                (match_operand:VDQI 2 "zero_operand")])
+              (label_ref (match_operand 3 "" ""))
+              (pc)))]
+  "TARGET_NEON"
+{
+  rtx mask = operands[1];
+
+  /* For 128-bit vectors we need an additional reduction.  */
+  if (known_eq (128, GET_MODE_BITSIZE (<MODE>mode)))
+    {
+      /* Always reduce using a V4SI.  */
+      mask = gen_reg_rtx (V2SImode);
+      rtx low = gen_reg_rtx (V2SImode);
+      rtx high = gen_reg_rtx (V2SImode);
+      emit_insn (gen_neon_vget_lowv4si (low, operands[1]));
+      emit_insn (gen_neon_vget_highv4si (high, operands[1]));
+      emit_insn (gen_neon_vpumaxv2si (mask, low, high));
+    }
+
+  emit_insn (gen_neon_vpumaxv2si (mask, mask, mask));
+
+  rtx val = gen_reg_rtx (SImode);
+  emit_move_insn (val, gen_lowpart (SImode, mask));
+  emit_jump_insn (gen_cbranch_cc (operands[0], val, const0_rtx, operands[3]));
+  DONE;
+})
+
 ;; This pattern is renamed from "vec_extract" to
 ;; "neon_vec_extract" and this pattern is called
 ;; by define_expand in vec-common.md file.
diff --git a/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c b/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c
new file mode 100644
index 0000000000000000000000000000000000000000..2c05aa10d26ed4ac9785672e6e3b4355cef046dc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c
@@ -0,0 +1,136 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm32 } */
+/* { dg-options "-O3 -march=armv8-a+simd -mfpu=auto -mfloat-abi=hard" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#define N 640
+int a[N] = {0};
+int b[N] = {0};
+
+/* f1:
+**	...
+**	vcgt.s32	q[0-9]+, q[0-9]+, #0
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vmov	r[0-9]+, s[0-9]+	@ int
+**	cmp	r[0-9]+, #0
+**	bne	\.L[0-9]+
+**	...
+*/
+void f1 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] > 0)
+        break;
+    }
+}
+
+/*
+** f2:
+**	...
+**	vcge.s32	q[0-9]+, q[0-9]+, #0
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vmov	r[0-9]+, s[0-9]+	@ int
+**	cmp	r[0-9]+, #0
+**	bne	\.L[0-9]+
+**	...
+*/
+void f2 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] >= 0)
+        break;
+    }
+}
+
+/*
+** f3:
+**	...
+**	vceq.i32	q[0-9]+, q[0-9]+, #0
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vmov	r[0-9]+, s[0-9]+	@ int
+**	cmp	r[0-9]+, #0
+**	bne	\.L[0-9]+
+**	...
+*/
+void f3 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] == 0)
+        break;
+    }
+}
+
+/*
+** f4:
+**	...
+**	vceq.i32	q[0-9]+, q[0-9]+, #0
+**	vmvn	q[0-9]+, q[0-9]+
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vmov	r[0-9]+, s[0-9]+	@ int
+**	cmp	r[0-9]+, #0
+**	bne	\.L[0-9]+
+**	...
+*/
+void f4 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] != 0)
+        break;
+    }
+}
+
+/*
+** f5:
+**	...
+**	vclt.s32	q[0-9]+, q[0-9]+, #0
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vmov	r[0-9]+, s[0-9]+	@ int
+**	cmp	r[0-9]+, #0
+**	bne	\.L[0-9]+
+**	...
+*/
+void f5 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] < 0)
+        break;
+    }
+}
+
+/*
+** f6:
+**	...
+**	vcle.s32	q[0-9]+, q[0-9]+, #0
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vpmax.u32	d[0-9]+, d[0-9]+, d[0-9]+
+**	vmov	r[0-9]+, s[0-9]+	@ int
+**	cmp	r[0-9]+, #0
+**	bne	\.L[0-9]+
+**	...
+*/
+void f6 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] <= 0)
+        break;
+    }
+}
+
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 5516188dc0aa86d161d67dea5a7769e3c3d72f85..8f58671e6cfd3546c6a98e40341fe31c6492594b 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3784,6 +3784,7 @@ proc check_effective_target_vect_early_break { } {
     return [check_cached_effective_target_indexed vect_early_break {
       expr {
         [istarget aarch64*-*-*]
+        || [check_effective_target_arm_neon_ok]
         }}]
 }
 # Return 1 if the target supports hardware vectorization of complex additions of
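
The comment at the top of the new neon.md pattern mentions mapping "any" comparisons to "all", i.e. any(~x) => all(x). The scalar C sketch below is only an illustration of that identity on four-lane masks, under the assumption that every lane is either zero or all-ones. The function names are hypothetical and none of this is code from the patch; as the f4 test shows, the sequence the patch expects for != still contains a vmvn.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane mask for v[i] == 0: all-ones when equal, zero otherwise,
   which is the form "vceq.i32 ..., #0" produces.  */
void
vceq_i32_zero (uint32_t mask[4], const int32_t v[4])
{
  for (int i = 0; i < 4; i++)
    mask[i] = (v[i] == 0) ? UINT32_MAX : 0;
}

/* any(): unsigned maximum reduction; true iff some lane is set.  */
int
mask_any (const uint32_t m[4])
{
  uint32_t r = 0;
  for (int i = 0; i < 4; i++)
    if (m[i] > r)
      r = m[i];
  return r != 0;
}

/* all(): unsigned minimum reduction; true iff every lane is all-ones.  */
int
mask_all (const uint32_t m[4])
{
  uint32_t r = UINT32_MAX;
  for (int i = 0; i < 4; i++)
    if (m[i] < r)
      r = m[i];
  return r == UINT32_MAX;
}

/* "Some lane is != 0", computed by inverting the == mask first.  */
int
any_ne_by_inverting (const int32_t v[4])
{
  uint32_t m[4];
  vceq_i32_zero (m, v);
  for (int i = 0; i < 4; i++)
    m[i] = ~m[i];		/* the vmvn step */
  return mask_any (m);
}

/* The same predicate without the inversion: any(~x) is !all(x).  */
int
any_ne_by_all (const int32_t v[4])
{
  uint32_t m[4];
  vceq_i32_zero (m, v);
  return !mask_all (m);
}

/* Check that both formulations agree on V.  */
void
check_agree (const int32_t v[4])
{
  assert (any_ne_by_inverting (v) == any_ne_by_all (v));
}
```

The second form trades the explicit inversion for a minimum reduction followed by an inverted branch condition, which is the saving the comment alludes to.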