From patchwork Wed Oct 18 15:44:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Carlotti X-Patchwork-Id: 154937 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2908:b0:403:3b70:6f57 with SMTP id ib8csp4884404vqb; Wed, 18 Oct 2023 08:45:00 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEEsyvIWkXxo9lUleGm/YhkdrGcJk9082Ca84RSZxd1lfSThKzqeEYakdcEMC4U2VDanQoI X-Received: by 2002:ac8:4e83:0:b0:417:953c:ff57 with SMTP id 3-20020ac84e83000000b00417953cff57mr7629698qtp.14.1697643900520; Wed, 18 Oct 2023 08:45:00 -0700 (PDT) Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id n13-20020ac85a0d000000b003eb14b07a6dsi104499qta.125.2023.10.18.08.45.00 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 Oct 2023 08:45:00 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@armh.onmicrosoft.com header.s=selector2-armh-onmicrosoft-com header.b="i6vP8/JN"; dkim=pass header.i=@armh.onmicrosoft.com header.s=selector2-armh-onmicrosoft-com header.b="i6vP8/JN"; arc=fail (previous hop failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E9EC738582A1 for ; Wed, 18 Oct 2023 15:44:59 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR05-AM6-obe.outbound.protection.outlook.com (mail-am6eur05on2076.outbound.protection.outlook.com [40.107.22.76]) by sourceware.org (Postfix) with ESMTPS id BC0F43858C30 for ; Wed, 18 Oct 2023 15:44:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BC0F43858C30 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BC0F43858C30 Authentication-Results: server2.sourceware.org; arc=fail smtp.remote-ip=40.107.22.76 ARC-Seal: i=2; a=rsa-sha256; d=sourceware.org; s=key; t=1697643871; cv=fail; b=rQYLOeybfytLhzETbxCSUC79VwxG6IdexW+kPpxrSSoTrPyTO+Sw/V1yeHOMUvpOxSxrJFRFHYbjpzkpx/nOGsDVA9aFHU2VGhufniLSzFE+zC2QSMGbXDuwy74TbP71GFlTd/slpEyTjoDHXummXF1DekT+tTM0QKWS/Ahgli0= ARC-Message-Signature: i=2; a=rsa-sha256; d=sourceware.org; s=key; t=1697643871; c=relaxed/simple; bh=09M1XeWlmEyyws1/9hHVSolRtfDomehU2wNL2wLxxUM=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=jeeYfVvpq9maGGb6gBzzfQtzKctVWbN04cOTmEPdyEi7fPDo9elVvCU78e9TcKlYKpyJIwPJPZQxJKdN9ogRi28Ab0lfkQsySo8Sc9bIfjL9do+oDqfYczO970z0wH9CVSIhlWji6xpA7HKoDFPkseh2nsLMPHRrZVat6873H7Q= ARC-Authentication-Results: i=2; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cwPfC1BqvQ3HVitZ4OVHcUuu1ZzUyYsnYs6u2ObZWAE=; b=i6vP8/JNz5rPisBNXRUA2GfMjyFpUMEPBX8Dp8D9/pa98n7DWbPNhCTOlTQEjE+hQGLw6zZcm5RWO+cb2T0um71mzCFfhZnzXIFBFoWjwhwaZir3bLhF63d9Ms3m0MClnDI30sKsxl75gLF+rYL9bOFeWTaMlAjnU4/atyZelmA= Received: from AS4PR10CA0020.EURPRD10.PROD.OUTLOOK.COM (2603:10a6:20b:5d8::10) by PA6PR08MB10782.eurprd08.prod.outlook.com (2603:10a6:102:3d0::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6907.21; Wed, 18 Oct 2023 15:44:23 +0000 Received: from AM4PEPF00025F9A.EURPRD83.prod.outlook.com (2603:10a6:20b:5d8:cafe::e9) by AS4PR10CA0020.outlook.office365.com (2603:10a6:20b:5d8::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6907.23 via Frontend Transport; Wed, 18 Oct 2023 15:44:23 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM4PEPF00025F9A.mail.protection.outlook.com (10.167.16.9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6933.4 via Frontend Transport; Wed, 18 Oct 2023 15:44:23 +0000 Received: ("Tessian outbound 80b6fe5915e6:v215"); Wed, 18 Oct 2023 15:44:22 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 6e5a393f5bbd145c X-CR-MTA-TID: 64aa7808 Received: from 9b4f0fb783b9.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 43476DCB-B6BF-4A41-995C-5EF6CC0C83E1.1; Wed, 18 Oct 2023 15:44:11 +0000 Received: from EUR01-VE1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 9b4f0fb783b9.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Wed, 18 Oct 2023 15:44:11 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=alChxhZbEExafpzzwdTqnDzBCih7g1bXlWvsAmoKk8mSBq088FvKKSr6aI7KgVwyDYcDTn444gz898CEkmurmt05hCZyD7Z9XIeF86w0zSRu92qvWz/2TWA++Chp265cmo030C2K5KkJo8k1P9JeCX47Kg8tg1ry7zxVZJ/DZQzyMoQFdZL9h5TBRpQdqGAdDv2s7vvHqpG06VFo7J46Chx5hvDK4m3TTEGrzRmnt+xhk1GlpMtcDXo8e0+cJYs6yg0IVZq89Xuxy2PxLJ48zSNxKwcFF/YQnhU2/XsFAMdzy/9Oeoeej8ve/+3Fvv6b9YypiugucvvJNVINB0NawQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=cwPfC1BqvQ3HVitZ4OVHcUuu1ZzUyYsnYs6u2ObZWAE=; b=kB9WG+f8y/oCI9iI1jFbcKEYlEmw/umTYiNfkY5Z0hWA8VBBeH0b7o0Z3BgwGUiSIsMe+p4YrDCr3SvIfbJiD6L0S9a9QjtwxliAD79psQR419+daFIEXcZtGvphFdkXuTyQdQwAKXRiRBC3n+L8FgDKfBKfUCHjN19eR1lQ+QLpRFP7iDDLnPRlKBdLAp3KIOy9A7Ow+BkUhnBWfpT9t2E10pvHjTfpdc8/Hh8Ajk8Ds/x+7kMsu3UXvNunwN4ebLqT4LWOqEV6Zz0a/Zg7ID3UPEDLJ2hDTJhHFu5DRXjTIEt4QytdhM3xefB+1n4ItfvJK1EyjUYsn7gRGjn9+A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cwPfC1BqvQ3HVitZ4OVHcUuu1ZzUyYsnYs6u2ObZWAE=; b=i6vP8/JNz5rPisBNXRUA2GfMjyFpUMEPBX8Dp8D9/pa98n7DWbPNhCTOlTQEjE+hQGLw6zZcm5RWO+cb2T0um71mzCFfhZnzXIFBFoWjwhwaZir3bLhF63d9Ms3m0MClnDI30sKsxl75gLF+rYL9bOFeWTaMlAjnU4/atyZelmA= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from AS8PR08MB6678.eurprd08.prod.outlook.com (2603:10a6:20b:398::8) by AS8PR08MB9120.eurprd08.prod.outlook.com (2603:10a6:20b:5c1::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6886.36; Wed, 18 Oct 2023 15:44:09 +0000 Received: from AS8PR08MB6678.eurprd08.prod.outlook.com ([fe80::3880:6a2c:60e:3f3e]) by AS8PR08MB6678.eurprd08.prod.outlook.com ([fe80::3880:6a2c:60e:3f3e%7]) with mapi id 15.20.6886.034; Wed, 18 Oct 2023 15:44:09 +0000 Date: Wed, 18 Oct 2023 16:44:08 +0100 From: Andrew Carlotti To: gcc-patches@gcc.gnu.org Cc: richard.earnshaw@arm.com, richard.sandiford@arm.com Subject: [2/3] [aarch64] Add function multiversioning support Message-ID: <3ab87b1b-04c8-bf92-f678-9b7a58611f1a@e124511.cambridge.arm.com> References: <26bbc7e4-9d5a-fef3-2f78-1b7a03865050@e124511.cambridge.arm.com> Content-Disposition: inline In-Reply-To: <26bbc7e4-9d5a-fef3-2f78-1b7a03865050@e124511.cambridge.arm.com> X-ClientProxiedBy: LO2P265CA0030.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:61::18) To AS8PR08MB6678.eurprd08.prod.outlook.com (2603:10a6:20b:398::8) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: AS8PR08MB6678:EE_|AS8PR08MB9120:EE_|AM4PEPF00025F9A:EE_|PA6PR08MB10782:EE_ X-MS-Office365-Filtering-Correlation-Id: 1b294e3f-5983-4076-04e3-08dbcff119ec x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: cMCZ90jxKF3ADuJnB59YE5ifsQFe0t0QI9XI+uJLHPzcIruyUsdVR3bNqBuKgEzbHHoRqOTBIGfCLfe7r6p4vIZX4dynBXIgKbRp5KlMFWMEMInUKWmWips+bPVN7ARg40B+v3q3RL049Y95FOrESrpmn9eGrbhreSu/tTtM6eNW41h/Kzm/JNjiLsGn3i+RvAVyvMPcWf+Rvw3LKpB+gxgLmaarTuXD+lq3apvnTnyu5wWaXyInd59Uy/euig+8G7TEZ2h2SnYFb/HwFWk9d1wsjhJjwkluhxmQeCcBFzcMNiT/5uSAIgvPf0VXMZcCWBiRqj967MybZDyDz6bIfMh7t/nBrGIlQLLsWnta2ScFICabceZsIBP64JhQfVTA4BC8QaxH7GdiJF3yVuMODP8bcZb5rM2lPquZvmw0DqtYlf17xTm2DfCUYfePGewGPOen31bcGPDaUeNGpuM468g+Oyz0jJ2mw9ZbMM9QUf5Wihy7O0RCUuAraqHFqbimh4DxIk4rDC5YM5YFvgf6nMHPs4WiJq5EPXejtom5rcU= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:AS8PR08MB6678.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230031)(396003)(39860400002)(366004)(136003)(346002)(376002)(230922051799003)(1800799009)(451199024)(186009)(64100799003)(26005)(6506007)(8936002)(966005)(83380400001)(66946007)(5660300002)(44832011)(41300700001)(8676002)(30864003)(478600001)(2906002)(6486002)(4326008)(316002)(6916009)(66556008)(66476007)(86362001)(31696002)(38100700002)(31686004)(6512007); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS8PR08MB9120 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM4PEPF00025F9A.EURPRD83.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: a2893e36-ee02-4165-b85d-08dbcff11169 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: NrVWr8stDisXPXK8HTfhBVn6+i2Hx0S3Fx9dv1YKkBzMnGlA9WKOkTML5RvCRRRtwadrQn3BJJMyWd6NEsfuBu+o65pLUJnx1fOndXMdMeiF0snPO3YfD57JXWoNhsI8rano0WlHaw50oPvS524Xd55O9ml52CpS9g84h56f8nEdYlFv/M2c4QF4rScZos4GRsvIw6zlACAtrMAq/+QhmrczypVPzcujerBsKmJl/uyz3oOINBCm+/7hyG6NWl6CwILTOTwzta3QWeUX6S2eIqfHCFqd3ZuR1vfEEAh1vgSk81h6aavpIEXKSbHTNkrC1j5Z2pmh91V2BXvswPEChwT1AxlT6u1jKTY788Fe3lZX0XVL7JEg6cjM+fLeuIAOz1VhjA06swXi2ag9msysAzxMMXUkxpwzYfP/UAj8Lk2C1u9fAinPyDzYXBqfO6lhy5pk79oY/1+gHd48AL5SOjv8lRL2oWOBC7yk5FKh58vC00E03fZD7YJ+XBIOmakeSWd0OHuJ+ifzupzSoV6PQX/a07ltDD1LhM7+GRVDH+0KVnLreg4sok1p07lG4JdVms0VpzHEPlxs9Qa8J5GmK0hxSV1GDYMaw7h0oxXRPX8wQjukPPPNc7r8WfbXdkkqU7FC71lIBVgVC8UoO7eWOYsQSi13CROc2jlUVkVb32cMiFahdifQoRLhoGPspdghOwZlld24IG/mKwCCx5IWaPqdD+Uy9T+vbc0foPfc1AQ= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230031)(4636009)(136003)(346002)(39860400002)(376002)(396003)(230922051799003)(451199024)(64100799003)(82310400011)(186009)(1800799009)(36840700001)(46966006)(40470700004)(26005)(40460700003)(6506007)(8676002)(8936002)(966005)(336012)(47076005)(83380400001)(5660300002)(6512007)(41300700001)(44832011)(4326008)(2906002)(30864003)(6486002)(478600001)(316002)(31696002)(70206006)(6916009)(81166007)(356005)(82740400003)(70586007)(36860700001)(40480700001)(86362001)(31686004); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Oct 2023 15:44:23.1241 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 1b294e3f-5983-4076-04e3-08dbcff119ec X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AM4PEPF00025F9A.EURPRD83.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PA6PR08MB10782 X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780108650625834164 X-GMAIL-MSGID: 1780108650625834164 This adds initial support for function multiversion on aarch64 using the target_version and target_clones attributes. This mostly follows the Beta specification in the ACLE [1], with a few diffences that remain to be fixed: - Symbol mangling for target_clones differs from that for target_version and does not match the mangling specified in the ACLE. This inconsistency is also present in i386 and rs6000 mangling. - The target_clones attribute does not currently support an implicit "default" version. - Unrecognised target names in a target_clones attribute should be ignored (with an optional warning), but currently cause an error to be raised instead. - There is no option to disable function multiversioning at compile time. - There is no support for function multiversioning in C, since this is not yet enabled in the frontend. On the other hand, this patch happens to enable multiversioning in Ada and D as well, using their existing frontend support. This patch relies on adding functionality to libgcc, to support: - struct { unsigned long long features; } __aarch64_cpu_features; - void __init_cpu_features (void); - void __init_cpu_features_resolver (unsigned long hwcap, const __ifunc_arg_t *arg); This support matches the interface currently used in LLVM's compiler-rt, and will be implemented in a future patch (which will be merged before merging this patch). This version of the patch incorrectly uses __init_cpu_features in the ifunc resolvers, which could lead to invalid library calls at load time. I will fix this to use __init_cpu_features_resolver in a future version of the patch. [1] https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning gcc/ChangeLog: * attribs.cc (decl_attributes): Pass attribute name to target hook. * config/aarch64/aarch64.cc (aarch64_process_target_version_attr): New. (aarch64_option_valid_attribute_p): Add check and support for target_version attribute. (enum CPUFeatures): New list of for bitmask positions. (aarch64_fmv_feature_data): New. (get_feature_bit): New. (get_feature_mask_for_version): New. (compare_feature_masks): New. (aarch64_compare_version_priority): New. (make_resolver_func): New. (add_condition_to_bb): New. (compare_feature_version_info): New. (dispatch_function_versions): New. (aarch64_generate_version_dispatcher_body): New. (aarch64_get_function_versions_dispatcher): New. (aarch64_common_function_versions): New. (aarch64_mangle_decl_assembler_name): New. (TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): New implementation. (TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE): New implementation. (TARGET_OPTION_FUNCTION_VERSIONS): New implementation. (TARGET_COMPARE_VERSION_PRIORITY): New implementation. (TARGET_GENERATE_VERSION_DISPATCHER_BODY): New implementation. (TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New implementation. (TARGET_MANGLE_DECL_ASSEMBLER_NAME): New implementation. diff --git a/gcc/attribs.cc b/gcc/attribs.cc index a3c4a81e8582ea4fd06b9518bf51fad7c998ddd6..cc935b502028392ebdc105f940900f01f79196a7 100644 --- a/gcc/attribs.cc +++ b/gcc/attribs.cc @@ -657,7 +657,8 @@ decl_attributes (tree *node, tree attributes, int flags, options to the attribute((target(...))) list. */ if (TREE_CODE (*node) == FUNCTION_DECL && current_target_pragma - && targetm.target_option.valid_attribute_p (*node, NULL_TREE, + && targetm.target_option.valid_attribute_p (*node, + get_identifier("target"), current_target_pragma, 0)) { tree cur_attr = lookup_attribute ("target", attributes); diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 9c3c0e705e2e6ea3b55b4a5f1e7d3360f91eb51d..ca0e2a2507ffdbf99e17b77240504bf2d175b9c0 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -19088,11 +19088,70 @@ aarch64_process_target_attr (tree args) return true; } +/* Parse the tree in ARGS that contains the targeti_version attribute + information and update the global target options space. */ + +bool +aarch64_process_target_version_attr (tree args) +{ + if (TREE_CODE (args) == TREE_LIST) + { + if (TREE_CHAIN (args)) + { + error ("attribute % has multiple values"); + return false; + } + args = TREE_VALUE (args); + } + + if (!args || TREE_CODE (args) != STRING_CST) + { + error ("attribute % argument not a string"); + return false; + } + + const char *str = TREE_STRING_POINTER (args); + if (strcmp (str, "default") == 0) + return true; + + auto with_plus = std::string ("+") + str; + enum aarch_parse_opt_result parse_res; + auto isa_flags = aarch64_asm_isa_flags; + + std::string invalid_extension; + parse_res = aarch64_parse_extension (with_plus.c_str(), &isa_flags, + &invalid_extension); + + if (parse_res == AARCH_PARSE_OK) + { + aarch64_set_asm_isa_flags (isa_flags); + return true; + } + + switch (parse_res) + { + case AARCH_PARSE_MISSING_ARG: + error ("missing value in % attribute"); + break; + + case AARCH_PARSE_INVALID_FEATURE: + error ("invalid feature modifier %qs of value %qs in " + "% attribute", invalid_extension.c_str (), + with_plus.c_str()); + break; + + default: + gcc_unreachable (); + } + + return false; +} + /* Implement TARGET_OPTION_VALID_ATTRIBUTE_P. This is used to process attribute ((target ("..."))). */ static bool -aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int) +aarch64_option_valid_attribute_p (tree fndecl, tree name, tree args, int) { struct cl_target_option cur_target; bool ret; @@ -19100,13 +19159,22 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int) tree new_target, new_optimize; tree existing_target = DECL_FUNCTION_SPECIFIC_TARGET (fndecl); + bool target_version_p; + const char *attr_name = IDENTIFIER_POINTER (name); + if (strcmp (attr_name, "target") == 0) + target_version_p = false; + else if (strcmp (attr_name, "target_version") == 0) + target_version_p = true; + else + gcc_assert (false); + /* If what we're processing is the current pragma string then the target option node is already stored in target_option_current_node by aarch64_pragma_target_parse in aarch64-c.cc. Use that to avoid having to re-parse the string. This is especially useful to keep arm_neon.h compile times down since that header contains a lot of intrinsics enclosed in pragmas. */ - if (!existing_target && args == current_target_pragma) + if (!target_version_p && !existing_target && args == current_target_pragma) { DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = target_option_current_node; return true; @@ -19142,7 +19210,25 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int) cl_target_option_restore (&global_options, &global_options_set, TREE_TARGET_OPTION (target_option_current_node)); - ret = aarch64_process_target_attr (args); + if (!target_version_p) + { + ret = aarch64_process_target_attr (args); + if (ret) + { + tree version_attr = lookup_attribute ("target_version", + DECL_ATTRIBUTES (fndecl)); + if (version_attr != NULL_TREE) + { + /* Reapply any target_version attribute after target attribute. + This should be equivalent to applying the target_version once + after processing all target attributes. */ + tree version_args = TREE_VALUE (version_attr); + ret = aarch64_process_target_version_attr (version_args); + } + } + } + else + ret = aarch64_process_target_version_attr (args); /* Set up any additional state. */ if (ret) @@ -19173,6 +19259,730 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int) return ret; } +/* This enum needs to match the enum used in libgcc cpuinfo.c. */ +//TODO: Does this clash with or overlap an existing list of target features? +enum CPUFeatures { + FEAT_RNG, + FEAT_FLAGM, + FEAT_FLAGM2, + FEAT_FP16FML, + FEAT_DOTPROD, + FEAT_SM4, + FEAT_RDM, + FEAT_LSE, + FEAT_FP, + FEAT_SIMD, + FEAT_CRC, + FEAT_SHA1, + FEAT_SHA2, + FEAT_SHA3, + FEAT_AES, + FEAT_PMULL, + FEAT_FP16, + FEAT_DIT, + FEAT_DPB, + FEAT_DPB2, + FEAT_JSCVT, + FEAT_FCMA, + FEAT_RCPC, + FEAT_RCPC2, + FEAT_FRINTTS, + FEAT_DGH, + FEAT_I8MM, + FEAT_BF16, + FEAT_EBF16, + FEAT_RPRES, + FEAT_SVE, + FEAT_SVE_BF16, + FEAT_SVE_EBF16, + FEAT_SVE_I8MM, + FEAT_SVE_F32MM, + FEAT_SVE_F64MM, + FEAT_SVE2, + FEAT_SVE_AES, + FEAT_SVE_PMULL128, + FEAT_SVE_BITPERM, + FEAT_SVE_SHA3, + FEAT_SVE_SM4, + FEAT_SME, + FEAT_MEMTAG, + FEAT_MEMTAG2, + FEAT_MEMTAG3, + FEAT_SB, + FEAT_PREDRES, + FEAT_SSBS, + FEAT_SSBS2, + FEAT_BTI, + FEAT_LS64, + FEAT_LS64_V, + FEAT_LS64_ACCDATA, + FEAT_WFXT, + FEAT_SME_F64, + FEAT_SME_I64, + FEAT_SME2, + FEAT_RCPC3, //TODO: Check this index - needs to agree with LLVM. + FEAT_MAX +}; + +typedef struct +{ + const char *name; + int priority; + unsigned long long feature_mask; +} aarch64_fmv_feature_datum; + +/* List these in priority order, to make it easier to sort target strings. */ +static aarch64_fmv_feature_datum aarch64_fmv_feature_data[] = { + {"default", 0, 0ULL}, + {"rng", 10, 1ULL << FEAT_RNG}, + {"flagm", 20, 1ULL << FEAT_FLAGM}, + {"flagm2", 30, 1ULL << FEAT_FLAGM2}, + {"fp16fml", 40, 1ULL << FEAT_FP16FML}, + {"dotprod", 50, 1ULL << FEAT_DOTPROD}, + {"sm4", 60, 1ULL << FEAT_SM4}, + {"rdm", 70, 1ULL << FEAT_RDM}, + {"lse", 80, 1ULL << FEAT_LSE}, + {"fp", 90, 1ULL << FEAT_FP}, + {"simd", 100, 1ULL << FEAT_SIMD}, + {"crc", 110, 1ULL << FEAT_CRC}, + {"sha1", 120, 1ULL << FEAT_SHA1}, + {"sha2", 130, 1ULL << FEAT_SHA2}, + {"sha3", 140, 1ULL << FEAT_SHA3}, + {"aes", 150, 1ULL << FEAT_AES}, + {"pmull", 160, 1ULL << FEAT_PMULL}, + {"fp16", 170, 1ULL << FEAT_FP16}, + {"dit", 180, 1ULL << FEAT_DIT}, + {"dpb", 190, 1ULL << FEAT_DPB}, + {"dpb2", 200, 1ULL << FEAT_DPB2}, + {"jscvt", 210, 1ULL << FEAT_JSCVT}, + {"fcma", 220, 1ULL << FEAT_FCMA}, + {"rcpc", 230, 1ULL << FEAT_RCPC}, + {"rcpc2", 240, 1ULL << FEAT_RCPC2}, + {"rcpc3", 241, 1ULL << FEAT_RCPC3}, + {"frintts", 250, 1ULL << FEAT_FRINTTS}, + {"dgh", 260, 1ULL << FEAT_DGH}, + {"i8mm", 270, 1ULL << FEAT_I8MM}, + {"bf16", 280, 1ULL << FEAT_BF16}, + {"ebf16", 290, 1ULL << FEAT_EBF16}, + {"rpres", 300, 1ULL << FEAT_RPRES}, + {"sve", 310, 1ULL << FEAT_SVE}, + {"sve-bf16", 320, 1ULL << FEAT_SVE_BF16}, + {"sve-ebf16", 330, 1ULL << FEAT_SVE_EBF16}, + {"sve-i8mm", 340, 1ULL << FEAT_SVE_I8MM}, + {"f32mm", 350, 1ULL << FEAT_SVE_F32MM}, + {"f64mm", 360, 1ULL << FEAT_SVE_F64MM}, + {"sve2", 370, 1ULL << FEAT_SVE2}, + {"sve2-aes", 380, 1ULL << FEAT_SVE_AES}, + {"sve2-pmull128", 390, 1ULL << FEAT_SVE_PMULL128}, + {"sve2-bitperm", 400, 1ULL << FEAT_SVE_BITPERM}, + {"sve2-sha3", 410, 1ULL << FEAT_SVE_SHA3}, + {"sve2-sm4", 420, 1ULL << FEAT_SVE_SM4}, + {"sme", 430, 1ULL << FEAT_SME}, + {"memtag", 440, 1ULL << FEAT_MEMTAG}, + {"memtag2", 450, 1ULL << FEAT_MEMTAG2}, + {"memtag3", 460, 1ULL << FEAT_MEMTAG3}, + {"sb", 470, 1ULL << FEAT_SB}, + {"predres", 480, 1ULL << FEAT_PREDRES}, + {"ssbs", 490, 1ULL << FEAT_SSBS}, + {"ssbs2", 500, 1ULL << FEAT_SSBS2}, + {"bti", 510, 1ULL << FEAT_BTI}, + {"ls64", 520, 1ULL << FEAT_LS64}, + {"ls64_v", 530, 1ULL << FEAT_LS64_V}, + {"ls64_accdata", 540, 1ULL << FEAT_LS64_ACCDATA}, + {"wfxt", 550, 1ULL << FEAT_WFXT}, + {"sme-f64f64", 560, 1ULL << FEAT_SME_F64}, + {"sme-i16i64", 570, 1ULL << FEAT_SME_I64}, + {"sme2", 580, 1ULL << FEAT_SME2} +}; + +/* Look up a single feature name, and return the bitmask. */ +unsigned long long +get_feature_bit (char *name) +{ + /* Skip default entry here. */ + for (int i = 1; i < FEAT_MAX; i++) + if (strcmp(aarch64_fmv_feature_data[i].name, name) == 0) + return aarch64_fmv_feature_data[i].feature_mask; + return 0; +} + +/* This parses the attribute arguments to target_version in DECL and the + feature mask required to select those targets. No adjustments are made to + add or remove redundant feature requirements. */ + +unsigned long long +get_feature_mask_for_version (tree decl) +{ + tree version_attr = lookup_attribute ("target_version", DECL_ATTRIBUTES (decl)); + if (version_attr == NULL) + return 0; + + const char *version_string = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE + (version_attr))); + if (strcmp (version_string, "default") == 0 + || strcmp (version_string, "") == 0) + return 0; + + int attr_len = strlen (version_string); + + char *feature_string = XNEWVEC (char, attr_len+ 1); + strcpy (feature_string, version_string); + + int count = 1; + for (int i = 0; i < attr_len; i++) + { + if (feature_string[i] == '+') + { + feature_string[i] = '\0'; + count++; + } + } + + unsigned long long feature_mask = 0ULL; + char *cur_feature = feature_string; + for (int i = 0; i < count; i++) + { + unsigned long long feature_bit = get_feature_bit (cur_feature); + if (feature_bit == 0) + { + /* TODO: For target_clones, we should just ignore this version + instead. */ + error_at (DECL_SOURCE_LOCATION (decl), 0, + "Unrecognised feature %s in function version string", + cur_feature); + feature_mask = -1ULL; + } + feature_mask |= feature_bit; + cur_feature += strlen(cur_feature) + 1; + } + XDELETEVEC (feature_string); + return feature_mask; +} + +/* Compare priorities of two feature masks. Return: + 1: mask1 is higher priority + -1: mask2 is higher priority + 0: masks are equal. */ + +int +compare_feature_masks (unsigned long long mask1, unsigned long long mask2) +{ + int pop1 = __builtin_popcountll(mask1); + int pop2 = __builtin_popcountll(mask2); + if (pop1 > pop2) + return 1; + if (pop2 > pop1) + return -1; + + unsigned long long diff_mask = mask1 ^ mask2; + if (diff_mask == 0ULL) + return 0; + for (int i = FEAT_MAX - 1; i > 0; i--) + { + unsigned long long bit_mask = aarch64_fmv_feature_data[i].feature_mask; + if (diff_mask & bit_mask) + return (mask1 & bit_mask) ? 1 : -1; + } + gcc_unreachable(); +} + +int +aarch64_compare_version_priority (tree decl1, tree decl2) +{ + unsigned long long mask1 = get_feature_mask_for_version (decl1); + unsigned long long mask2 = get_feature_mask_for_version (decl2); + + return compare_feature_masks (mask1, mask2); +} + +/* Make the resolver function decl to dispatch the versions of + a multi-versioned function, DEFAULT_DECL. IFUNC_ALIAS_DECL is + ifunc alias that will point to the created resolver. Create an + empty basic block in the resolver and store the pointer in + EMPTY_BB. Return the decl of the resolver function. */ + +static tree +make_resolver_func (const tree default_decl, + const tree ifunc_alias_decl, + basic_block *empty_bb) +{ + tree decl, type, t; + + /* Create resolver function name based on default_decl. */ + tree decl_name = clone_function_name (default_decl, "resolver"); + const char *resolver_name = IDENTIFIER_POINTER (decl_name); + + /* The resolver function should return a (void *). */ + type = build_function_type_list (ptr_type_node, NULL_TREE); + + decl = build_fn_decl (resolver_name, type); + SET_DECL_ASSEMBLER_NAME (decl, decl_name); + + DECL_NAME (decl) = decl_name; + TREE_USED (decl) = 1; + DECL_ARTIFICIAL (decl) = 1; + DECL_IGNORED_P (decl) = 1; + TREE_PUBLIC (decl) = 0; + DECL_UNINLINABLE (decl) = 1; + + /* Resolver is not external, body is generated. */ + DECL_EXTERNAL (decl) = 0; + DECL_EXTERNAL (ifunc_alias_decl) = 0; + + DECL_CONTEXT (decl) = NULL_TREE; + DECL_INITIAL (decl) = make_node (BLOCK); + DECL_STATIC_CONSTRUCTOR (decl) = 0; + + if (DECL_COMDAT_GROUP (default_decl) + || TREE_PUBLIC (default_decl)) + { + /* In this case, each translation unit with a call to this + versioned function will put out a resolver. Ensure it + is comdat to keep just one copy. */ + DECL_COMDAT (decl) = 1; + make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl)); + } + else + TREE_PUBLIC (ifunc_alias_decl) = 0; + + /* Build result decl and add to function_decl. */ + t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node); + DECL_CONTEXT (t) = decl; + DECL_ARTIFICIAL (t) = 1; + DECL_IGNORED_P (t) = 1; + DECL_RESULT (decl) = t; + + gimplify_function_tree (decl); + push_cfun (DECL_STRUCT_FUNCTION (decl)); + *empty_bb = init_lowered_empty_function (decl, false, + profile_count::uninitialized ()); + + cgraph_node::add_new_function (decl, true); + symtab->call_cgraph_insertion_hooks (cgraph_node::get_create (decl)); + + pop_cfun (); + + gcc_assert (ifunc_alias_decl != NULL); + /* Mark ifunc_alias_decl as "ifunc" with resolver as resolver_name. */ + DECL_ATTRIBUTES (ifunc_alias_decl) + = make_attribute ("ifunc", resolver_name, + DECL_ATTRIBUTES (ifunc_alias_decl)); + + /* Create the alias for dispatch to resolver here. */ + cgraph_node::create_same_body_alias (ifunc_alias_decl, decl); + return decl; +} + +/* This adds a condition to the basic_block NEW_BB in function FUNCTION_DECL + to return a pointer to VERSION_DECL if all feature bits specified in + FEATURE_MASK are not set in MASK_VAR. This function will be called during + version dispatch to decide which function version to execute. It returns + the basic block at the end, to which more conditions can be added. */ +static basic_block +add_condition_to_bb (tree function_decl, tree version_decl, + unsigned long long feature_mask, + tree mask_var, basic_block new_bb) +{ + gimple *return_stmt; + tree convert_expr, result_var; + gimple *convert_stmt; + gimple *if_else_stmt; + + basic_block bb1, bb2, bb3; + edge e12, e23; + + gimple_seq gseq; + + push_cfun (DECL_STRUCT_FUNCTION (function_decl)); + + gcc_assert (new_bb != NULL); + gseq = bb_seq (new_bb); + + + convert_expr = build1 (CONVERT_EXPR, ptr_type_node, + build_fold_addr_expr (version_decl)); + result_var = create_tmp_var (ptr_type_node); + convert_stmt = gimple_build_assign (result_var, convert_expr); + return_stmt = gimple_build_return (result_var); + + + if (feature_mask == 0) + { + /* Default version. */ + gimple_seq_add_stmt (&gseq, convert_stmt); + gimple_seq_add_stmt (&gseq, return_stmt); + set_bb_seq (new_bb, gseq); + gimple_set_bb (convert_stmt, new_bb); + gimple_set_bb (return_stmt, new_bb); + pop_cfun (); + return new_bb; + } + + tree and_expr_var = create_tmp_var (long_long_unsigned_type_node); + tree and_expr = build2 (BIT_AND_EXPR, + long_long_unsigned_type_node, + mask_var, + build_int_cst (long_long_unsigned_type_node, + feature_mask)); + gimple *and_stmt = gimple_build_assign (and_expr_var, and_expr); + gimple_set_block (and_stmt, DECL_INITIAL (function_decl)); + gimple_set_bb (and_stmt, new_bb); + gimple_seq_add_stmt (&gseq, and_stmt); + + tree zero_llu = build_int_cst (long_long_unsigned_type_node, 0); + if_else_stmt = gimple_build_cond (EQ_EXPR, and_expr_var, zero_llu, + NULL_TREE, NULL_TREE); + gimple_set_block (if_else_stmt, DECL_INITIAL (function_decl)); + gimple_set_bb (if_else_stmt, new_bb); + gimple_seq_add_stmt (&gseq, if_else_stmt); + + gimple_seq_add_stmt (&gseq, convert_stmt); + gimple_seq_add_stmt (&gseq, return_stmt); + set_bb_seq (new_bb, gseq); + + bb1 = new_bb; + e12 = split_block (bb1, if_else_stmt); + bb2 = e12->dest; + e12->flags &= ~EDGE_FALLTHRU; + e12->flags |= EDGE_TRUE_VALUE; + + e23 = split_block (bb2, return_stmt); + + gimple_set_bb (convert_stmt, bb2); + gimple_set_bb (return_stmt, bb2); + + bb3 = e23->dest; + make_edge (bb1, bb3, EDGE_FALSE_VALUE); + + remove_edge (e23); + make_edge (bb2, EXIT_BLOCK_PTR_FOR_FN (cfun), 0); + + pop_cfun (); + + return bb3; +} + +/* Used when sorting the decls into dispatch order. */ +static int compare_feature_version_info (const void *p1, const void *p2) +{ + typedef struct _function_version_info + { + tree version_decl; + unsigned long long feature_mask; + } function_version_info; + const function_version_info v1 = *(const function_version_info *)p1; + const function_version_info v2 = *(const function_version_info *)p2; + return - compare_feature_masks (v1.feature_mask, v2.feature_mask); +} + +static int +dispatch_function_versions (tree dispatch_decl, + void *fndecls_p, + basic_block *empty_bb) +{ + gimple *ifunc_cpu_init_stmt; + gimple_seq gseq; + int ix; + tree ele; + vec *fndecls; + unsigned int num_versions = 0; + unsigned int actual_versions = 0; + unsigned int i; + + struct _function_version_info + { + tree version_decl; + unsigned long long feature_mask; + }*function_version_info; + + gcc_assert (dispatch_decl != NULL + && fndecls_p != NULL + && empty_bb != NULL); + + /*fndecls_p is actually a vector. */ + fndecls = static_cast *> (fndecls_p); + + /* At least one more version other than the default. */ + num_versions = fndecls->length (); + gcc_assert (num_versions >= 2); + + function_version_info = (struct _function_version_info *) + XNEWVEC (struct _function_version_info, (num_versions)); + + push_cfun (DECL_STRUCT_FUNCTION (dispatch_decl)); + + gseq = bb_seq (*empty_bb); + /* Function version dispatch is via IFUNC. IFUNC resolvers fire before + constructors, so explicity call __builtin_cpu_init here. */ + tree init_fn_type = build_function_type_list (void_type_node, NULL); + tree init_fn_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, + get_identifier ("init_cpu_features"), + init_fn_type); + ifunc_cpu_init_stmt = gimple_build_call (init_fn_decl, 0); + gimple_seq_add_stmt (&gseq, ifunc_cpu_init_stmt); + gimple_set_bb (ifunc_cpu_init_stmt, *empty_bb); + + /* Build the struct type for __aarch64_cpu_features. */ + tree global_type = lang_hooks.types.make_type (RECORD_TYPE); + tree field1 = build_decl (UNKNOWN_LOCATION, FIELD_DECL, + get_identifier ("features"), + long_long_unsigned_type_node); + DECL_FIELD_CONTEXT (field1) = global_type; + TYPE_FIELDS (global_type) = field1; + layout_type (global_type); + + tree global_var = build_decl (UNKNOWN_LOCATION, VAR_DECL, get_identifier + ("__aarch64_cpu_features"), global_type); + DECL_EXTERNAL (global_var) = 1; + tree mask_var = create_tmp_var (long_long_unsigned_type_node); + + tree component_expr = build3 (COMPONENT_REF, long_long_unsigned_type_node, + global_var, field1, NULL_TREE); + gimple *component_stmt = gimple_build_assign (mask_var, component_expr); + gimple_set_block (component_stmt, DECL_INITIAL (dispatch_decl)); + gimple_set_bb (component_stmt, *empty_bb); + gimple_seq_add_stmt (&gseq, component_stmt); + + tree not_expr = build1 (BIT_NOT_EXPR, long_long_unsigned_type_node, mask_var); + gimple *not_stmt = gimple_build_assign (mask_var, not_expr); + gimple_set_block (not_stmt, DECL_INITIAL (dispatch_decl)); + gimple_set_bb (not_stmt, *empty_bb); + gimple_seq_add_stmt (&gseq, not_stmt); + + set_bb_seq (*empty_bb, gseq); + + pop_cfun (); + + for (ix = 0; fndecls->iterate (ix, &ele); ++ix) + { + tree version_decl = ele; + unsigned long long feature_mask; + /* Get attribute string, parse it and find the right features. */ + feature_mask = get_feature_mask_for_version (version_decl); + function_version_info [actual_versions].version_decl = version_decl; + function_version_info [actual_versions].feature_mask = feature_mask; + actual_versions++; + } + + /* Sort the versions according to descending order of dispatch priority. */ + qsort (function_version_info, actual_versions, + sizeof (struct _function_version_info), compare_feature_version_info); + + for (i = 0; i < actual_versions; ++i) + *empty_bb = add_condition_to_bb (dispatch_decl, + function_version_info[i].version_decl, + function_version_info[i].feature_mask, + mask_var, + *empty_bb); + + free (function_version_info); + return 0; +} + + +tree +aarch64_generate_version_dispatcher_body (void *node_p) +{ + tree resolver_decl; + basic_block empty_bb; + tree default_ver_decl; + struct cgraph_node *versn; + struct cgraph_node *node; + + struct cgraph_function_version_info *node_version_info = NULL; + struct cgraph_function_version_info *versn_info = NULL; + + node = (cgraph_node *)node_p; + + node_version_info = node->function_version (); + gcc_assert (node->dispatcher_function + && node_version_info != NULL); + + if (node_version_info->dispatcher_resolver) + return node_version_info->dispatcher_resolver; + + /* The first version in the chain corresponds to the default version. */ + default_ver_decl = node_version_info->next->this_node->decl; + + /* node is going to be an alias, so remove the finalized bit. */ + node->definition = false; + + resolver_decl = make_resolver_func (default_ver_decl, + node->decl, &empty_bb); + + node_version_info->dispatcher_resolver = resolver_decl; + + push_cfun (DECL_STRUCT_FUNCTION (resolver_decl)); + + auto_vec fn_ver_vec; + + for (versn_info = node_version_info->next; versn_info; + versn_info = versn_info->next) + { + versn = versn_info->this_node; + /* Check for virtual functions here again, as by this time it should + have been determined if this function needs a vtable index or + not. This happens for methods in derived classes that override + virtual methods in base classes but are not explicitly marked as + virtual. */ + if (DECL_VINDEX (versn->decl)) + sorry ("virtual function multiversioning not supported"); + + fn_ver_vec.safe_push (versn->decl); + } + + dispatch_function_versions (resolver_decl, &fn_ver_vec, &empty_bb); + cgraph_edge::rebuild_edges (); + pop_cfun (); + return resolver_decl; +} + +/* Make a dispatcher declaration for the multi-versioned function DECL. + Calls to DECL function will be replaced with calls to the dispatcher + by the front-end. Returns the decl of the dispatcher function. */ + +tree +aarch64_get_function_versions_dispatcher (void *decl) +{ + tree fn = (tree) decl; + struct cgraph_node *node = NULL; + struct cgraph_node *default_node = NULL; + struct cgraph_function_version_info *node_v = NULL; + struct cgraph_function_version_info *first_v = NULL; + + tree dispatch_decl = NULL; + + struct cgraph_function_version_info *default_version_info = NULL; + + gcc_assert (fn != NULL && DECL_FUNCTION_VERSIONED (fn)); + + node = cgraph_node::get (fn); + gcc_assert (node != NULL); + + node_v = node->function_version (); + gcc_assert (node_v != NULL); + + if (node_v->dispatcher_resolver != NULL) + return node_v->dispatcher_resolver; + + /* Find the default version and make it the first node. */ + first_v = node_v; + /* Go to the beginning of the chain. */ + while (first_v->prev != NULL) + first_v = first_v->prev; + default_version_info = first_v; + while (default_version_info != NULL) + { + if (get_feature_mask_for_version + (default_version_info->this_node->decl) == 0ULL) + break; + default_version_info = default_version_info->next; + } + + /* If there is no default node, just return NULL. */ + if (default_version_info == NULL) + return NULL; + + /* Make default info the first node. */ + if (first_v != default_version_info) + { + default_version_info->prev->next = default_version_info->next; + if (default_version_info->next) + default_version_info->next->prev = default_version_info->prev; + first_v->prev = default_version_info; + default_version_info->next = first_v; + default_version_info->prev = NULL; + } + + default_node = default_version_info->this_node; + + if (targetm.has_ifunc_p ()) + { + struct cgraph_function_version_info *it_v = NULL; + struct cgraph_node *dispatcher_node = NULL; + struct cgraph_function_version_info *dispatcher_version_info = NULL; + + /* Right now, the dispatching is done via ifunc. */ + dispatch_decl = make_dispatcher_decl (default_node->decl); + TREE_NOTHROW (dispatch_decl) = TREE_NOTHROW (fn); + + dispatcher_node = cgraph_node::get_create (dispatch_decl); + gcc_assert (dispatcher_node != NULL); + dispatcher_node->dispatcher_function = 1; + dispatcher_version_info + = dispatcher_node->insert_new_function_version (); + dispatcher_version_info->next = default_version_info; + dispatcher_node->definition = 1; + + /* Set the dispatcher for all the versions. */ + it_v = default_version_info; + while (it_v != NULL) + { + it_v->dispatcher_resolver = dispatch_decl; + it_v = it_v->next; + } + } + else + { + error_at (DECL_SOURCE_LOCATION (default_node->decl), + "multiversioning needs % which is not supported " + "on this target"); + } + + return dispatch_decl; +} + +bool +aarch64_common_function_versions (tree fn1, tree fn2) +{ + if (TREE_CODE (fn1) != FUNCTION_DECL + || TREE_CODE (fn2) != FUNCTION_DECL) + return false; + + return (aarch64_compare_version_priority (fn1, fn2) != 0); +} + + +tree +aarch64_mangle_decl_assembler_name (tree decl, tree id) +{ + /* For function version, add the target suffix to the assembler name. */ + if (TREE_CODE (decl) == FUNCTION_DECL + && DECL_FUNCTION_VERSIONED (decl)) + { + unsigned long long feature_mask = get_feature_mask_for_version (decl); + + /* No suffix for the default version. */ + if (feature_mask == 0ULL) + return id; + + char suffix[2048]; + int pos = 0; + const char *base = IDENTIFIER_POINTER (id); + + for (int i = 1; i < FEAT_MAX; i++) + { + if (feature_mask & aarch64_fmv_feature_data[i].feature_mask) + { + suffix[pos] = 'M'; + strcpy (&suffix[pos+1], aarch64_fmv_feature_data[i].name); + pos += strlen(aarch64_fmv_feature_data[i].name) + 1; + } + } + suffix[pos] = '\0'; + + char *ret = XNEWVEC (char, strlen (base) + strlen (suffix) + 3); + sprintf (ret, "%s._%s", base, suffix); + + if (DECL_ASSEMBLER_NAME_SET_P (decl)) + SET_DECL_RTL (decl, NULL); + + id = get_identifier (ret); + } + return id; +} + + /* Helper for aarch64_can_inline_p. In the case where CALLER and CALLEE are tri-bool options (yes, no, don't care) and the default value is DEF, determine whether to reject inlining. */ @@ -27804,6 +28614,12 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_OPTION_VALID_ATTRIBUTE_P #define TARGET_OPTION_VALID_ATTRIBUTE_P aarch64_option_valid_attribute_p +#undef TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P +#define TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P aarch64_option_valid_attribute_p + +#undef TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE +#define TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE "target_version" + #undef TARGET_SET_CURRENT_FUNCTION #define TARGET_SET_CURRENT_FUNCTION aarch64_set_current_function @@ -28128,6 +28944,24 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_CONST_ANCHOR #define TARGET_CONST_ANCHOR 0x1000000 +#undef TARGET_OPTION_FUNCTION_VERSIONS +#define TARGET_OPTION_FUNCTION_VERSIONS aarch64_common_function_versions + +#undef TARGET_COMPARE_VERSION_PRIORITY +#define TARGET_COMPARE_VERSION_PRIORITY aarch64_compare_version_priority + +#undef TARGET_GENERATE_VERSION_DISPATCHER_BODY +#define TARGET_GENERATE_VERSION_DISPATCHER_BODY \ + aarch64_generate_version_dispatcher_body + +#undef TARGET_GET_FUNCTION_VERSIONS_DISPATCHER +#define TARGET_GET_FUNCTION_VERSIONS_DISPATCHER \ + aarch64_get_function_versions_dispatcher + +#undef TARGET_MANGLE_DECL_ASSEMBLER_NAME +#define TARGET_MANGLE_DECL_ASSEMBLER_NAME aarch64_mangle_decl_assembler_name + + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-aarch64.h"