From patchwork Thu Jan 11 20:13:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 187481 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:2411:b0:101:2151:f287 with SMTP id m17csp1698180dyi; Thu, 11 Jan 2024 12:15:06 -0800 (PST) X-Google-Smtp-Source: AGHT+IFV2aaB/t/J90R6czb53kvCn9L5AZp5XoYzHvtHfwRfjIZcPk6c74Zst0ee7YL0B+Z+sPhA X-Received: by 2002:a05:6830:1007:b0:6dd:ec84:fa09 with SMTP id a7-20020a056830100700b006ddec84fa09mr309467otp.73.1705004106124; Thu, 11 Jan 2024 12:15:06 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1705004106; cv=pass; d=google.com; s=arc-20160816; b=K5NvuCzXXCvnlbzbm8opMoC4RXExgAJTM650QKhIHL1qjBr9y0mm23/VhaFrK0XK14 lwF6WSnKPbluaLwU/R4nmI5FwLeJ9/Ayl4WdgKpek5yrCoaJkIlEWn3rMLOFx2W/24IL wClsCQeq3d7DBqM+E6FMPaYK3pafXustft7tM/44Na5OxSgijKDqNvLq8QK7vO8wXRNq n9x7Xz0svjrmusRL+B3o+aBSSxKBcTanXi5djbGSIFBrxk8MZixmdVDt/WuHK8lOmyZS m2JbrGRrFHfoFkbPkeR5eSxhlKqqTb0aPxF93LzC2Jj1J4zZgON40zfFQvCwR9AVY8rt GpPg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:message-id:date:subject:cc:to:from:ironport-sdr :arc-filter:dmarc-filter:delivered-to; bh=rdDQyP8P/7tQ0xyXFp7mrLH5NiaUlC9HPORcXTuFp/I=; fh=0oxoBFh7cBkNO243yGRJG5uRzJF8QS4TtwYQLI791sY=; b=NEGS/6HNALbVhCqZam71ySufsOKmfRqPXR/cquXT3sdrgVZyXTRykYO/8aBwFZP2Nm V8MrfpoPYqWHqGXsdzcVMtug6WFYV+nnmsMSzmYAHWTl3H/icmgOLtMq4+AaP5lk4HVF ez9gqVov9/YGoj553hCFPzcskT2M75ybYhsjVRAd6rDjSFOTlItn6DByWKVMk8dmvMQd 8jHvAErL/c21EUWq4hGw7gJFVPBtwIfaJZ1Wn3R9i1QYC/LFxk9xMyYFrQZlQxirp8Tm qKoacFv+PlEHghhHQKWIMwKWkdLZbh/PK7kyhYPQY/mKweGc3XNHbD181sQRoEVSZltV xTCg== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id e18-20020ac85dd2000000b00427b7c81e5csi1639325qtx.711.2024.01.11.12.15.05 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Jan 2024 12:15:06 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BA6F63857BAB for ; Thu, 11 Jan 2024 20:15:05 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153]) by sourceware.org (Postfix) with ESMTPS id 06A383858D1E for ; Thu, 11 Jan 2024 20:14:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 06A383858D1E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 06A383858D1E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=68.232.129.153 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705004050; cv=none; b=POmxYA2/pnnfO3ljXPG/CjiVHPjsFqcMkq8yXQu0b7ahC4CITJ0s2jmnxlB4TNKdtLXRY+kt6HYc+wv8mMHPsCgw7HKntp3IbpiwoSejye0BlmjaiqjGZXfDfia5PvIRcxEUc27NK8bLJSIgBNy41YqpBSANONkdkF/Te2+yd2Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705004050; c=relaxed/simple; bh=rYjfslZsSi1oDA62H3qaRUR0iFc4lJt6PgbtxIIXCC4=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=F5NMPs8XC3e+3OMMiTYiqOgiNc0af4ItXWNwiOrVeNCJH85NstbIn9M3fPLXWCff7nL+a0TqWe3wvxIVzH5rHrdMUOY+XOHnejU8A3v2Q5rctqPdcZjlD6rryAsgfBSI4AFPVs5N79rAV2oGYc72t1y3CL/qL3/1H0So33NBcrM= ARC-Authentication-Results: i=1; server2.sourceware.org X-CSE-ConnectionGUID: 0S7KGw2tQfWYnALBzGQEyg== X-CSE-MsgGUID: R3uvZnryTLm1sq4F0sXY1w== X-IronPort-AV: E=Sophos;i="6.04,187,1695715200"; d="scan'208";a="31343713" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa1.mentor.iphmx.com with ESMTP; 11 Jan 2024 12:14:02 -0800 IronPort-SDR: nL/Taz1EW+MXOdfA0mDfejmt06+vJdyjMu4/yt5au4fIJZU+MLEz7N0p4d9aaN2q5YEFHG7hvz Ab4u9cJXIpbjqE8nPxq/zXghgF9MCnAzTH9mll0QEYuzXjxlndNTiiTa9hp0BaxVHfQlgZKMj0 1UfO5fVceI0Kq9SNqi4e+ZBjiAxHpq/dbidWblLbe84tkgmql9eZnYO/hSqxWaHkiWYUIw9YdH toTfIA2IN46HUC1iajwa3LrNxhEoxWJNU2NWMFI/INVU3yCYdGYDjnp0tn4uSDFMvC4SrGQ3u4 OYs= From: Julian Brown To: CC: , , Subject: [PATCH] OpenMP 5.1: WIP delimited (begin/end) 'declare variant' support Date: Thu, 11 Jan 2024 20:13:46 +0000 Message-ID: <20240111201346.566341-1-julian@codesourcery.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) To svr-ies-mbx-11.mgc.mentorg.com (139.181.222.11) X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1787826385524824796 X-GMAIL-MSGID: 1787826385524824796 This WIP patch adds preliminary and very lightly-tested support for the "begin declare variant" and "end declare variant" directives of OpenMP 5.1. I am posting it now for logistical reasons, rather than because I believe it is immediately ready for review. Some notes follow on the implementation as-is and on my understanding of what is still to be done. I have tried to build on top of the existing support for "declare variant" as far as possible. I also borrowed ideas and some of the implementation from function versioning support (but see below regarding template handling). There are mostly four aspects to the implementation: 1) Inside begin/end "declare variant" blocks, you may define specialised versions of functions with the same name as some base function in the main part of the program. The specialised functions must have some way of coexisting with other functions with the same name and prototype. 2) Begin/end "declare variant" blocks may be nested, and the OpenMP context is formed from a combination of the outer and inner levels (in some way that depends on the trait sets, etc. in use). 3) The same-named "declare variant" functions must be resolved to the right implementation in some given OpenMP context (including in parts of the program not enclosed by OpenMP directives). 4) The same-named, specialized versions of functions must have their names mangled by adding a context-specific suffix string. Taking these in turn: 1. "declare variant" overloading -------------------------------- In C++, this looks somewhat like function overloading, and that is also how the existing support for function multiversioning is implemented (https://gcc.gnu.org/wiki/FunctionMultiVersioning). Unfortunately this doesn't work with templates at present, so the attached patch has to do a little more in that regard. This works in simple cases, but I am not entirely convinced it is a complete solution yet -- in particular pt.cc:spec_hasher::equal always considers two functions with the "omp declare variant overload" attribute to be unequal. Should it be comparing the context? Similarly, I am not entirely convinced the logic in call.cc:joust is correct, though it seems to work for what I've tried it with so far. In C, which is so-far unimplemented, this scheme will not work so smoothly. An alternative might be to mangle function names earlier, or perhaps mangle with some temporary name ("foo$1", "foo$2", ...) early in compilation then re-mangle properly later (such functions would need to be considered "the same" in some places, but not in others. Details TBD of course). 2. Context merging ------------------ Contexts are merged eagerly as "begin/end declare variant" directives are being parsed: hence, a top-level directive will be merged with its immediate child, and the grandchild will be merged with the combined "child+parent" context. So, a function defined at some given point in the nesting hierarchy will decorated with a copy of the innermost context. Function decls encountered within the block (or nested block, ...) are annotated with a new attribute, "omp declare variant overload". This is similar to the existing "omp declare variant variant" attribute, but: - It records the entire context (as specified on the enclosing "begin declare variant" block(s)), not just the "contruct" set. - It is set immediately for each function within a "declare variant" block (see below). Some aspects of context merging need verification against the standard -- it's not always completely obvious how trait properties should be combined. 3. Function resolution ---------------------- The current patch tries to adjust function attributes so they are similar to those used by the existing non-delimited "declare variant" support. For that, if we have a fragment like this: T foo_x86 (T v) { ... } [...] #pragma omp declare variant (foo_x86) match(device={arch(x86)}) #pragma omp declare variant (foo_powerpc) match(device={arch(powerpc)}) #pragma omp declare variant (foo_aarch64) match(device={arch(aarch64)}) T foo (T v) { return v + 1; } The function "foo" is annotated with one "omp declare variant base" attribute for each named/specialized function version. The VALUE field of the attribute is a tuple specifying the *name* of the specialized function, the context, and a location. This name is then looked up in decl.cc:omp_declare_variant_finalize_one, a function call to the specialized function is synthesized (to mark that function as requiring template instantiation, amongst other things?), and a "omp declare variant variant" attribute with *just the construct part* of the context selector is added to the specialized function. For the new "overload"-type delimited "declare variant" functions, this works backwards. We know the full context for the variant function and we already have the FUNCTION_DECL for it (so we don't need to look up the name). So, we create a "omp declare variant base" attribute on the version of the function that *doesn't* have a "omp declare variant overload" attribute already, for each *other* variant that does have such an attribute. This makes the end result more-or-less the same as with the non-delimited "declare variant" function support, except the "omp declare variant base" attributes point to specialised functions with "omp declare variant overload" instead of "omp declare variant variant" attributes. This should work no matter what order the base function or specialized versions of it are defined in, but those bits aren't really heavily tested. Class members and template class members are supported also, but again, testing has only been done very lightly so far. 4. Function mangling -------------------- The last part is function mangling, which makes use of the tables that Sandra added as part of her context-selector abstraction rework patch series. This works for the set of new (very incomplete!) tests, but probably needs a little more work -- mostly to make sure that e.g. some unfortunately-named new architecture or ISA extension (and so forth) can't create name collisions. Some of the logic regarding "separator" string insertion is also a bit rough/buggy, I think. 5. AOB ------ There are too many "lookup_attribute" calls in places that are probably critical paths -- it might be better to change these to use another tree flag instead. Due to light testing and the variety of things one can specify with context selectors, some behaviour might well be completely bogus. E.g. memory-order traits probably do nothing. There are almost certainly circumstances where context matching is done in the wrong place or gets the wrong results. There is no support yet for "elision" -- removing code from the source program altogether between begin/end "declare variant" directives, following e.g. OpenMP 5.2 "7.5.5 begin declare variant Directive"): "If the context selector of a begin declare variant directive contains traits in the device or implementation set that are known never to be compatible with an OpenMP context during the current compilation, the preprocessed code that follows the begin declare variant directive up to its paired end directive is elided." ...as discussed at the GNU Tools Cauldron (2023). This has somewhat far-reaching consequences, depending perhaps on how one interprets "the current compilation", but the plan is/was to only perform elision for cases we know for sure can't be supported, like within FPGA-requiring contexts. There are various questions to be answered/FIXMEs to be addressed throughout the code, apart from things mentioned above already. Data structures could probably use some work (it's a bit wasteful in places at the moment, e.g. omp_combine_trait_sets). The patch inherits various problems with context selectors noted elsewhere (constants instead of expressions in some places, no dynamic dispatch, ...). Hopefully none of the choices made so far make those problems worse! Restrictions listed in e.g. OpenMP 5.2, "7.5.5 begin declare variant Directive", are not yet checked against. It would be nice if the "overload"-type declare variant functions worked smoothly alongside the existing function multiversion support, but since that's an extension, I don't suppose it's a requirement (maybe a "sorry" here and there might be needed). None of the tests are finished, they're just the bits and pieces I've been using whilst developing the patch. They mostly aren't even actually testing anything (except that the compiler doesn't crash, I suppose). 2024-01-11 Julian Brown gcc/c-family/ * c-attribs.cc (c_common_gnu_attributes): Add "omp declare variant overload". (handle_omp_declare_variant_attribute): Update comment. gcc/ * coverage.cc (coverage_begin_function): Support DECL_FUNCTION_OMP_VARIANT flag. * omp-expand.cc (expand_omp_target): Handle DECL_FUNCTION_OMP_VARIANT flag. * omp-general.cc (tree-hash-traits.h): Include. (kind_abbrevs, atomic_default_mem_order_abbrevs): New. (omp_ts_map): Populate attrib fields. (omp_context_name_list_prop_1): New helper function. (omp_context_name_list_prop): Adjust to use above. (omp_copy_trait_set): New function. (omp_trait_prop): New type. (omp_trait_prop_hash, omp_trait_prop_hash::hash, omp_trait_prop_hash::is_empty, omp_trait_prop_hash::is_deleted, omp_trait_prop_hash::equal, omp_trait_prop_hash::mark_empty, omp_trait_prop_set): New. (omp_gather_trait_sets, omp_combine_trait_properties, omp_combine_trait_sets): New functions. (omp_get_context_selector_list): Add prototype (FIXME!). (omp_merge_context_selectors): New function. (omp_mangle_obstack, omp_name_obstack, omp_name_base, write_char, write_string, omp_start_mangling, omp_finish_mangling, omp_init_mangle, omp_string_compare, omp_stringify_sorted_property_set, omp_mangle_context_selector): New. * omp-general.h (omp_merge_context_selectors, omp_init_mangle, omp_mangle_context_selector): Add prototypes. * omp-low.cc (create_omp_child_function): Copy DECL_FUNCTION_OMP_VARIANT flag. * omp-selectors.h (omp_ts_info): Add abbrev_name, prop_abbrevs fields. * tree-cfg.cc (dump_function_to_file): Support "omp declare variant overload". * tree-core.h (tree_function_decl): Add omp_variant flag. * tree.h (DECL_FUNCTION_OMP_VARIANT): New. gcc/cp/ * call.cc (joust): Support OpenMP variant overload functions. * class.cc (add_method): Allow class members to be OpenMP variant functions. (resolve_address_of_overloaded_function): Add assertion/TBD comment. * cp-tree.h (cp_omp_declare_variant_attr): New struct. (saved_scope): Add omp_declare_variant_attribute member. (maybe_omp_variant_functions): Add prototype. * decl.cc (decls_match): Support OpenMP variant overload functions. (maybe_mark_omp_variant_functions, maybe_omp_variant_functions): New functions. (duplicate_function_template_decls, duplicate_decls): Support OpenMP variant overload [template] functions. (omp_declare_variant_finalize_one): Support OpenMP variant overload functions. * decl2.cc (cplus_decl_attributes): Record contexts for OpenMP variant overload functions. * lex.cc (omp-general.h): Include. (cxx_init): Initialise OpenMP variant function mangling. * mangle.cc (omp-general.h): Include. (write_mangled_name): Support OpenMP variant function mangling. * parser.cc (cp_parser_omp_begin): Support parsing of "begin declare variant" directives. (cp_parser_omp_end): Similar, for "end declare variant". * pt.cc (spec_hasher::equal): Support OpenMP variant overload functions. (tsubst_function_decl): Add forward declaration. (tsubst_attribute): Support instantiating "omp declare variant base" attribute when info tuple represents a FUNCTION_DECL rather than an identifier (for "overload" variants), and the "omp declare variant overload" attribute. (tsubst_function_decl): Remove default argument from here (in forward declaration above now -- FIXME). Call tsubst_attributes for "omp declare variant base". * semantics.cc (finish_translation_unit): Check for mismatched begin/end declare variant directives. gcc/testsuite/ * g++.dg/gomp/delim-declare-variant-1.C: New (WIP) test. * g++.dg/gomp/delim-declare-variant-2.C: Likewise. * g++.dg/gomp/delim-declare-variant-3.C: Likewise. * g++.dg/gomp/delim-declare-variant-4.C: Likewise. * g++.dg/gomp/delim-declare-variant-5.C: Likewise. * g++.dg/gomp/delim-declare-variant-11.C: Likewise. * g++.dg/gomp/delim-declare-variant-12.C: Likewise. libgomp/ * testsuite/libgomp.c++/delim-declare-variant-6.C: New (WIP) test. * testsuite/libgomp.c++/delim-declare-variant-7.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-8.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-9.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-10.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-13.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-14.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-15.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-16.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-17.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-18.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-19.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-20.C: Likewise. * testsuite/libgomp.c++/delim-declare-variant-21.C: Likewise. --- gcc/c-family/c-attribs.cc | 6 +- gcc/coverage.cc | 1 + gcc/cp/call.cc | 23 + gcc/cp/class.cc | 15 + gcc/cp/cp-tree.h | 7 + gcc/cp/decl.cc | 176 ++++- gcc/cp/decl2.cc | 11 + gcc/cp/lex.cc | 3 + gcc/cp/mangle.cc | 16 + gcc/cp/parser.cc | 143 +++- gcc/cp/pt.cc | 41 +- gcc/cp/semantics.cc | 7 + gcc/omp-expand.cc | 2 + gcc/omp-general.cc | 705 ++++++++++++++++-- gcc/omp-general.h | 3 + gcc/omp-low.cc | 2 + gcc/omp-selectors.h | 2 + .../g++.dg/gomp/delim-declare-variant-1.C | 82 ++ .../g++.dg/gomp/delim-declare-variant-11.C | 20 + .../g++.dg/gomp/delim-declare-variant-12.C | 32 + .../g++.dg/gomp/delim-declare-variant-2.C | 26 + .../g++.dg/gomp/delim-declare-variant-3.C | 6 + .../g++.dg/gomp/delim-declare-variant-4.C | 10 + .../g++.dg/gomp/delim-declare-variant-5.C | 4 + gcc/tree-cfg.cc | 4 + gcc/tree-core.h | 3 +- gcc/tree.h | 7 + .../libgomp.c++/delim-declare-variant-10.C | 19 + .../libgomp.c++/delim-declare-variant-13.C | 51 ++ .../libgomp.c++/delim-declare-variant-14.C | 57 ++ .../libgomp.c++/delim-declare-variant-15.C | 55 ++ .../libgomp.c++/delim-declare-variant-16.C | 30 + .../libgomp.c++/delim-declare-variant-17.C | 29 + .../libgomp.c++/delim-declare-variant-18.C | 46 ++ .../libgomp.c++/delim-declare-variant-19.C | 41 + .../libgomp.c++/delim-declare-variant-20.C | 39 + .../libgomp.c++/delim-declare-variant-21.C | 56 ++ .../libgomp.c++/delim-declare-variant-6.C | 37 + .../libgomp.c++/delim-declare-variant-7.C | 38 + .../libgomp.c++/delim-declare-variant-8.C | 61 ++ .../libgomp.c++/delim-declare-variant-9.C | 54 ++ 41 files changed, 1886 insertions(+), 84 deletions(-) create mode 100644 gcc/testsuite/g++.dg/gomp/delim-declare-variant-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/delim-declare-variant-11.C create mode 100644 gcc/testsuite/g++.dg/gomp/delim-declare-variant-12.C create mode 100644 gcc/testsuite/g++.dg/gomp/delim-declare-variant-2.C create mode 100644 gcc/testsuite/g++.dg/gomp/delim-declare-variant-3.C create mode 100644 gcc/testsuite/g++.dg/gomp/delim-declare-variant-4.C create mode 100644 gcc/testsuite/g++.dg/gomp/delim-declare-variant-5.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-10.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-13.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-14.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-15.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-16.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-17.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-18.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-19.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-20.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-21.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-6.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-7.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-8.C create mode 100644 libgomp/testsuite/libgomp.c++/delim-declare-variant-9.C diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc index 854e987dc79..1fd90a1cfb9 100644 --- a/gcc/c-family/c-attribs.cc +++ b/gcc/c-family/c-attribs.cc @@ -521,6 +521,8 @@ const struct attribute_spec c_common_gnu_attributes[] = handle_omp_declare_variant_attribute, NULL }, { "omp declare variant variant", 0, -1, true, false, false, false, handle_omp_declare_variant_attribute, NULL }, + { "omp declare variant overload", 0, -1, true, false, false, false, + handle_omp_declare_variant_attribute, NULL }, { "simd", 0, 1, true, false, false, false, handle_simd_attribute, NULL }, { "omp declare target", 0, -1, true, false, false, false, @@ -4020,8 +4022,8 @@ handle_omp_declare_simd_attribute (tree *, tree, tree, int, bool *) return NULL_TREE; } -/* Handle an "omp declare variant {base,variant}" attribute; arguments as in - struct attribute_spec.handler. */ +/* Handle an "omp declare variant {base,variant,overload}" attribute; + arguments as in struct attribute_spec.handler. */ static tree handle_omp_declare_variant_attribute (tree *, tree, tree, int, bool *) diff --git a/gcc/coverage.cc b/gcc/coverage.cc index ad55f0f1909..0dc156d9c05 100644 --- a/gcc/coverage.cc +++ b/gcc/coverage.cc @@ -646,6 +646,7 @@ coverage_begin_function (unsigned lineno_checksum, unsigned cfg_checksum) (DECL_ASSEMBLER_NAME (current_function_decl))); gcov_write_unsigned (DECL_ARTIFICIAL (current_function_decl) && !DECL_FUNCTION_VERSIONED (current_function_decl) + && !DECL_FUNCTION_OMP_VARIANT (current_function_decl) && !DECL_LAMBDA_FUNCTION_P (current_function_decl)); gcov_write_filename (remap_profile_filename (startloc.file)); gcov_write_unsigned (startloc.line); diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc index c7efc5b077a..44848342b25 100644 --- a/gcc/cp/call.cc +++ b/gcc/cp/call.cc @@ -13093,6 +13093,29 @@ joust (struct z_candidate *cand1, struct z_candidate *cand2, bool warn, } } + if (TREE_CODE (cand1->fn) == FUNCTION_DECL + && DECL_FUNCTION_OMP_VARIANT (cand1->fn) + && TREE_CODE (cand2->fn) == FUNCTION_DECL + && DECL_FUNCTION_OMP_VARIANT (cand2->fn)) + { + tree f1 = TREE_TYPE (cand1->fn); + tree f2 = TREE_TYPE (cand2->fn); + tree p1 = TYPE_ARG_TYPES (f1); + tree p2 = TYPE_ARG_TYPES (f2); + if (compparms (p1, p2) + && same_type_p (TREE_TYPE (f1), TREE_TYPE (f2))) + { + tree a1 = lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (cand1->fn)); + tree a2 = lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (cand2->fn)); + if (!a1 && a2) + return 1; + else if (a1 && !a2) + return -1; + } + } + /* If the two function declarations represent the same function (this can happen with declarations in multiple scopes and arg-dependent lookup), arbitrarily choose one. But first make sure the default args we're diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc index 6fdb56abfb9..d9751a28014 100644 --- a/gcc/cp/class.cc +++ b/gcc/cp/class.cc @@ -1180,6 +1180,18 @@ add_method (tree type, tree method, bool via_using) && maybe_version_functions (method, fn, true)) continue; + if (flag_openmp + && (lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (method)) + || lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (fn)))) + { + bool record = (!DECL_FUNCTION_OMP_VARIANT (method) + || !DECL_FUNCTION_OMP_VARIANT (fn)); + maybe_omp_variant_functions (method, fn, record); + continue; + } + if (DECL_INHERITED_CTOR (method)) { if (!DECL_INHERITED_CTOR (fn)) @@ -8759,6 +8771,9 @@ resolve_address_of_overloaded_function (tree target_type, mark_versions_used (fn); } + // TBD. + gcc_assert (!DECL_FUNCTION_OMP_VARIANT (fn)); + /* If we're doing overload resolution purely for the purpose of determining conversion sequences, we should not consider the function used. If this conversion sequence is selected, the diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index cb89d372b23..c32ac60c2a9 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -1845,6 +1845,11 @@ struct GTY(()) cp_omp_begin_assumes_data { bool attr_syntax; }; +struct GTY(()) cp_omp_declare_variant_attr { + bool attr_syntax; + tree selector; +}; + /* Global state. */ struct GTY(()) saved_scope { @@ -1894,6 +1899,7 @@ struct GTY(()) saved_scope { hash_map *GTY((skip)) x_local_specializations; vec *omp_declare_target_attribute; vec *omp_begin_assumes; + vec *omp_declare_variant_attribute; struct saved_scope *prev; }; @@ -6937,6 +6943,7 @@ extern bool member_like_constrained_friend_p (tree); extern bool fns_correspond (tree, tree); extern int decls_match (tree, tree, bool = true); extern bool maybe_version_functions (tree, tree, bool); +extern void maybe_omp_variant_functions (tree, tree, bool); extern bool merge_default_template_args (tree, tree, bool); extern tree duplicate_decls (tree, tree, bool hiding = false, diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 1844c923b7f..4181f42d648 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -1180,6 +1180,21 @@ decls_match (tree newdecl, tree olddecl, bool record_versions /* = true */) || !DECL_FUNCTION_VERSIONED (olddecl))); return 0; } + if (flag_openmp + && types_match + && (lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (newdecl)) + || lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (olddecl)))) + { + if (record_versions) + { + bool record = (!DECL_FUNCTION_OMP_VARIANT (newdecl) + || !DECL_FUNCTION_OMP_VARIANT (olddecl)); + maybe_omp_variant_functions (newdecl, olddecl, record); + } + return 0; + } } else if (TREE_CODE (newdecl) == TEMPLATE_DECL) { @@ -1243,6 +1258,17 @@ maybe_mark_function_versioned (tree decl) } } +static void +maybe_mark_omp_variant_function (tree decl) +{ + if (!DECL_FUNCTION_OMP_VARIANT (decl)) + { + DECL_FUNCTION_OMP_VARIANT (decl) = 1; + if (DECL_ASSEMBLER_NAME_SET_P (decl)) + mangle_decl (decl); + } +} + /* NEWDECL and OLDDECL have identical signatures. If they are different versions adjust them and return true. If RECORD is set to true, record function versions. */ @@ -1277,6 +1303,116 @@ maybe_version_functions (tree newdecl, tree olddecl, bool record) return true; } +void +maybe_omp_variant_functions (tree newdecl, tree olddecl, bool record) +{ + maybe_mark_omp_variant_function (olddecl); + if (DECL_LOCAL_DECL_P (olddecl)) + { + /* Is this meaningful for these functions? */ + olddecl = DECL_LOCAL_DECL_ALIAS (olddecl); + maybe_mark_omp_variant_function (olddecl); + } + + maybe_mark_omp_variant_function (newdecl); + if (DECL_LOCAL_DECL_P (newdecl)) + { + /* again? */ + if (!DECL_LOCAL_DECL_ALIAS (newdecl)) + { + if (!DECL_LOCAL_DECL_ALIAS (newdecl)) + push_local_extern_decl_alias (newdecl); + newdecl = DECL_LOCAL_DECL_ALIAS (newdecl); + maybe_mark_omp_variant_function (newdecl); + } + } + + if (record) + cgraph_node::record_function_versions (olddecl, newdecl); + + if (record) + { + cgraph_node *oldn = cgraph_node::get (olddecl); + cgraph_node *newn = cgraph_node::get (newdecl); + cgraph_function_version_info *oldv = oldn->function_version (); + cgraph_function_version_info *first_ver, *ver; + + for (ver = oldv; ver; ver = ver->prev) + first_ver = ver; + + tree have_base = NULL_TREE; + hash_set linked_variants; + + const char *base_attr_name = "omp declare variant base"; + size_t base_attr_len = strlen (base_attr_name); + + for (ver = first_ver; ver; ver = ver->next) + { + tree fn = ver->this_node->decl; + tree variant = lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (fn)); + if (!variant) + { + /* There should only be one base function (non-variant) in the + list. FIXME: Might not be true for versioned functions. */ + gcc_assert (!have_base); + have_base = fn; + tree attrs = DECL_ATTRIBUTES (fn); + while (attrs) + { + tree attr = get_attribute_name (attrs); + size_t ident_len = IDENTIFIER_LENGTH (attr); + if (cmp_attribs (base_attr_name, base_attr_len, + IDENTIFIER_POINTER (attr), ident_len)) + { + /* This is a TREE_LIST node used as a tuple of: + (variant decl, context selector, location). */ + tree info = TREE_VALUE (attr); + tree linked_variant = TREE_PURPOSE (info); + /* Linked variant should appear only once. */ + gcc_assert (!linked_variants.contains (linked_variant)); + linked_variants.add (linked_variant); + } + attrs = TREE_CHAIN (attrs); + } + } + } + + /* We haven't seen the base function yet; do nothing. */ + if (!have_base) + return; + + for (ver = first_ver; ver; ver = ver->next) + { + tree fn = ver->this_node->decl; + if (fn == have_base) + continue; + if (linked_variants.contains (fn)) + continue; + /* The base function doesn't have a link to this variant yet. Add + one now. */ + tree attrib = lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (fn)); + gcc_assert (attrib); + tree ctx_selector = TREE_VALUE (attrib); + /* This location stuff is nonsense. FIXME. */ + location_t fn_loc = DECL_SOURCE_LOCATION (fn); + tree wrapped_loc = maybe_wrap_with_location (integer_zero_node, + fn_loc); + cp_id_kind idk = CP_ID_KIND_NONE; + wrapped_loc + = tree_cons (wrapped_loc, build_int_cst (integer_type_node, idk), + build_tree_list (wrapped_loc, integer_zero_node)); + tree info = tree_cons (fn, ctx_selector, wrapped_loc); + DECL_ATTRIBUTES (have_base) + = tree_cons (get_identifier (base_attr_name), info, + DECL_ATTRIBUTES (have_base)); + if (processing_template_decl) + ATTR_IS_DEPENDENT (DECL_ATTRIBUTES (have_base)) = 1; + } + } +} + /* If NEWDECL is `static' and an `extern' was seen previously, warn about it. OLDDECL is the previous declaration. @@ -1547,10 +1683,14 @@ duplicate_function_template_decls (tree newdecl, tree olddecl) tree oldres = DECL_TEMPLATE_RESULT (olddecl); /* Function template declarations can be differentiated by parameter and return type. */ - if (compparms (TYPE_ARG_TYPES (TREE_TYPE (oldres)), - TYPE_ARG_TYPES (TREE_TYPE (newres))) - && same_type_p (TREE_TYPE (TREE_TYPE (newdecl)), - TREE_TYPE (TREE_TYPE (olddecl)))) + if ((!lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (newres)) + && !lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (oldres))) + && compparms (TYPE_ARG_TYPES (TREE_TYPE (oldres)), + TYPE_ARG_TYPES (TREE_TYPE (newres))) + && same_type_p (TREE_TYPE (TREE_TYPE (newdecl)), + TREE_TYPE (TREE_TYPE (olddecl)))) { /* ... and also by their template-heads and requires-clauses. */ if (template_heads_equivalent_p (newdecl, olddecl) @@ -1970,6 +2110,8 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden) are not ambiguous. */ else if ((!DECL_FUNCTION_VERSIONED (newdecl) && !DECL_FUNCTION_VERSIONED (olddecl)) + && (!DECL_FUNCTION_OMP_VARIANT (newdecl) + && !DECL_FUNCTION_OMP_VARIANT (olddecl)) /* Let constrained hidden friends coexist for now, we'll check satisfaction later. */ && !member_like_constrained_friend_p (newdecl) @@ -3143,6 +3285,15 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden) cgraph_node::delete_function_version_by_decl (newdecl); } + if (TREE_CODE (newdecl) == FUNCTION_DECL + && DECL_FUNCTION_OMP_VARIANT (olddecl)) + { + /* Set the flag for newdecl so that it gets copied to olddecl. */ + DECL_FUNCTION_OMP_VARIANT (newdecl) = 1; + /* FIXME: Do we need this here? */ + //cgraph_node::delete_function_version_by_decl (newdecl); + } + if (TREE_CODE (newdecl) == FUNCTION_DECL) { int function_size; @@ -8168,6 +8319,20 @@ omp_declare_variant_finalize_one (tree decl, tree attr) if (idk == CP_ID_KIND_QUALIFIED) variant = finish_call_expr (variant, &args, /*disallow_virtual=*/true, koenig_p, tf_warning_or_error); + else if (idk == CP_ID_KIND_NONE + && DECL_NONSTATIC_MEMBER_FUNCTION_P (variant) + && CLASS_TYPE_P (DECL_CONTEXT (decl))) + { + tree saved_ccp = current_class_ptr; + tree saved_ccr = current_class_ref; + current_class_ptr = NULL_TREE; + current_class_ref = NULL_TREE; + inject_this_parameter (DECL_CONTEXT (decl), TYPE_UNQUALIFIED); + variant = finish_call_expr (variant, &args, /*disallow_virtual=*/false, + koenig_p, tf_warning_or_error); + current_class_ptr = saved_ccp; + current_class_ref = saved_ccr; + } else variant = finish_call_expr (variant, &args, /*disallow_virtual=*/false, koenig_p, tf_warning_or_error); @@ -8194,6 +8359,9 @@ omp_declare_variant_finalize_one (tree decl, tree attr) error_at (varid_loc, "variant %qD is a built-in", variant); return true; } + else if (lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (variant))) + TREE_PURPOSE (TREE_VALUE (attr)) = variant; else { tree construct diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc index bee84879023..df0ef66d14a 100644 --- a/gcc/cp/decl2.cc +++ b/gcc/cp/decl2.cc @@ -1774,6 +1774,17 @@ cplus_decl_attributes (tree *decl, tree attributes, int flags) } } + if (vec_safe_length (scope_chain->omp_declare_variant_attribute) + && TREE_CODE (*decl) == FUNCTION_DECL) + { + int length = scope_chain->omp_declare_variant_attribute->length (); + cp_omp_declare_variant_attr a + = (*scope_chain->omp_declare_variant_attribute)[length - 1]; + tree ctx = copy_list (a.selector); + attributes = tree_cons (get_identifier ("omp declare variant overload"), + ctx, attributes); + } + tree late_attrs = NULL_TREE; if (processing_template_decl) { diff --git a/gcc/cp/lex.cc b/gcc/cp/lex.cc index 64bcfb18196..672be7e3616 100644 --- a/gcc/cp/lex.cc +++ b/gcc/cp/lex.cc @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3. If not see #include "gcc-rich-location.h" #include "cp-name-hint.h" #include "langhooks.h" +#include "omp-general.h" static int interface_strcmp (const char *); static void init_cp_pragma (void); @@ -328,6 +329,8 @@ cxx_init (void) init_cp_semantics (); init_operators (); init_method (); + if (flag_openmp) + omp_init_mangle (); current_function_decl = NULL; diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc index 0684f0e6038..073d79c2644 100644 --- a/gcc/cp/mangle.cc +++ b/gcc/cp/mangle.cc @@ -55,6 +55,7 @@ along with GCC; see the file COPYING3. If not see #include "stor-layout.h" #include "flags.h" #include "attribs.h" +#include "omp-general.h" /* Debugging support. */ @@ -828,6 +829,21 @@ write_mangled_name (const tree decl, bool top_level) else gcc_unreachable (); } + + if (flag_openmp + && TREE_CODE (decl) == FUNCTION_DECL + && DECL_FUNCTION_OMP_VARIANT (decl)) + { + tree ctx; + if ((ctx = lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (decl)))) + { + write_string (JOIN_STR "ompvariant"); + const char *append + = omp_mangle_context_selector (JOIN_STR, TREE_VALUE (ctx)); + write_string (append); + } + } } /* Returns true if the return type of DECL is part of its signature, and diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 65aeb83758b..95f7381bab2 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -48440,6 +48440,62 @@ cp_parser_omp_begin (cp_parser *parser, cp_token *pragma_tok) = { in_omp_attribute_pragma, device_type, indirect }; vec_safe_push (scope_chain->omp_declare_target_attribute, a); } + else if (strcmp (p, "variant") == 0) + { + cp_lexer_consume_token (parser->lexer); + const char *clause = ""; + matching_parens parens; + location_t match_loc = cp_lexer_peek_token (parser->lexer)->location; + if (cp_lexer_next_token_is (parser->lexer, CPP_NAME)) + { + tree id = cp_lexer_peek_token (parser->lexer)->u.value; + clause = IDENTIFIER_POINTER (id); + } + if (strcmp (clause, "match") != 0) + { + cp_parser_error (parser, "expected %"); + cp_parser_skip_to_pragma_eol (parser, pragma_tok); + return; + } + + cp_lexer_consume_token (parser->lexer); + + if (!parens.require_open (parser)) + { + cp_parser_skip_to_pragma_eol (parser, pragma_tok); + return; + } + + tree ctx = cp_parser_omp_context_selector_specification (parser, + true); + if (ctx == error_mark_node) + { + cp_parser_skip_to_pragma_eol (parser, pragma_tok); + return; + } + + if (vec_safe_length (scope_chain->omp_declare_variant_attribute) > 0) + { + int length + = scope_chain->omp_declare_variant_attribute->length (); + cp_omp_declare_variant_attr a + = (*scope_chain->omp_declare_variant_attribute)[length - 1]; + tree outer_ctx = a.selector; + ctx = omp_merge_context_selectors (outer_ctx, ctx); + } + + ctx = omp_check_context_selector (match_loc, ctx); + + if (ctx != error_mark_node) + { + cp_omp_declare_variant_attr a + = { parser->lexer->in_omp_attribute_pragma, ctx }; + vec_safe_push (scope_chain->omp_declare_variant_attribute, a); + } + + parens.require_close (parser); + cp_parser_skip_to_pragma_eol (parser, pragma_tok); + } else { cp_parser_error (parser, "expected %"); @@ -48486,41 +48542,70 @@ cp_parser_omp_end (cp_parser *parser, cp_token *pragma_tok) p = IDENTIFIER_POINTER (id); } if (strcmp (p, "target") == 0) - cp_lexer_consume_token (parser->lexer); + { + cp_lexer_consume_token (parser->lexer); + cp_parser_require_pragma_eol (parser, pragma_tok); + if (!vec_safe_length (scope_chain->omp_declare_target_attribute)) + error_at (pragma_tok->location, + "%<#pragma omp end declare target%> without " + "corresponding %<#pragma omp declare target%> or " + "%<#pragma omp begin declare target%>"); + else + { + cp_omp_declare_target_attr + a = scope_chain->omp_declare_target_attribute->pop (); + if (a.attr_syntax != in_omp_attribute_pragma) + { + if (a.attr_syntax) + error_at (pragma_tok->location, + "%qs in attribute syntax terminated " + "with %qs in pragma syntax", + a.device_type >= 0 ? "begin declare target" + : "declare target", + "end declare target"); + else + error_at (pragma_tok->location, + "%qs in pragma syntax terminated " + "with %qs in attribute syntax", + a.device_type >= 0 ? "begin declare target" + : "declare target", + "end declare target"); + } + } + } + else if (strcmp (p, "variant") == 0) + { + cp_lexer_consume_token (parser->lexer); + cp_parser_require_pragma_eol (parser, pragma_tok); + if (!vec_safe_length (scope_chain->omp_declare_variant_attribute)) + error_at (pragma_tok->location, + "%<#pragma omp end declare variant%> without " + "corresponding %<#pragma omp begin declare variant%>"); + else + { + cp_omp_declare_variant_attr + a = scope_chain->omp_declare_variant_attribute->pop (); + if (a.attr_syntax != in_omp_attribute_pragma) + { + if (a.attr_syntax) + error_at (pragma_tok->location, + "% in attribute syntax " + "terminated with % in " + "pragma syntax"); + else + error_at (pragma_tok->location, + "% in pragma syntax " + "terminated with % in " + "attribute syntax"); + } + } + } else { cp_parser_error (parser, "expected %"); cp_parser_skip_to_pragma_eol (parser, pragma_tok); return; } - cp_parser_require_pragma_eol (parser, pragma_tok); - if (!vec_safe_length (scope_chain->omp_declare_target_attribute)) - error_at (pragma_tok->location, - "%<#pragma omp end declare target%> without corresponding " - "%<#pragma omp declare target%> or " - "%<#pragma omp begin declare target%>"); - else - { - cp_omp_declare_target_attr - a = scope_chain->omp_declare_target_attribute->pop (); - if (a.attr_syntax != in_omp_attribute_pragma) - { - if (a.attr_syntax) - error_at (pragma_tok->location, - "%qs in attribute syntax terminated " - "with %qs in pragma syntax", - a.device_type >= 0 ? "begin declare target" - : "declare target", - "end declare target"); - else - error_at (pragma_tok->location, - "%qs in pragma syntax terminated " - "with %qs in attribute syntax", - a.device_type >= 0 ? "begin declare target" - : "declare target", - "end declare target"); - } - } } else if (strcmp (p, "assumes") == 0) { diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 42d020b105d..eb3672dd774 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -1669,6 +1669,18 @@ spec_hasher::equal (spec_entry *e1, spec_entry *e2) tree c2 = e2->spec ? get_constraints (e2->spec) : NULL_TREE; equal = equivalent_constraints (c1, c2); } + if (equal && flag_openmp) + { + tree r1 = DECL_TEMPLATE_RESULT (e1->tmpl); + tree r2 = DECL_TEMPLATE_RESULT (e2->tmpl); + if (TREE_CODE (r1) == FUNCTION_DECL + && TREE_CODE (r2) == FUNCTION_DECL + && (lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (r1)) + || lookup_attribute ("omp declare variant overload", + DECL_ATTRIBUTES (r2)))) + equal = false; + } --processing_template_decl; --comparing_dependent_aliases; --comparing_specializations; @@ -11775,6 +11787,10 @@ tsubst_contract_attributes (tree decl, tree args, tsubst_flags_t complain, tree DECL_ATTRIBUTES (decl) = list; } +static tree +tsubst_function_decl (tree t, tree args, tsubst_flags_t complain, + tree lambda_fntype, bool use_spec_table = true); + /* Instantiate a single dependent attribute T (a TREE_LIST), and return either T or a new TREE_LIST, possibly a chain in the case of a pack expansion. */ @@ -11812,9 +11828,16 @@ tsubst_attribute (tree t, tree *decl_p, tree args, && is_attribute_p ("omp declare variant base", get_attribute_name (t))) { - ++cp_unevaluated_operand; - tree varid = tsubst_expr (TREE_PURPOSE (val), args, complain, in_decl); - --cp_unevaluated_operand; + tree varid; + if (TREE_CODE (TREE_PURPOSE (val)) == FUNCTION_DECL) + varid = tsubst_function_decl (TREE_PURPOSE (val), args, complain, + NULL_TREE); + else + { + ++cp_unevaluated_operand; + varid = tsubst_expr (TREE_PURPOSE (val), args, complain, in_decl); + --cp_unevaluated_operand; + } tree chain = TREE_CHAIN (val); location_t match_loc = cp_expr_loc_or_input_loc (TREE_PURPOSE (chain)); tree ctx = copy_list (TREE_VALUE (val)); @@ -11908,6 +11931,10 @@ tsubst_attribute (tree t, tree *decl_p, tree args, } val = tree_cons (varid, ctx, chain); } + else if (flag_openmp + && is_attribute_p ("omp declare variant overload", + get_attribute_name (t))) + ; /* If the first attribute argument is an identifier, don't pass it through tsubst. Attributes like mode, format, cleanup and several target specific attributes expect it @@ -14329,7 +14356,7 @@ maybe_rebuild_function_decl_type (tree decl) static tree tsubst_function_decl (tree t, tree args, tsubst_flags_t complain, - tree lambda_fntype, bool use_spec_table = true) + tree lambda_fntype, bool use_spec_table /*= true*/) { tree gen_tmpl = NULL_TREE, argvec = NULL_TREE; hashval_t hash = 0; @@ -14630,7 +14657,11 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain, if (flag_openmp) if (tree attr = lookup_attribute ("omp declare variant base", DECL_ATTRIBUTES (r))) - omp_declare_variant_finalize (r, attr); + { + DECL_ATTRIBUTES (r) = tsubst_attributes (DECL_ATTRIBUTES (r), args, + complain, r); + omp_declare_variant_finalize (r, attr); + } return r; } diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 6634acfda3f..e70feb85ab5 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -3467,6 +3467,13 @@ finish_translation_unit (void) "#pragma omp end declare target"); vec_safe_truncate (scope_chain->omp_declare_target_attribute, 0); } + if (vec_safe_length (scope_chain->omp_declare_variant_attribute)) + { + if (!errorcount) + error ("% without corresponding " + "%"); + vec_safe_truncate (scope_chain->omp_declare_variant_attribute, 0); + } if (vec_safe_length (scope_chain->omp_begin_assumes)) { if (!errorcount) diff --git a/gcc/omp-expand.cc b/gcc/omp-expand.cc index 8281ec67e00..45c68fa72bc 100644 --- a/gcc/omp-expand.cc +++ b/gcc/omp-expand.cc @@ -10099,6 +10099,8 @@ expand_omp_target (struct omp_region *region) = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl); DECL_FUNCTION_VERSIONED (child_fn2) = DECL_FUNCTION_VERSIONED (current_function_decl); + DECL_FUNCTION_OMP_VARIANT (child_fn2) + = DECL_FUNCTION_OMP_VARIANT (current_function_decl); fn2_node = cgraph_node::get_create (child_fn2); fn2_node->offloadable = 1; diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc index 361630bcaed..e908adde0b4 100644 --- a/gcc/omp-general.cc +++ b/gcc/omp-general.cc @@ -45,6 +45,7 @@ along with GCC; see the file COPYING3. If not see #include "data-streamer.h" #include "streamer-hooks.h" #include "opts.h" +#include "tree-hash-traits.h" enum omp_requires omp_requires_mask; @@ -1125,6 +1126,8 @@ const char *omp_tss_map[] = /* Arrays of property candidates must be null-terminated. */ static const char *const kind_properties[] = { "host", "nohost", "cpu", "gpu", "fpga", "any", NULL }; +static const char *const kind_abbrevs[] = + { "h", "n", "c", "g", "f", "" }; static const char *const vendor_properties[] = { "amd", "arm", "bsc", "cray", "fujitsu", "gnu", "hpe", "ibm", "intel", "llvm", "nvidia", "pgi", "ti", "unknown", NULL }; @@ -1132,111 +1135,112 @@ static const char *const extension_properties[] = { NULL }; static const char *const atomic_default_mem_order_properties[] = { "seq_cst", "relaxed", "acq_rel", "acquire", "release", NULL }; +static const char *const atomic_default_mem_order_abbrevs[] = + { "sc", "rx", "ar", "ac", "re" }; struct omp_ts_info omp_ts_map[] = { - { "kind", + { "kind", "k", (1 << OMP_TRAIT_SET_DEVICE) | (1 << OMP_TRAIT_SET_TARGET_DEVICE), OMP_TRAIT_PROPERTY_NAME_LIST, false, - kind_properties + kind_properties, kind_abbrevs }, - { "isa", + { "isa", "i", (1 << OMP_TRAIT_SET_DEVICE) | (1 << OMP_TRAIT_SET_TARGET_DEVICE), OMP_TRAIT_PROPERTY_NAME_LIST, false, - NULL + NULL, NULL }, - { "arch", + { "arch", "a", (1 << OMP_TRAIT_SET_DEVICE) | (1 << OMP_TRAIT_SET_TARGET_DEVICE), OMP_TRAIT_PROPERTY_NAME_LIST, false, - NULL + NULL, NULL }, - { "device_num", + { "device_num", NULL, (1 << OMP_TRAIT_SET_TARGET_DEVICE), OMP_TRAIT_PROPERTY_EXPR, false, - NULL + NULL, NULL }, - { "vendor", + { "vendor", "v", (1 << OMP_TRAIT_SET_IMPLEMENTATION), OMP_TRAIT_PROPERTY_NAME_LIST, true, - vendor_properties, + vendor_properties, NULL }, - { "extension", + { "extension", "e", (1 << OMP_TRAIT_SET_IMPLEMENTATION), OMP_TRAIT_PROPERTY_NAME_LIST, true, - extension_properties, + extension_properties, NULL }, - { "atomic_default_mem_order", + { "atomic_default_mem_order", "a", (1 << OMP_TRAIT_SET_IMPLEMENTATION), OMP_TRAIT_PROPERTY_ID, true, atomic_default_mem_order_properties, + atomic_default_mem_order_abbrevs }, - { "requires", + { "requires", "r", (1 << OMP_TRAIT_SET_IMPLEMENTATION), OMP_TRAIT_PROPERTY_CLAUSE_LIST, true, - NULL + NULL, NULL }, - { "unified_address", + { "unified_address", "una", (1 << OMP_TRAIT_SET_IMPLEMENTATION), OMP_TRAIT_PROPERTY_NONE, true, - NULL + NULL, NULL }, - { "unified_shared_memory", + { "unified_shared_memory", "usm", (1 << OMP_TRAIT_SET_IMPLEMENTATION), OMP_TRAIT_PROPERTY_NONE, true, - NULL + NULL, NULL }, - { "dynamic_allocators", + { "dynamic_allocators", "dna", (1 << OMP_TRAIT_SET_IMPLEMENTATION), OMP_TRAIT_PROPERTY_NONE, true, - NULL + NULL, NULL }, - { "reverse_offload", + { "reverse_offload", "rvo", (1 << OMP_TRAIT_SET_IMPLEMENTATION), OMP_TRAIT_PROPERTY_NONE, true, - NULL + NULL, NULL }, - { "condition", + { "condition", NULL, (1 << OMP_TRAIT_SET_USER), OMP_TRAIT_PROPERTY_EXPR, true, - NULL + NULL, NULL }, - { "target", + { "target", "ta", (1 << OMP_TRAIT_SET_CONSTRUCT), OMP_TRAIT_PROPERTY_NONE, false, - NULL + NULL, NULL }, - { "teams", + { "teams", "te", (1 << OMP_TRAIT_SET_CONSTRUCT), OMP_TRAIT_PROPERTY_NONE, false, - NULL + NULL, NULL }, - { "parallel", + { "parallel", "pa", (1 << OMP_TRAIT_SET_CONSTRUCT), OMP_TRAIT_PROPERTY_NONE, false, - NULL + NULL, NULL }, - { "for", + { "for", "fo", (1 << OMP_TRAIT_SET_CONSTRUCT), OMP_TRAIT_PROPERTY_NONE, false, - NULL + NULL, NULL }, - { "simd", + { "simd", "si", (1 << OMP_TRAIT_SET_CONSTRUCT), OMP_TRAIT_PROPERTY_CLAUSE_LIST, false, - NULL + NULL, NULL }, - { NULL, 0, OMP_TRAIT_PROPERTY_NONE, false, NULL } /* OMP_TRAIT_LAST */ + /* OMP_TRAIT_LAST */ + { NULL, NULL, 0, OMP_TRAIT_PROPERTY_NONE, false, NULL, NULL } }; +/* Helper function for omp_context_name_list_prop. Return string from either + an identifier or a string constant. */ -/* Return a name from PROP, a property in selectors accepting - name lists. */ - -const char * -omp_context_name_list_prop (tree prop) +static const char * +omp_context_name_list_prop_1 (tree val) { - gcc_assert (OMP_TP_NAME (prop) == OMP_TP_NAMELIST_NODE); - tree val = OMP_TP_VALUE (prop); switch (TREE_CODE (val)) { case IDENTIFIER_NODE: @@ -1254,6 +1258,16 @@ omp_context_name_list_prop (tree prop) } } +/* Return a name from PROP, a property in selectors accepting + name lists. */ + +const char * +omp_context_name_list_prop (tree prop) +{ + gcc_assert (OMP_TP_NAME (prop) == OMP_TP_NAMELIST_NODE); + return omp_context_name_list_prop_1 (OMP_TP_VALUE (prop)); +} + /* Diagnose errors in an OpenMP context selector, return CTX if it is correct or error_mark_node otherwise. */ @@ -1797,6 +1811,611 @@ omp_context_selector_matches (tree ctx) return ret; } +static void +omp_copy_trait_set (tree from_ts, tree *to_ts) +{ + for (tree ts = from_ts; ts; ts = TREE_CHAIN (ts)) + *to_ts = make_trait_selector (OMP_TS_CODE (ts), OMP_TS_SCORE (ts), + OMP_TS_PROPERTIES (ts), *to_ts); +} + +/* Trait properties are stored in a tree list, and for namelist properties, + may use either an IDENTIFIER_POINTER or STRING_CST for their value. We + want these to be considered equal if they represent the same string (e.g. + quoted or not in the source program). Arrange a hash table to allow us to + do this. */ + +typedef std::pair omp_trait_prop; + +struct omp_trait_prop_hash : typed_noop_remove +{ + typedef omp_trait_prop value_type; + typedef omp_trait_prop compare_type; + + static inline hashval_t hash (omp_trait_prop); + static const bool empty_zero_p = true; + static inline bool is_empty (omp_trait_prop); + static inline bool is_deleted (omp_trait_prop); + static inline bool equal (const omp_trait_prop &, const omp_trait_prop &); + static inline void mark_empty (omp_trait_prop &); +}; + +inline hashval_t +omp_trait_prop_hash::hash (omp_trait_prop t) +{ + if (t.first == OMP_TP_NAMELIST_NODE) + return htab_hash_string (omp_context_name_list_prop_1 (t.second)); + if (t.first && t.second) + return iterative_hash_expr (t.second, iterative_hash_expr (t.first, 0)); + else if (t.first) + return iterative_hash_expr (t.first, 0); + else if (t.second) + return iterative_hash_expr (t.second, 0); + else + gcc_unreachable (); +} + +inline bool +omp_trait_prop_hash::is_empty (omp_trait_prop t) +{ + return !t.first && !t.second; +} + +inline bool +omp_trait_prop_hash::is_deleted (omp_trait_prop) +{ + return false; +} + +inline bool +omp_trait_prop_hash::equal (const omp_trait_prop &a, const omp_trait_prop &b) +{ + if (a.first == OMP_TP_NAMELIST_NODE + && b.first == OMP_TP_NAMELIST_NODE) + { + const char *a_name = omp_context_name_list_prop_1 (a.second); + const char *b_name = omp_context_name_list_prop_1 (b.second); + return strcmp (a_name, b_name) == 0; + } + + if (a.first && a.second && b.first && b.second) + return operand_equal_p (a.first, b.first) + && operand_equal_p (a.second, b.second); + else if (a.first && b.first) + return !a.second && !b.second && a.first == b.first; + else if (a.second && b.second) + return !a.first && !b.first && operand_equal_p (a.second, b.second); + else + return false; +} + +inline void +omp_trait_prop_hash::mark_empty (omp_trait_prop &t) +{ + t.first = NULL_TREE; + t.second = NULL_TREE; +} + +typedef hash_set omp_trait_prop_set; + +static void +omp_gather_trait_sets (omp_trait_prop_set sets[], tree scores[], tree inlist) +{ + for (tree ts = inlist; ts; ts = TREE_CHAIN (ts)) + { + enum omp_ts_code ts_code = OMP_TS_CODE (ts); + + if (scores && omp_ts_map[ts_code].allow_score) + scores[ts_code] = OMP_TS_SCORE (ts); + + for (tree tp = OMP_TS_PROPERTIES (ts); tp; tp = TREE_CHAIN (tp)) + { + tree name = OMP_TP_NAME (tp); + tree value = OMP_TP_VALUE (tp); + sets[ts_code].add (std::make_pair (name, value)); + } + } +} + +/* We might have things like: + + outer: arch(avx2,sse4.1,3dnow) + inner: arch(sse4.1) + + outer: kind(any) kind(gpu,cpu) kind(gpu,cpu) + inner: kind(gpu) kind(gpu) kind(any) + + We want to add the intersection of the inner and outer sets -- most of the + time that just means adding the inner set, but there are some special cases + that need handling (e.g. "any" on inner, but some more restrictive kind on + outer). */ + +static void +omp_combine_trait_properties (omp_trait_prop_set &outer_props, + omp_trait_prop_set &inner_props, + omp_tss_code tss_code, omp_ts_code ts_code, + tree *to_list, tree score) +{ + tree anyt = get_identifier ("any"); + tree combined_traits = NULL_TREE; + + std::pair any = std::make_pair (OMP_TP_NAMELIST_NODE, anyt); + + if (outer_props.is_empty ()) + for (std::pair e : inner_props) + combined_traits = make_trait_property (e.first, e.second, + combined_traits); + else if (inner_props.is_empty () + || (tss_code == OMP_TRAIT_SET_DEVICE + && ts_code == OMP_TRAIT_DEVICE_KIND + && inner_props.contains (any))) + for (std::pair e : outer_props) + combined_traits = make_trait_property (e.first, e.second, + combined_traits); + else + for (std::pair e : inner_props) + if (tss_code == OMP_TRAIT_SET_IMPLEMENTATION + && ts_code == OMP_TRAIT_IMPLEMENTATION_ADMO + && !outer_props.contains (e)) + error ("cannot combine nested % properties"); + else if (outer_props.contains (e) + || (tss_code == OMP_TRAIT_SET_DEVICE + && ts_code == OMP_TRAIT_DEVICE_KIND + && outer_props.contains (any))) + combined_traits = make_trait_property (e.first, e.second, + combined_traits); + + if (combined_traits) + *to_list = make_trait_selector (ts_code, score, + nreverse (combined_traits), *to_list); +} + +static void +omp_combine_trait_sets (tree outer_ts, tree inner_ts, omp_tss_code tss_code, + tree *to_list) +{ + unsigned HOST_WIDE_INT used_traits = 0; + + for (tree t = outer_ts; t; t = TREE_CHAIN (t)) + used_traits |= 1 << OMP_TS_CODE (t); + for (tree t = inner_ts; t; t = TREE_CHAIN (t)) + used_traits |= 1 << OMP_TS_CODE (t); + + omp_trait_prop_set outer_sets[OMP_TRAIT_LAST]; + omp_trait_prop_set inner_sets[OMP_TRAIT_LAST]; + + tree outer_scores[OMP_TRAIT_LAST] = { }; + tree inner_scores[OMP_TRAIT_LAST] = { }; + + omp_gather_trait_sets (outer_sets, outer_scores, outer_ts); + omp_gather_trait_sets (inner_sets, inner_scores, inner_ts); + + for (unsigned i = 0; i < OMP_TRAIT_LAST; i++) + if (used_traits & (1 << i)) + { + omp_ts_code ts_code = static_cast(i); + /* We can technically have a score on both the inner and outer context + selectors (e.g. for the "implementation" trait set selector). Just + ignore the outer one. */ + tree use_score = inner_scores[i] ? inner_scores[i] : outer_scores[i]; + + switch (omp_ts_map[ts_code].tp_type) + { + case OMP_TRAIT_PROPERTY_ID: + case OMP_TRAIT_PROPERTY_NAME_LIST: + omp_combine_trait_properties (outer_sets[i], inner_sets[i], + tss_code, ts_code, to_list, + use_score); + break; + + case OMP_TRAIT_PROPERTY_EXPR: + /* The user trait set selector just contains the 'condition' trait. + No attempt is made to combine conditions in outer/inner context + selectors here -- instead, the outer condition is just + dropped. */ + if (ts_code == OMP_TRAIT_USER_CONDITION && inner_ts && outer_ts) + warning (OPT_Wopenmp, "ignoring condition on enclosing context " + "selector"); + + if (inner_ts) + omp_copy_trait_set (inner_ts, to_list); + else if (outer_ts) + omp_copy_trait_set (outer_ts, to_list); + break; + + case OMP_TRAIT_PROPERTY_NONE: + switch (ts_code) + { + case OMP_TRAIT_IMPLEMENTATION_UNIFIED_ADDRESS: + case OMP_TRAIT_IMPLEMENTATION_UNIFIED_SHARED_MEMORY: + case OMP_TRAIT_IMPLEMENTATION_DYNAMIC_ALLOCATORS: + case OMP_TRAIT_IMPLEMENTATION_REVERSE_OFFLOAD: + /* For these ones, we want the union of the features listed on + the outer and inner context selectors. We already checked + the trait has been mentioned in one or the other selector + via 'used_traits', so add it here. */ + *to_list = make_trait_selector (ts_code, NULL_TREE, + NULL_TREE, *to_list); + break; + + default: + sorry ("don't know how to merge nested context selectors"); + } + break; + + default: + gcc_unreachable (); + } + } +} + +extern tree omp_get_context_selector_list (tree ctx, enum omp_tss_code set); + +tree +omp_merge_context_selectors (tree outer_ctx, tree inner_ctx) +{ + tree merged_ctx = NULL_TREE; + + for (unsigned i = OMP_TRAIT_SET_CONSTRUCT; i != OMP_TRAIT_SET_LAST; i++) + { + omp_tss_code tss_code = static_cast(i); + tree outer_ts = omp_get_context_selector_list (outer_ctx, tss_code); + tree inner_ts = omp_get_context_selector_list (inner_ctx, tss_code); + tree merged_ts = NULL_TREE; + + switch (tss_code) + { + case OMP_TRAIT_SET_CONSTRUCT: + if (outer_ts) + omp_copy_trait_set (outer_ts, &merged_ts); + if (inner_ts) + omp_copy_trait_set (inner_ts, &merged_ts); + break; + + case OMP_TRAIT_SET_DEVICE: + case OMP_TRAIT_SET_TARGET_DEVICE: + case OMP_TRAIT_SET_IMPLEMENTATION: + case OMP_TRAIT_SET_USER: + if (outer_ts && inner_ts) + omp_combine_trait_sets (outer_ts, inner_ts, tss_code, &merged_ts); + else if (outer_ts) + omp_copy_trait_set (outer_ts, &merged_ts); + else if (inner_ts) + omp_copy_trait_set (inner_ts, &merged_ts); + break; + + default: + gcc_unreachable (); + } + + if (merged_ts) + merged_ctx = make_trait_set_selector (tss_code, nreverse (merged_ts), + merged_ctx); + } + + return nreverse (merged_ctx); +} + +/* OpenMP "declare variant" mangling obstack bits. These are essentially + copied from cp/mangle.cc, but we'll need them for C also, so they need to go + here. */ + +static struct obstack *omp_mangle_obstack; +static struct obstack omp_name_obstack; +static void *omp_name_base; + +/* Append a single character to the end of the mangled + representation. */ +#define write_char(CHAR) \ + obstack_1grow (omp_mangle_obstack, (CHAR)) + +/* Append a NUL-terminated string to the end of the mangled + representation. */ +#define write_string(STRING) \ + obstack_grow (omp_mangle_obstack, (STRING), strlen (STRING)) + +static void +omp_start_mangling (void) +{ + obstack_free (&omp_name_obstack, omp_name_base); + omp_mangle_obstack = &omp_name_obstack; + omp_name_base = obstack_alloc (&omp_name_obstack, 0); +} + +static const char * +omp_finish_mangling (void) +{ + write_char ('\0'); + return (const char *) obstack_finish (omp_mangle_obstack); +} + +void +omp_init_mangle (void) +{ + gcc_obstack_init (&omp_name_obstack); + omp_name_base = obstack_alloc (&omp_name_obstack, 0); +} + +static int +omp_string_compare (const void *a, const void *b) +{ + const char *as = *(char * const *) a; + const char *bs = *(char * const *) b; + return strcmp (as, bs); +} + +static void +omp_stringify_sorted_property_set (const char *sep, omp_ts_code ts_code, + omp_trait_prop_set &props) +{ + vec names = vNULL; + hash_map abbrevs; + omp_tp_type tp_type = omp_ts_map[ts_code].tp_type; + + /* This function only really works for name lists. */ + gcc_assert (tp_type == OMP_TRAIT_PROPERTY_NAME_LIST + || tp_type == OMP_TRAIT_PROPERTY_ID); + + if (omp_ts_map[ts_code].prop_abbrevs) + for (int i = 0; omp_ts_map[ts_code].valid_properties[i]; i++) + abbrevs.put (omp_ts_map[ts_code].valid_properties[i], + omp_ts_map[ts_code].prop_abbrevs[i]); + + for (auto prop : props) + { + if (prop.first == OMP_TP_NAMELIST_NODE) + { + const char *name = omp_context_name_list_prop_1 (prop.second); + if (omp_ts_map[ts_code].prop_abbrevs) + { + const char **repl = abbrevs.get (name); + if (repl) + names.safe_push (*repl); + else + error ("missing abbreviation for %qs", name); + } + else + names.safe_push (name); + } + else if (tp_type == OMP_TRAIT_PROPERTY_ID + && TREE_CODE (prop.first) == IDENTIFIER_NODE) + { + const char *name = IDENTIFIER_POINTER (prop.first); + if (omp_ts_map[ts_code].prop_abbrevs) + { + const char **repl = abbrevs.get (name); + if (repl) + names.safe_push (*repl); + else + error ("missing abbreviation for %qs", name); + } + else + names.safe_push (name); + } + else + gcc_unreachable (); + } + + names.qsort (omp_string_compare); + + const char *use_sep = ""; + + for (auto &n : names) + { + bool needs_escaping = false; + for (int i = 0; n[i]; i++) + if (!ISALPHA (n[i]) && !ISDIGIT (n[i]) && n[i] != '_') + needs_escaping = true; + if (needs_escaping) + { + write_string (use_sep); + for (int i = 0; n[i]; i++) + { + /* We may have an arbirary string constants representing e.g. an + ISA, which could contain characters not valid for symbols in + assembly output. The below isn't really safe for arbitrary + character encodings (either in source or assembly output), but + should be a good-enough approximation for our purposes. */ + if (ISALPHA (n[i]) || ISDIGIT (n[i]) || n[i] == '_') + write_char (n[i]); + else + { + char tmpbuf[4]; + sprintf (tmpbuf, "%.2x", (unsigned char) n[i]); + write_string (sep); + write_string (tmpbuf); + } + } + } + else + { + write_string (use_sep); + write_string (n); + } + /* If we're using abbreviations for these trait properties, we don't need + extra separators between them. */ + if (!omp_ts_map[ts_code].prop_abbrevs) + use_sep = sep; + } +} + +/* Key: + + C: start construct + ta: target + te: teams + pa: parallel + fo: for/do + si: simd + D: start device + i: isa + (ordered) [ SEP] list + a: arch + (ordered) [ SEP] list + k: kind + h: host + n: nohost + c: cpu + g: gpu + f: fpga + (omitted): any + T: start target_device + (as above, also) + N: (ordered) [ SEP] list + I: start implementation + v: vendor + (ordered) [ SEP] list + e: extension + (nothing here) + r: requires + (tbd) + a: atomic_default_mem_order + ar: acq_rel + ac: acquire + rx: relaxed + re: release + sc: seq_cst + U: start user + (no further encoding, just 'U' or not) + SEP: separate trait selector sets +*/ + +const char * +omp_mangle_context_selector (const char *sep, tree ctx) +{ + gcc_assert (ctx); + + omp_start_mangling (); + + for (int i = OMP_TRAIT_SET_CONSTRUCT; i != OMP_TRAIT_SET_LAST; i++) + { + omp_tss_code tss_code = static_cast(i); + tree ts_list = omp_get_context_selector_list (ctx, tss_code); + + if (!ts_list) + continue; + + switch (tss_code) + { + case OMP_TRAIT_SET_CONSTRUCT: + { + write_string (sep); + write_char ('C'); + for (tree ts = ts_list; ts; ts = TREE_CHAIN (ts)) + write_string (omp_ts_map[OMP_TS_CODE (ts)].abbrev_name); + } + break; + case OMP_TRAIT_SET_DEVICE: + { + omp_trait_prop_set trait_sets[OMP_TRAIT_LAST]; + + omp_gather_trait_sets (trait_sets, NULL, ts_list); + write_string (sep); + write_char ('D'); + + const char *use_sep = ""; + + for (unsigned j = 0; j < OMP_TRAIT_LAST; j++) + { + omp_ts_code ts_code = static_cast(j); + + if (!(omp_ts_map[ts_code].tss_mask + & (1 << OMP_TRAIT_SET_DEVICE))) + continue; + + if (trait_sets[ts_code].is_empty ()) + continue; + + write_string (use_sep); + write_string (omp_ts_map[ts_code].abbrev_name); + + omp_stringify_sorted_property_set (sep, ts_code, + trait_sets[ts_code]); + + use_sep = sep; + } + } + break; + case OMP_TRAIT_SET_TARGET_DEVICE: + { + write_string (sep); + write_char ('T'); + /* This should share the 'device' code above, but for now is + unimplemented. */ + gcc_unreachable (); + } + break; + case OMP_TRAIT_SET_IMPLEMENTATION: + { + omp_trait_prop_set trait_sets[OMP_TRAIT_LAST]; + + omp_gather_trait_sets (trait_sets, NULL, ts_list); + write_string (sep); + write_char ('I'); + const char *use_sep = ""; + + for (unsigned j = 0; j < OMP_TRAIT_LAST; j++) + { + omp_ts_code ts_code = static_cast(j); + + if (!(omp_ts_map[ts_code].tss_mask + & (1 << OMP_TRAIT_SET_IMPLEMENTATION))) + continue; + + switch (omp_ts_map[ts_code].tp_type) + { + case OMP_TRAIT_PROPERTY_NONE: + if (omp_get_context_selector (ctx, tss_code, ts_code)) + { + write_string (use_sep); + write_string (omp_ts_map[ts_code].abbrev_name); + } + break; + + case OMP_TRAIT_PROPERTY_ID: + case OMP_TRAIT_PROPERTY_NAME_LIST: + if (trait_sets[i].is_empty ()) + continue; + + write_string (use_sep); + write_string (omp_ts_map[ts_code].abbrev_name); + + omp_stringify_sorted_property_set (sep, ts_code, + trait_sets[ts_code]); + break; + + case OMP_TRAIT_PROPERTY_CLAUSE_LIST: + if (omp_get_context_selector (ctx, tss_code, ts_code)) + { + write_string (use_sep); + write_string ("?????"); + } + break; + + default: + gcc_unreachable (); + } + + use_sep = sep; + } + continue; + } + break; + case OMP_TRAIT_SET_USER: + { + write_string (sep); + write_char ('U'); + continue; + } + break; + default: + gcc_unreachable (); + } + } + + return omp_finish_mangling (); +} + /* Compare construct={simd} CLAUSES1 with CLAUSES2, return 0/-1/1/2 as in omp_context_selector_set_compare. */ diff --git a/gcc/omp-general.h b/gcc/omp-general.h index 88a0b28f483..d9f83f40116 100644 --- a/gcc/omp-general.h +++ b/gcc/omp-general.h @@ -170,6 +170,9 @@ extern tree omp_check_context_selector (location_t loc, tree ctx); extern void omp_mark_declare_variant (location_t loc, tree variant, tree construct); extern int omp_context_selector_matches (tree); +extern tree omp_merge_context_selectors (tree, tree); +extern void omp_init_mangle (void); +extern const char * omp_mangle_context_selector (const char *, tree); extern int omp_context_selector_set_compare (enum omp_tss_code, tree, tree); extern tree omp_get_context_selector (tree, enum omp_tss_code, enum omp_ts_code); diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index dd802ca37a6..9993a2e0ee1 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -2089,6 +2089,8 @@ create_omp_child_function (omp_context *ctx, bool task_copy) = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl); DECL_FUNCTION_VERSIONED (decl) = DECL_FUNCTION_VERSIONED (current_function_decl); + DECL_FUNCTION_OMP_VARIANT (decl) + = DECL_FUNCTION_OMP_VARIANT (current_function_decl); if (omp_maybe_offloaded_ctx (ctx)) { diff --git a/gcc/omp-selectors.h b/gcc/omp-selectors.h index 825a082c939..65011bb610c 100644 --- a/gcc/omp-selectors.h +++ b/gcc/omp-selectors.h @@ -80,10 +80,12 @@ extern const char *omp_tss_map []; null-terminated array of strings. */ struct omp_ts_info { const char *name; + const char *abbrev_name; unsigned int tss_mask; enum omp_tp_type tp_type; bool allow_score; const char * const *valid_properties; + const char * const *prop_abbrevs; }; extern struct omp_ts_info omp_ts_map[]; diff --git a/gcc/testsuite/g++.dg/gomp/delim-declare-variant-1.C b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-1.C new file mode 100644 index 00000000000..2af1001d6cb --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-1.C @@ -0,0 +1,82 @@ +#if 1 +#ifdef _OPENMP +namespace bar { +#pragma omp begin declare variant match(construct={target}) +#pragma omp begin declare variant match(construct={parallel}) +namespace qux { +int foo (char *buf, int len) +{ + return 5; +} +} // namespace qux +#pragma omp end declare variant +#pragma omp end declare variant +} // namespace bar +#endif + +namespace bar { +namespace qux { +int foo (char *buf, int len) +{ + return 3; +} +} // namespace qux +} // namespace bar + +#pragma omp begin declare variant match(implementation={vendor(gnu)}) +#pragma omp begin declare variant match(implementation={vendor(gnu,nvidia)}) +int quux (int c, int d) +{ + return c+d; +} +#pragma omp end declare variant +#pragma omp end declare variant + +#pragma omp begin declare variant match(device={arch("mips","riscv")}) +#pragma omp begin declare variant match(device={arch("mips"),isa("mips4")}) +int quuux (int c, int d) +{ + return c+d; +} +#pragma omp end declare variant +#pragma omp begin declare variant match(device={arch("riscv"),isa("rv3.5")}) +int quuux (int c, int d) +{ + return c+d; +} +#pragma omp end declare variant +#pragma omp end declare variant + +int quux (int c, int d) +{ + return c+d; +} +#endif + +#if 1 +#pragma omp begin declare variant match(device={kind(any)}) +#pragma omp begin declare variant match(device={kind(gpu,nohost,fpga)}) +int warble (int c, int d) +{ + return c*d; +} +#pragma omp end declare variant +#pragma omp end declare variant +#endif + +#if 1 +#pragma omp begin declare variant match(device={kind(gpu,nohost,fpga)}) +#pragma omp begin declare variant match(device={kind(any)}) +int warble2 (int c, int d) +{ + return c*d; +} +#pragma omp end declare variant +#pragma omp end declare variant +#endif + + +#pragma omp begin declare variant match(device={arch("mips",riscv)}) +#pragma omp begin declare variant match(device={arch(mips,"riscv")}) +#pragma omp end declare variant +#pragma omp end declare variant diff --git a/gcc/testsuite/g++.dg/gomp/delim-declare-variant-11.C b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-11.C new file mode 100644 index 00000000000..837eaab1d29 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-11.C @@ -0,0 +1,20 @@ +#pragma omp begin declare variant match(implementation={dynamic_allocators,unified_shared_memory}) +/* This is a "sorry" for now. */ +#pragma omp begin declare variant match(implementation={requires(reverse_offload)}) +int foo (int c) +{ + return c+5; +} +#pragma omp end declare variant +#pragma omp end declare variant + +int foo (int c) +{ + return c+3; +} + +int main(void) +{ + foo (6); + return 0; +} diff --git a/gcc/testsuite/g++.dg/gomp/delim-declare-variant-12.C b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-12.C new file mode 100644 index 00000000000..26f0213195b --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-12.C @@ -0,0 +1,32 @@ +#include + +int r_foo (int c, int d) +{ + return (c+d) * 3; +} + +#pragma omp declare variant (r_foo) match(implementation={requires(unified_shared_memory)}) +int foo (int c, int d) +{ + return c+d; +} + +int main() +{ + printf ("host: %d\n", foo (3, 4)); + +#pragma omp parallel + { +#pragma omp single + { + printf ("parallel: %d\n", foo (3, 4)); + } + } + +#pragma omp target + { + printf ("target: %d\n", foo (3, 4)); + } + + return 0; +} diff --git a/gcc/testsuite/g++.dg/gomp/delim-declare-variant-2.C b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-2.C new file mode 100644 index 00000000000..98d7dff9963 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-2.C @@ -0,0 +1,26 @@ +#pragma omp begin declare variant match(implementation={vendor(score(5):nvidia)}) +#pragma omp begin declare variant match(implementation={vendor(nvidia)}) +int foo (int c) { return c; } +#pragma omp end declare variant +#pragma omp end declare variant + +#pragma omp begin declare variant match(implementation={vendor(nvidia)}) +#pragma omp begin declare variant match(implementation={vendor(score(7):nvidia)}) +int bar (int c) { return c; } +#pragma omp end declare variant +#pragma omp end declare variant + +#pragma omp begin declare variant match(implementation={vendor(score(9):nvidia)}) +#pragma omp begin declare variant match(implementation={vendor(score(13):"nvidia")}) +int qux (int c) { return c; } +#pragma omp end declare variant +#pragma omp end declare variant + + const bool enable_p = true; + const bool enable2_p = false; + +#pragma omp begin declare variant match(implementation={vendor(score(9):nvidia,amd)},user={condition(score(17):enable_p)}) +#pragma omp begin declare variant match(implementation={vendor(score(13):"nvidia")},user={condition(enable2_p)}) +int quux (int c) { return c; } +#pragma omp end declare variant +#pragma omp end declare variant diff --git a/gcc/testsuite/g++.dg/gomp/delim-declare-variant-3.C b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-3.C new file mode 100644 index 00000000000..2f5a23f5594 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-3.C @@ -0,0 +1,6 @@ +#pragma omp begin declare variant match(implementation={vendor(amd,nvidia),atomic_default_mem_order(relaxed)}) +#pragma omp begin declare variant match(implementation={atomic_default_mem_order(acq_rel)}) +int foo (int c) { return c; } +#pragma omp end declare variant +#pragma omp end declare variant + diff --git a/gcc/testsuite/g++.dg/gomp/delim-declare-variant-4.C b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-4.C new file mode 100644 index 00000000000..5e1478132e6 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-4.C @@ -0,0 +1,10 @@ +#pragma omp begin declare variant match(implementation={vendor(amd,nvidia),atomic_default_mem_order(acq_rel)}) +#pragma omp begin declare variant match(implementation={atomic_default_mem_order(acq_rel)}) +int foo (int c) { return c; } +#pragma omp end declare variant +#pragma omp end declare variant + + +int foo (int c) { return c; } + +int main() { foo (5); return 0; } diff --git a/gcc/testsuite/g++.dg/gomp/delim-declare-variant-5.C b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-5.C new file mode 100644 index 00000000000..25b08697dc2 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/delim-declare-variant-5.C @@ -0,0 +1,4 @@ +#pragma omp begin declare variant match(device={arch("super power")}) +int foo (int c) { return c; } +#pragma omp end declare variant + diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index 1ab18fa6b0f..48492951931 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -8300,6 +8300,10 @@ dump_function_to_file (tree fndecl, FILE *file, dump_flags_t flags) print_omp_context_selector (file, TREE_VALUE (a), dump_flags); } + else if (!strcmp (IDENTIFIER_POINTER (name), + "omp declare variant overload")) + print_omp_context_selector (file, TREE_VALUE (chain), + dump_flags); else print_generic_expr (file, TREE_VALUE (chain), dump_flags); fprintf (file, ")"); diff --git a/gcc/tree-core.h b/gcc/tree-core.h index 65e51b939a2..0ffdbb20e01 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -1994,8 +1994,9 @@ struct GTY(()) tree_function_decl { unsigned has_debug_args_flag : 1; unsigned versioned_function : 1; unsigned replaceable_operator : 1; + unsigned omp_variant : 1; - /* 11 bits left for future expansion. */ + /* 10 bits left for future expansion. */ /* 32 bits on 64-bit HW. */ }; diff --git a/gcc/tree.h b/gcc/tree.h index 086b55f0375..019deecca40 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -3505,6 +3505,13 @@ extern vec **decl_debug_args_insert (tree); #define DECL_FUNCTION_VERSIONED(NODE)\ (FUNCTION_DECL_CHECK (NODE)->function_decl.versioned_function) +/* In FUNCTION_DECL, a OpenMP 'declare variant' function from a + 'begin/end declare variant' block. These have the same name as some base + function, so this flag is used to disambiguate them (similar to + DECL_FUNCTION_VERSIONED). */ +#define DECL_FUNCTION_OMP_VARIANT(NODE) \ + (FUNCTION_DECL_CHECK (NODE)->function_decl.omp_variant) + /* In FUNCTION_DECL, this is set if this function is a C++ constructor. Devirtualization machinery uses this knowledge for determing type of the object constructed. Also we assume that constructor address is not diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-10.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-10.C new file mode 100644 index 00000000000..2bc4fbc5260 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-10.C @@ -0,0 +1,19 @@ +#pragma omp begin declare variant match(implementation={dynamic_allocators,unified_shared_memory}) +#pragma omp begin declare variant match(implementation={reverse_offload}) +int foo (int c) +{ + return c+5; +} +#pragma omp end declare variant +#pragma omp end declare variant + +int foo (int c) +{ + return c+3; +} + +int main(void) +{ + foo (6); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-13.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-13.C new file mode 100644 index 00000000000..bbdf7fa6811 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-13.C @@ -0,0 +1,51 @@ +#include + +int foo(int c) +{ + return c+1; +} + +int foo(int c, int d) +{ + return c+d; +} + +int foo(int c, int d, int e, int f = 10) +{ + return c+d+e+f; +} + +#pragma omp begin declare variant match(construct={target}) +int foo(int c) +{ + return (c+1)*2; +} + +int foo(int c, int d) +{ + return (c+d)*2; +} + +int foo(int c, int d, int e, int f = 10) +{ + return (c+d+e+f)*2; +} +#pragma omp end declare variant + +int main() +{ + printf ("foo(5) = %d\n", foo(5)); + printf ("foo(5,2) = %d\n", foo(5,2)); + printf ("foo(5,2,3) = %d\n", foo(5,2,3)); + printf ("foo(5,2,3,4) = %d\n", foo(5,2,3,4)); + +#pragma omp target + { + printf ("foo(5) = %d\n", foo(5)); + printf ("foo(5,2) = %d\n", foo(5,2)); + printf ("foo(5,2,3) = %d\n", foo(5,2,3)); + printf ("foo(5,2,3,4) = %d\n", foo(5,2,3,4)); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-14.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-14.C new file mode 100644 index 00000000000..99739d8cae1 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-14.C @@ -0,0 +1,57 @@ +#include + +template +T foo(T c) +{ + return c+1; +} + +template +T foo(T c, T d) +{ + return c+d; +} + +template +T foo(T c, T d, T e, T f = U) +{ + return c+d+e+f; +} + +#pragma omp begin declare variant match(construct={target}) +template +T foo(T c) +{ + return (c+1)*2; +} + +template +T foo(T c, T d) +{ + return (c+d)*2; +} + +template +T foo(T c, T d, T e, T f = U) +{ + return (c+d+e+f)*2; +} +#pragma omp end declare variant + +int main() +{ + printf ("foo(5) = %d\n", foo(5)); + printf ("foo(5,2) = %d\n", foo(5,2)); + printf ("foo(5,2,3) = %d\n", foo(5,2,3)); + printf ("foo(5,2,3,4) = %d\n", foo(5,2,3,4)); + +#pragma omp target + { + printf ("foo(5) = %d\n", foo(5)); + printf ("foo(5,2) = %d\n", foo(5,2)); + printf ("foo(5,2,3) = %d\n", foo(5,2,3)); + printf ("foo(5,2,3,4) = %d\n", foo(5,2,3,4)); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-15.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-15.C new file mode 100644 index 00000000000..1f953e25ec0 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-15.C @@ -0,0 +1,55 @@ +#include + +template +T foo(T c) +{ + return c+1; +} + +template +T foo(T c, T d) +{ + return c+d; +} + +template +T foo(T c, S d) +{ + return c+d+2; +} + +#pragma omp begin declare variant match(construct={target}) +template +T foo(T c) +{ + return (c+1)*2; +} + +template +T foo(T c, T d) +{ + return (c+d)*2; +} + +template +T foo(T c, S d) +{ + return (c+d+2)*2; +} +#pragma omp end declare variant + +int main() +{ + printf ("foo(5) = %d\n", foo(5)); + printf ("foo(5,2) = %d\n", foo(5,2)); + printf ("foo(5,2) = %d\n", foo(5,2)); + +#pragma omp target + { + printf ("foo(5) = %d\n", foo(5)); + printf ("foo(5,2) = %d\n", foo(5,2)); + printf ("foo(5,2) = %d\n", foo(5,2)); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-16.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-16.C new file mode 100644 index 00000000000..22551d24469 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-16.C @@ -0,0 +1,30 @@ +#include + +class foo { +public: + int bar (int c, int d) { return c + d; } + static int qux (int c, int d) { return c + d + 5; } + +#if 1 +#pragma omp begin declare variant match(construct={target}) + int bar (int c, int d) { return c + d + 3; } + static int qux (int c, int d) { return c + d + 7; } +#pragma omp end declare variant +#endif +}; + +int main() +{ + foo fv; + + printf ("fv.bar (1, 2)=%d\n", fv.bar (1, 2)); + printf ("foo::qux (1, 2)=%d\n", foo::qux (1, 2)); + +#pragma omp target + { + printf ("fv.bar (1, 2)=%d\n", fv.bar (1, 2)); + printf ("foo::qux (1, 2)=%d\n", foo::qux (1, 2)); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-17.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-17.C new file mode 100644 index 00000000000..0d70cf51d50 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-17.C @@ -0,0 +1,29 @@ +#include + +class foo { +public: + int t_bar (int c, int d) { return c + d; } + static int t_qux (int c, int d) { return c + d + 5; } + +#pragma omp declare variant (t_bar) match(construct={target}) + int bar (int c, int d) { return c + d + 3; } + +#pragma omp declare variant (t_qux) match(construct={target}) + static int qux (int c, int d) { return c + d + 7; } +}; + +int main() +{ + foo fv; + + printf ("fv.bar (1, 2)=%d\n", fv.bar (1, 2)); + printf ("foo::qux (1, 2)=%d\n", foo::qux (1, 2)); + +#pragma omp target + { + printf ("fv.bar (1, 2)=%d\n", fv.bar (1, 2)); + printf ("foo::qux (1, 2)=%d\n", foo::qux (1, 2)); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-18.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-18.C new file mode 100644 index 00000000000..1db28324bbe --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-18.C @@ -0,0 +1,46 @@ +#include +#include + +#if 0 +#pragma omp begin declare variant match(device={arch(x86)}) +template +T foo (T v) +{ + return v + 3; +} +#pragma omp end declare variant +#endif + +#pragma omp begin declare variant match(device={arch(powerpc)}) +template +T foo (T v) +{ + return v + 5; +} +#pragma omp end declare variant + +#pragma omp begin declare variant match(device={arch(aarch64)}) +template +T foo (T v) +{ + return v + 7; +} +#pragma omp end declare variant + +template +T foo (T v) +{ + return v + 1; +} + +int main () +{ + int res; +#pragma omp dispatch novariants(1) + { + res = foo (3); + } + printf ("dispatch foo(3)=%d\n", res); + printf ("regular foo(3)=%d\n", foo (3)); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-19.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-19.C new file mode 100644 index 00000000000..077672299ff --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-19.C @@ -0,0 +1,41 @@ +#include +#include + +template +T foo_x86 (T v) +{ + return v + 3; +} + +template +T foo_powerpc (T v) +{ + return v + 5; +} + +template +T foo_aarch64 (T v) +{ + return v + 7; +} + +#pragma omp declare variant (foo_x86) match(device={arch(x86)}) +#pragma omp declare variant (foo_powerpc) match(device={arch(powerpc)}) +#pragma omp declare variant (foo_aarch64) match(device={arch(arm)}) +template +T foo (T v) +{ + return v + 1; +} + +int main () +{ + int res; +#pragma omp dispatch novariants(1) + { + res = foo (3); + } + printf ("dispatch foo(3)=%d\n", res); + printf ("regular foo(3)=%d\n", foo (3)); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-20.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-20.C new file mode 100644 index 00000000000..b1d4f500d1c --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-20.C @@ -0,0 +1,39 @@ +#include +#include + +typedef int T; + +T foo_x86 (T v) +{ + return v + 3; +} + +T foo_powerpc (T v) +{ + return v + 5; +} + +T foo_aarch64 (T v) +{ + return v + 7; +} + +#pragma omp declare variant (foo_x86) match(device={arch(x86)}) +#pragma omp declare variant (foo_powerpc) match(device={arch(powerpc)}) +#pragma omp declare variant (foo_aarch64) match(device={arch(aarch64)}) +T foo (T v) +{ + return v + 1; +} + +int main () +{ + int res; +#pragma omp dispatch novariants(1) + { + res = foo (3); + } + printf ("dispatch foo(3)=%d\n", res); + printf ("regular foo(3)=%d\n", foo (3)); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-21.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-21.C new file mode 100644 index 00000000000..7a26c9fdae9 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-21.C @@ -0,0 +1,56 @@ +#include + +template +class foo +{ + T var; +public: + void set_var(T c) { var = c; } + T get_var () { return var + 1; } + +#pragma omp begin declare variant match(construct={target}) + T get_var () { return var + 3; } +#pragma omp end declare variant +}; + +void inty () +{ + foo fv; + fv.set_var (6); + + printf ("fv.get_var () = %d\n", fv.get_var ()); + + int res; + +#pragma omp target map(from: res) + { + res = fv.get_var (); + } + + printf ("fv.get_var () [target] = %d\n", res); +} + +void longy () +{ + foo fvl; + fvl.set_var (7); + + printf ("fvl.get_var () = %ld\n", fvl.get_var ()); + + long res; + +#pragma omp target map(from: res) + { + res = fvl.get_var (); + } + + printf ("fvl.get_var () [target] = %ld\n", res); +} + +int main () +{ + inty (); + longy (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-6.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-6.C new file mode 100644 index 00000000000..ab99f874f86 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-6.C @@ -0,0 +1,37 @@ +#include + +#pragma omp begin declare variant match(construct={target}) +int foo (void) +{ + return 7; +} +#pragma omp begin declare variant match(construct={teams}) +int foo (void) +{ + return 9; +} +#pragma omp end declare variant +#pragma omp end declare variant + +int foo (void) +{ + return 5; +} + +int main (void) +{ + int c = foo (); + printf ("host c=%d\n", c); + +#pragma omp target map(from:c) + { + c = foo (); + } + printf ("target c=%d\n", c); +#pragma omp target teams map(from:c) + { + c = foo (); + } + printf ("target teams c=%d\n", c); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-7.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-7.C new file mode 100644 index 00000000000..9dd697f9622 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-7.C @@ -0,0 +1,38 @@ +#include + +int p_foo (int c, int d) +{ + return (c+d) * 2; +} + +int t_foo (int c, int d) +{ + return (c+d) * 3; +} + +#pragma omp declare variant (p_foo) match(construct={parallel}) +#pragma omp declare variant (t_foo) match(construct={target}) +int foo (int c, int d) +{ + return c+d; +} + +int main() +{ + printf ("host: %d\n", foo (3, 4)); + +#pragma omp parallel + { +#pragma omp single + { + printf ("parallel: %d\n", foo (3, 4)); + } + } + +#pragma omp target + { + printf ("target: %d\n", foo (3, 4)); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-8.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-8.C new file mode 100644 index 00000000000..b674474ddcc --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-8.C @@ -0,0 +1,61 @@ +#include + +#define VARIANTS + +#ifdef VARIANTS +#pragma omp begin declare variant match(construct={target}) +template +T foo (void) +{ + return 7; +} +#pragma omp begin declare variant match(construct={teams}) +template +T foo (void) +{ + return 9; +} +#pragma omp end declare variant +#pragma omp end declare variant +#endif + +template +T foo (void) +{ + return 5; +} + +int main (void) +{ + int c = foo (); + printf ("host c=%d\n", c); + +#pragma omp target map(from:c) + { + c = foo (); + } + printf ("target c=%d\n", c); +#pragma omp target teams map(from:c) + { + c = foo (); + } + printf ("target teams c=%d\n", c); + + long cl = foo (); + printf ("host cl=%ld\n", cl); + +#pragma omp target map(from:cl) + { + cl = foo (); + } + printf ("target cl=%ld\n", cl); +#pragma omp target teams map(from:cl) + { + cl = foo (); + } + printf ("target teams cl=%ld\n", cl); + + // int q = foo (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/delim-declare-variant-9.C b/libgomp/testsuite/libgomp.c++/delim-declare-variant-9.C new file mode 100644 index 00000000000..135660f559b --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/delim-declare-variant-9.C @@ -0,0 +1,54 @@ +#include + +template +T foo_targ (void) +{ + return 7; +} + +template +T foo_targteams (void) +{ + return 9; +} + +#pragma omp declare variant(foo_targ) match(construct={target}) +#pragma omp declare variant(foo_targteams) match(construct={target,teams}) +template +T foo (void) +{ + return 5; +} + +int main (void) +{ + int c = foo (); + printf ("host c=%d\n", c); + +#pragma omp target map(from:c) + { + c = foo (); + } + printf ("target c=%d\n", c); +#pragma omp target teams map(from:c) + { + c = foo (); + } + printf ("target teams c=%d\n", c); + + long cl = foo (); + printf ("host cl=%ld\n", cl); + +#pragma omp target map(from:cl) + { + cl = foo (); + } + printf ("target cl=%ld\n", cl); +#pragma omp target teams map(from:cl) + { + cl = foo (); + } + printf ("target teams cl=%ld\n", cl); + + return 0; +}