From patchwork Fri Mar 24 15:30:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Frederik Harwath X-Patchwork-Id: 74599 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp719021vqo; Fri, 24 Mar 2023 08:39:20 -0700 (PDT) X-Google-Smtp-Source: AKy350ZQDhyWOmNxIC2jAZoFsXMf/MQyMeqYafg4SAW9gW5ccNwLmSE4gwdb4/QntigrwRa6btnq X-Received: by 2002:a17:906:dff5:b0:88c:a43d:81bc with SMTP id lc21-20020a170906dff500b0088ca43d81bcmr2786460ejc.58.1679672360598; Fri, 24 Mar 2023 08:39:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679672360; cv=none; d=google.com; s=arc-20160816; b=HMoSxKaolaNixT63QPNfBemw39jaPZjY/dRWbsStoV+67DwglLmbyY53qpxPtafgdo ZOcnOGoyW6r83CZuHxFLsHSpbOnrcMDJJS5AXZ4EozmK+B3evGeX6Lq+yNiwkCuqS+/0 sTv3Qyd4pJUxe/JOpmlKMPUYTMhf4U28NFoWZ9vXtquzTmvEHh24lLD0rFdV8hdTsgLC LdTzCJdoIQFYc6VZZ1FGyByAMXaLd2VSGKWNnAgbdcRTQxJK6xgmNf7IhHzX22mfU6vJ 3BDkg1PqHwGUQluaEyr3ovb8D0BIg6DUGdY0Xuf8fmgRo24BvgcBORBAF97JfwbWEg4T Z2jg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :ironport-sdr:dmarc-filter:delivered-to; bh=6mVgaqvsU7nmJyK9pf7E7CZEZb6akto3c/Dc4ygBQ4w=; b=ovesa/pUwNtOaqDxJsg6DsIrbtevWTz1K/yObtM8Zc3sabwg1dTEL3p6FdmIdC2gR0 XEs7g5W99fvao/Mh2CebIP4Jt0vgl1y+ShswqxeNPzSKxcRporuU28HSTcDmWFI9QR9I HVvj3wMQ9ynK/CUjwiv0sofoFU0fjKpmk+0MI0WQJepE/ZoOZ/6l2wsL6GDi1NNx1u5Y lwqa/IZxA+a94iqzT6P95GA1qdgdw4TlQjL2i/q8YjYyLft/vjzr4LZ2bAtqWGhAfpAI OnKG8Ib7vf96kVNBqmH70En8vp4pgiqWXpSrWJBerEiz3OSujMuL0gwilBc0wXrrpKCH P3hQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id l19-20020a170906795300b0093defbd6284si2995279ejo.1035.2023.03.24.08.39.19 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Mar 2023 08:39:20 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2E900384842D for ; Fri, 24 Mar 2023 15:34:38 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153]) by sourceware.org (Postfix) with ESMTPS id 039B03858428; Fri, 24 Mar 2023 15:33:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 039B03858428 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.98,288,1673942400"; d="scan'208";a="324682" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa1.mentor.iphmx.com with ESMTP; 24 Mar 2023 07:31:07 -0800 IronPort-SDR: WLs1C8plr1BmZ6jYl7BNqaqdghwQVg7Tc82WX4XTyA8iljFBwvxyO891vasqhDzFrHGjJgh3qB MmLO4ZlZx7DJeyHYbVkZNhXl1edSXmUzum91q1x5SAEnboPmEbSMEPfewZI0w6b8OuP3VLM1Nd 65qE03A0iqUd/l3s9Vf4GKj6z/x4VjvzLM+2OwLwodpbQVakGSGWIq3BvlXW+HPtS9HIxlb9sh KRL1UNivQ0KIfl0mrzA1CMf3M1i1HqjjvVVeF3vJiiZA6MKGs4htjhddLZYqAN+nX65fXyZJAG Q/c= From: Frederik Harwath To: , , , Subject: [PATCH 1/7] openmp: Add Fortran support for "omp unroll" directive Date: Fri, 24 Mar 2023 16:30:39 +0100 Message-ID: <20230324153046.3996092-2-frederik@codesourcery.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230324153046.3996092-1-frederik@codesourcery.com> References: <20230324153046.3996092-1-frederik@codesourcery.com> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_PASS, SPF_PASS, TXREP, T_FILL_THIS_FORM_SHORT autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761264125047647609?= X-GMAIL-MSGID: =?utf-8?q?1761264125047647609?= This commit implements the OpenMP 5.1 "omp unroll" directive for Fortran. The Fortran front end changes encompass the parsing and the verification of nesting restrictions etc. The actual loop transformation is implemented in a new language-independent "omp_transform_loops" pass which runs before omp lowering. No attempt is made to re-use existing unrolling optimizations because a separate implementation allows for better control of the unrolling. The new pass will also serve as a foundation for the implementation of further OpenMP loop transformations. This commit only implements the support for "omp unroll" on the outermost loop of a loop nest. The support for inner loops will be added later. gcc/ChangeLog: * Makefile.in: Add omp_transform_loops.o. * gimple-pretty-print.cc (dump_gimple_omp_for): Handle "full" and "partial" clauses. * gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_TRANSFORM_LOOP. * gimplify.cc (is_gimple_stmt): Handle OMP_UNROLL. (gimplify_scan_omp_clauses): Handle OMP_UNROLL_FULL, OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL. (gimplify_adjust_omp_clauses): Handle OMP_UNROLL_FULL, OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL. (gimplify_omp_for): Handle OMP_UNROLL. (gimplify_expr): Likewise. * params.opt: Add omp-unroll-full-max-iteration and omp-unroll-default-factor. * passes.def: Add pass_omp_transform_loop before pass_lower_omp. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_PARTIAL. * tree-pass.h (make_pass_omp_transform_loops): Declare pmake_pass_omp_transform_loops. * tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_PARTIAL. (dump_generic_node): Handle OMP_UNROLL. * tree.cc (omp_clause_num_ops): Add number of operators for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and OMP_CLAUSE_UNROLL_PARTIAl. (omp_clause_code_names): Add name strings for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and OMP_CLAUSE_UNROLL_PARTIAL. * tree.def (OMP_UNROLL): Define. * tree.h (OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Define. * omp-transform-loops.cc: New file. * omp-general.cc (omp_loop_transform_clause_p): New function. * omp-general.h (omp_loop_transform_clause_p): New declaration. gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_clauses): Handle "unroll full" and "unroll partial". (show_omp_node): Handle OMP_UNROLL. (show_code_node): Handle EXEC_OMP_UNROLL. * gfortran.h (enum gfc_statement): Add ST_OMP_UNROLL, ST_OMP_END_UNROLL. (enum gfc_exec_op): Add EXEC_OMP_UNROLL. * match.h (gfc_match_omp_unroll): Declare. * openmp.cc (enum omp_mask2): Add OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_PARTIAL. (gfc_match_omp_clauses): Handle "omp unroll partial". (OMP_UNROLL_CLAUSES): New macro definition. (gfc_match_omp_unroll): Match "full" clause. (omp_unroll_removes_loop_nest): New function. (resolve_omp_unroll): New function. (resolve_omp_do): Accept and verify "omp unroll" directives between directive and loop. (omp_code_to_statement): Handle EXEC_OMP_UNROLL. (gfc_resolve_omp_directive): Likewise. * parse.cc (decode_omp_directive): Handle "undroll" and "end unroll". (next_statement): Handle ST_OMP_UNROLL. (gfc_ascii_statement): Handle ST_OMP_UNROLL and ST_OMP_END_UNROLL. (parse_omp_do): Accept ST_OMP_UNROLL and ST_OMP_END_UNROLL before/after loop. (parse_executable): Handle ST_OMP_UNROLL. * resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_UNROLL. (gfc_resolve_code): Likewise. * st.cc (gfc_free_statement): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Handle unroll clauses. (gfc_trans_omp_do): Handle OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_NONE creation. (gfc_trans_omp_directive): Handle EXEC_OMP_UNROLL. * trans.cc (trans_code): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-2.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-3.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-4.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-5.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-6.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-8.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/loop-transforms/unroll-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-2.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-3.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-4.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-5.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-6.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-7.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-9.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90: New test. --- gcc/Makefile.in | 1 + gcc/fortran/dump-parse-tree.cc | 15 + gcc/fortran/gfortran.h | 9 +- gcc/fortran/match.h | 1 + gcc/fortran/openmp.cc | 174 +- gcc/fortran/parse.cc | 37 +- gcc/fortran/resolve.cc | 3 + gcc/fortran/st.cc | 1 + gcc/fortran/trans-openmp.cc | 71 +- gcc/fortran/trans.cc | 1 + gcc/gimple-pretty-print.cc | 6 + gcc/gimple.h | 1 + gcc/gimplify.cc | 40 +- gcc/omp-general.cc | 14 + gcc/omp-general.h | 1 + gcc/omp-transform-loops.cc | 1401 +++++++++++++++++ gcc/params.opt | 9 + gcc/passes.def | 1 + .../gomp/loop-transforms/unroll-1.f90 | 277 ++++ .../gomp/loop-transforms/unroll-10.f90 | 7 + .../gomp/loop-transforms/unroll-11.f90 | 75 + .../gomp/loop-transforms/unroll-12.f90 | 29 + .../gomp/loop-transforms/unroll-2.f90 | 22 + .../gomp/loop-transforms/unroll-3.f90 | 17 + .../gomp/loop-transforms/unroll-4.f90 | 18 + .../gomp/loop-transforms/unroll-5.f90 | 18 + .../gomp/loop-transforms/unroll-6.f90 | 19 + .../gomp/loop-transforms/unroll-7.f90 | 62 + .../gomp/loop-transforms/unroll-8.f90 | 22 + .../gomp/loop-transforms/unroll-9.f90 | 18 + .../loop-transforms/unroll-no-clause-1.f90 | 20 + .../loop-transforms/unroll-no-clause-2.f90 | 21 + .../loop-transforms/unroll-no-clause-3.f90 | 23 + .../gomp/loop-transforms/unroll-simd-1.f90 | 244 +++ .../gomp/loop-transforms/unroll-simd-2.f90 | 57 + gcc/tree-core.h | 9 + gcc/tree-pass.h | 1 + gcc/tree-pretty-print.cc | 20 + gcc/tree.cc | 6 + gcc/tree.def | 6 + gcc/tree.h | 3 + .../loop-transforms/unroll-1.f90 | 52 + .../loop-transforms/unroll-2.f90 | 88 ++ .../loop-transforms/unroll-3.f90 | 59 + .../loop-transforms/unroll-4.f90 | 72 + .../loop-transforms/unroll-5.f90 | 55 + .../loop-transforms/unroll-6.f90 | 105 ++ .../loop-transforms/unroll-7.f90 | 198 +++ .../loop-transforms/unroll-7a.f90 | 7 + .../loop-transforms/unroll-7b.f90 | 7 + .../loop-transforms/unroll-7c.f90 | 7 + .../loop-transforms/unroll-8.f90 | 38 + .../loop-transforms/unroll-simd-1.f90 | 33 + 53 files changed, 3484 insertions(+), 17 deletions(-) create mode 100644 gcc/omp-transform-loops.cc create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 -- 2.36.1 ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 diff --git a/gcc/Makefile.in b/gcc/Makefile.in index d8b76d83d68..8e203f68bd7 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1540,6 +1540,7 @@ OBJS = \ omp-expand.o \ omp-general.o \ omp-low.o \ + omp-transform-loops.o \ omp-oacc-kernels-decompose.o \ omp-oacc-neuter-broadcast.o \ omp-simd-clone.o \ diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc index 3b24bdc1a6c..e069aca1f1d 100644 --- a/gcc/fortran/dump-parse-tree.cc +++ b/gcc/fortran/dump-parse-tree.cc @@ -2052,6 +2052,16 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses) } if (omp_clauses->assume) show_omp_assumes (omp_clauses->assume); + if (omp_clauses->unroll_full) + { + fputs (" FULL", dumpfile); + } + if (omp_clauses->unroll_partial) + { + fputs (" PARTIAL", dumpfile); + if (omp_clauses->unroll_partial_factor > 0) + fprintf (dumpfile, "(%u)", omp_clauses->unroll_partial_factor); + } } /* Show a single OpenMP or OpenACC directive node and everything underneath it @@ -2162,6 +2172,7 @@ show_omp_node (int level, gfc_code *c) name = "TEAMS DISTRIBUTE PARALLEL DO SIMD"; break; case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: name = "TEAMS DISTRIBUTE SIMD"; break; case EXEC_OMP_TEAMS_LOOP: name = "TEAMS LOOP"; break; + case EXEC_OMP_UNROLL: name = "UNROLL"; break; case EXEC_OMP_WORKSHARE: name = "WORKSHARE"; break; default: gcc_unreachable (); @@ -2238,6 +2249,7 @@ show_omp_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: omp_clauses = c->ext.omp_clauses; break; @@ -2299,6 +2311,8 @@ show_omp_node (int level, gfc_code *c) d = d->block; } } + else if (c->op == EXEC_OMP_UNROLL) + show_code (level + 1, c->block != NULL ? c->block->next : c->next); else show_code (level + 1, c->block->next); if (c->op == EXEC_OMP_ATOMIC) @@ -3477,6 +3491,7 @@ show_code_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: show_omp_node (level, c); break; diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 9bab2c40ead..5ef4a8907b0 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -319,7 +319,8 @@ enum gfc_statement ST_OMP_END_MASKED_TASKLOOP_SIMD, ST_OMP_SCOPE, ST_OMP_END_SCOPE, ST_OMP_ERROR, ST_OMP_ASSUME, ST_OMP_END_ASSUME, ST_OMP_ASSUMES, /* Note: gfc_match_omp_nothing returns ST_NONE. */ - ST_OMP_NOTHING, ST_NONE + ST_OMP_NOTHING, ST_NONE, + ST_OMP_UNROLL, ST_OMP_END_UNROLL }; /* Types of interfaces that we can have. Assignment interfaces are @@ -1561,6 +1562,8 @@ typedef struct gfc_omp_clauses unsigned order_unconstrained:1, order_reproducible:1, capture:1; unsigned grainsize_strict:1, num_tasks_strict:1, compare:1, weak:1; unsigned non_rectangular:1, order_concurrent:1; + unsigned unroll_full:1, unroll_none:1, unroll_partial:1; + unsigned unroll_partial_factor; ENUM_BITFIELD (gfc_omp_sched_kind) sched_kind:3; ENUM_BITFIELD (gfc_omp_device_type) device_type:2; ENUM_BITFIELD (gfc_omp_memorder) memorder:3; @@ -2974,6 +2977,7 @@ enum gfc_exec_op EXEC_OMP_TARGET_TEAMS_LOOP, EXEC_OMP_MASKED, EXEC_OMP_PARALLEL_MASKED, EXEC_OMP_PARALLEL_MASKED_TASKLOOP, EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD, EXEC_OMP_MASKED_TASKLOOP, EXEC_OMP_MASKED_TASKLOOP_SIMD, EXEC_OMP_SCOPE, + EXEC_OMP_UNROLL, EXEC_OMP_ERROR }; @@ -3868,6 +3872,9 @@ void gfc_generate_module_code (gfc_namespace *); /* trans-intrinsic.cc */ bool gfc_inline_intrinsic_function_p (gfc_expr *); +/* trans-openmp.cc */ +bool loop_transform_p (gfc_exec_op op); + /* bbt.cc */ typedef int (*compare_fn) (void *, void *); void gfc_insert_bbt (void *, void *, compare_fn); diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h index 4430aff001c..5640c725f09 100644 --- a/gcc/fortran/match.h +++ b/gcc/fortran/match.h @@ -226,6 +226,7 @@ match gfc_match_omp_teams_distribute_parallel_do_simd (void); match gfc_match_omp_teams_distribute_simd (void); match gfc_match_omp_teams_loop (void); match gfc_match_omp_threadprivate (void); +match gfc_match_omp_unroll (void); match gfc_match_omp_workshare (void); match gfc_match_omp_end_critical (void); match gfc_match_omp_end_nowait (void); diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index abca146d78e..e54f016b170 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -1051,6 +1051,9 @@ enum omp_mask1 /* More OpenMP clauses and OpenACC 2.0+ specific clauses. */ enum omp_mask2 { + OMP_CLAUSE_UNROLL_FULL, /* OpenMP 5.1. */ + OMP_CLAUSE_UNROLL_NONE, /* OpenMP 5.1. */ + OMP_CLAUSE_UNROLL_PARTIAL, /* OpenMP 5.1. */ OMP_CLAUSE_ASYNC, OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS, @@ -2523,6 +2526,15 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, NULL, &head, true, true) == MATCH_YES)) continue; + if ((mask & OMP_CLAUSE_UNROLL_FULL) + && (m = gfc_match_dupl_check (!c->unroll_full, "full")) + != MATCH_NO) + { + if (m == MATCH_ERROR) + goto error; + c->unroll_full = needs_space = true; + continue; + } break; case 'g': if ((mask & OMP_CLAUSE_GANG) @@ -3156,10 +3168,36 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, } break; case 'p': - if ((mask & OMP_CLAUSE_COPY) - && gfc_match ("pcopy ( ") == MATCH_YES + if (mask & OMP_CLAUSE_UNROLL_PARTIAL) + { + if ((m = gfc_match_dupl_check (!c->unroll_partial, "partial")) + != MATCH_NO) + { + int unroll_factor; + if (m == MATCH_ERROR) + goto error; + + c->unroll_partial = true; + + gfc_expr *cexpr = NULL; + m = gfc_match (" ( %e )", &cexpr); + if (m == MATCH_NO) + ; + else if (m == MATCH_YES + && !gfc_extract_int (cexpr, &unroll_factor, -1) + && unroll_factor > 0) + c->unroll_partial_factor = unroll_factor; + else + gfc_error_now ("PARTIAL clause argument not constant " + "positive integer at %C"); + gfc_free_expr (cexpr); + continue; + } + } + if ((mask & OMP_CLAUSE_COPY) && gfc_match ("pcopy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TOFROM, true, allow_derived)) + OMP_MAP_TOFROM, true, + allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYIN) && gfc_match ("pcopyin ( ") == MATCH_YES @@ -4270,6 +4308,8 @@ cleanup: (omp_mask (OMP_CLAUSE_AT) | OMP_CLAUSE_MESSAGE | OMP_CLAUSE_SEVERITY) #define OMP_WORKSHARE_CLAUSES \ omp_mask (OMP_CLAUSE_NOWAIT) +#define OMP_UNROLL_CLAUSES \ + (omp_mask (OMP_CLAUSE_UNROLL_FULL) | OMP_CLAUSE_UNROLL_PARTIAL) static match @@ -6369,6 +6409,20 @@ gfc_match_omp_teams_distribute_simd (void) | OMP_SIMD_CLAUSES); } +match +gfc_match_omp_unroll (void) +{ + match m = match_omp (EXEC_OMP_UNROLL, OMP_UNROLL_CLAUSES); + + /* Add an internal clause as a marker to indicate that this "unroll" + directive had no clause. */ + if (new_st.ext.omp_clauses + && !new_st.ext.omp_clauses->unroll_full + && !new_st.ext.omp_clauses->unroll_partial) + new_st.ext.omp_clauses->unroll_none = true; + + return m; +} match gfc_match_omp_workshare (void) @@ -9235,6 +9289,75 @@ gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym, bool add_clause) } } + +static bool +omp_unroll_removes_loop_nest (gfc_code *code) +{ + gcc_assert (code->op == EXEC_OMP_UNROLL); + if (!code->ext.omp_clauses) + return true; + + if (code->ext.omp_clauses->unroll_none) + { + gfc_warning (0, "!$OMP UNROLL without PARTIAL clause at %L turns loop " + "into a non-loop", + &code->loc); + return true; + } + if (code->ext.omp_clauses->unroll_full) + { + gfc_warning (0, "!$OMP UNROLL with FULL clause at %L turns loop into a " + "non-loop", + &code->loc); + return true; + } + return false; +} + +static void +resolve_loop_transform_generic (gfc_code *code, const char *descr) +{ + gcc_assert (code->block); + + if (code->block->op == EXEC_OMP_UNROLL + && !omp_unroll_removes_loop_nest (code->block)) + return; + + if (code->block->next->op == EXEC_OMP_UNROLL + && !omp_unroll_removes_loop_nest (code->block->next)) + return; + + if (code->block->next->op == EXEC_DO_WHILE) + { + gfc_error ("%s invalid around DO WHILE or DO without loop " + "control at %L", descr, &code->loc); + return; + } + if (code->block->next->op == EXEC_DO_CONCURRENT) + { + gfc_error ("%s invalid around DO CONCURRENT loop at %L", + descr, &code->loc); + return; + } + + gfc_error ("missing canonical loop nest after %s at %L", + descr, &code->loc); + +} + +static void +resolve_omp_unroll (gfc_code *code) +{ + if (!code->block || code->block->op == EXEC_DO) + return; + + if (code->block->next->op == EXEC_DO) + return; + + resolve_loop_transform_generic (code, "!$OMP UNROLL"); +} + + static void handle_local_var (gfc_symbol *sym) { @@ -9259,6 +9382,13 @@ is_outer_iteration_variable (gfc_code *code, int depth, gfc_symbol *var) { int i; gfc_code *do_code = code->block->next; + while (loop_transform_p (do_code->op)) { + if (do_code->block) + do_code = do_code->block->next; + else + do_code = do_code->next; + } + gcc_assert (!loop_transform_p (do_code->op)); for (i = 1; i < depth; i++) { @@ -9277,6 +9407,13 @@ expr_is_invariant (gfc_code *code, int depth, gfc_expr *expr) { int i; gfc_code *do_code = code->block->next; + while (loop_transform_p (do_code->op)) { + if (do_code->block) + do_code = do_code->block->next; + else + do_code = do_code->next; + } + gcc_assert (!loop_transform_p (do_code->op)); for (i = 1; i < depth; i++) { @@ -9454,6 +9591,7 @@ resolve_omp_do (gfc_code *code) is_simd = true; break; case EXEC_OMP_TEAMS_LOOP: name = "!$OMP TEAMS LOOP"; break; + case EXEC_OMP_UNROLL: name = "!$OMP UNROLL"; break; default: gcc_unreachable (); } @@ -9461,6 +9599,23 @@ resolve_omp_do (gfc_code *code) resolve_omp_clauses (code, code->ext.omp_clauses, NULL); do_code = code->block->next; + /* Move forward over any loop transformation directives to find the loop. */ + bool error = false; + while (do_code->op == EXEC_OMP_UNROLL) + { + if (!error && omp_unroll_removes_loop_nest (do_code)) + { + gfc_error ("missing canonical loop nest after %s at %L", name, + &code->loc); + error = true; + } + if (do_code->block) + do_code = do_code->block->next; + else + do_code = do_code->next; + } + gcc_assert (do_code->op != EXEC_OMP_UNROLL); + if (code->ext.omp_clauses->orderedc) collapse = code->ext.omp_clauses->orderedc; else @@ -9490,6 +9645,14 @@ resolve_omp_do (gfc_code *code) &do_code->loc); break; } + if (do_code->op != EXEC_DO) + { + gfc_error ("%s must be DO loop at %L", name, + &do_code->loc); + break; + } + + gcc_assert (do_code->op != EXEC_OMP_UNROLL); gcc_assert (do_code->op == EXEC_DO); if (do_code->ext.iterator->var->ts.type != BT_INTEGER) gfc_error ("%s iteration variable must be of type integer at %L", @@ -9726,6 +9889,8 @@ omp_code_to_statement (gfc_code *code) return ST_OMP_PARALLEL_LOOP; case EXEC_OMP_DEPOBJ: return ST_OMP_DEPOBJ; + case EXEC_OMP_UNROLL: + return ST_OMP_UNROLL; default: gcc_unreachable (); } @@ -10155,6 +10320,9 @@ gfc_resolve_omp_directive (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_TEAMS_LOOP: resolve_omp_do (code); break; + case EXEC_OMP_UNROLL: + resolve_omp_unroll (code); + break; case EXEC_OMP_ASSUME: case EXEC_OMP_CANCEL: case EXEC_OMP_ERROR: diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc index f1e55316e5b..094678436b4 100644 --- a/gcc/fortran/parse.cc +++ b/gcc/fortran/parse.cc @@ -1008,6 +1008,7 @@ decode_omp_directive (void) ST_OMP_END_TEAMS_DISTRIBUTE); matcho ("end teams loop", gfc_match_omp_eos_error, ST_OMP_END_TEAMS_LOOP); matcho ("end teams", gfc_match_omp_eos_error, ST_OMP_END_TEAMS); + matchs ("end unroll", gfc_match_omp_eos_error, ST_OMP_END_UNROLL); matcho ("end workshare", gfc_match_omp_end_nowait, ST_OMP_END_WORKSHARE); break; @@ -1137,6 +1138,9 @@ decode_omp_directive (void) matchdo ("threadprivate", gfc_match_omp_threadprivate, ST_OMP_THREADPRIVATE); break; + case 'u': + matchs ("unroll", gfc_match_omp_unroll, ST_OMP_UNROLL); + break; case 'w': matcho ("workshare", gfc_match_omp_workshare, ST_OMP_WORKSHARE); break; @@ -1724,6 +1728,7 @@ next_statement (void) case ST_OMP_LOOP: case ST_OMP_PARALLEL_LOOP: case ST_OMP_TEAMS_LOOP: \ case ST_OMP_TARGET_PARALLEL_LOOP: case ST_OMP_TARGET_TEAMS_LOOP: \ case ST_OMP_ASSUME: \ + case ST_OMP_UNROLL: \ case ST_CRITICAL: \ case ST_OACC_PARALLEL_LOOP: case ST_OACC_PARALLEL: case ST_OACC_KERNELS: \ case ST_OACC_DATA: case ST_OACC_HOST_DATA: case ST_OACC_LOOP: \ @@ -2096,6 +2101,9 @@ gfc_ascii_statement (gfc_statement st, bool strip_sentinel) case ST_END_UNION: p = "END UNION"; break; + case ST_OMP_END_UNROLL: + p = "!$OMP END UNROLL"; + break; case ST_END_MAP: p = "END MAP"; break; @@ -2766,6 +2774,9 @@ gfc_ascii_statement (gfc_statement st, bool strip_sentinel) case ST_OMP_THREADPRIVATE: p = "!$OMP THREADPRIVATE"; break; + case ST_OMP_UNROLL: + p = "!$OMP UNROLL"; + break; case ST_OMP_WORKSHARE: p = "!$OMP WORKSHARE"; break; @@ -5180,6 +5191,7 @@ parse_omp_do (gfc_statement omp_st) gfc_statement st; gfc_code *cp, *np; gfc_state_data s; + int num_unroll = 0; accept_statement (omp_st); @@ -5196,6 +5208,12 @@ parse_omp_do (gfc_statement omp_st) unexpected_eof (); else if (st == ST_DO) break; + else if (st == ST_OMP_UNROLL) + { + accept_statement (st); + num_unroll++; + continue; + } else unexpected_statement (st); } @@ -5221,6 +5239,17 @@ parse_omp_do (gfc_statement omp_st) pop_state (); st = next_statement (); + for (; num_unroll > 0; num_unroll--) + { + if (st == ST_OMP_END_UNROLL) + { + gfc_clear_new_st (); + gfc_commit_symbols (); + gfc_warning_check (); + st = next_statement (); + } + } + gfc_statement omp_end_st = ST_OMP_END_DO; switch (omp_st) { @@ -5234,7 +5263,9 @@ parse_omp_do (gfc_statement omp_st) case ST_OMP_DISTRIBUTE_SIMD: omp_end_st = ST_OMP_END_DISTRIBUTE_SIMD; break; - case ST_OMP_DO: omp_end_st = ST_OMP_END_DO; break; + case ST_OMP_DO: + omp_end_st = ST_OMP_END_DO; + break; case ST_OMP_DO_SIMD: omp_end_st = ST_OMP_END_DO_SIMD; break; case ST_OMP_LOOP: omp_end_st = ST_OMP_END_LOOP; break; case ST_OMP_PARALLEL_DO: omp_end_st = ST_OMP_END_PARALLEL_DO; break; @@ -5307,6 +5338,9 @@ parse_omp_do (gfc_statement omp_st) case ST_OMP_TEAMS_LOOP: omp_end_st = ST_OMP_END_TEAMS_LOOP; break; + case ST_OMP_UNROLL: + omp_end_st = ST_OMP_END_UNROLL; + break; default: gcc_unreachable (); } if (st == omp_end_st) @@ -5991,6 +6025,7 @@ parse_executable (gfc_statement st) case ST_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case ST_OMP_TEAMS_DISTRIBUTE_SIMD: case ST_OMP_TEAMS_LOOP: + case ST_OMP_UNROLL: st = parse_omp_do (st); if (st == ST_IMPLIED_ENDDO) return st; diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index f6ec76acb0b..46988ff281d 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -11041,6 +11041,7 @@ gfc_resolve_blocks (gfc_code *b, gfc_namespace *ns) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_LOOP: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: break; @@ -12197,6 +12198,7 @@ gfc_resolve_code (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TARGET_SIMD: + case EXEC_OMP_UNROLL: gfc_resolve_omp_do_blocks (code, ns); break; case EXEC_SELECT_TYPE: @@ -12693,6 +12695,7 @@ start: case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_resolve_omp_directive (code, ns); break; diff --git a/gcc/fortran/st.cc b/gcc/fortran/st.cc index 657bc9deebf..6112831e621 100644 --- a/gcc/fortran/st.cc +++ b/gcc/fortran/st.cc @@ -277,6 +277,7 @@ gfc_free_statement (gfc_code *p) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_free_omp_clauses (p->ext.omp_clauses); break; diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index 84c0184f48e..c4a23f6e247 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -3890,6 +3890,29 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, omp_clauses = gfc_trans_add_clause (c, omp_clauses); } + if (clauses->unroll_full) + { + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_FULL); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->unroll_none) + { + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_NONE); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->unroll_partial) + { + c = build_omp_clause (gfc_get_location (&where), + OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) + = clauses->unroll_partial_factor ? build_int_cst ( + integer_type_node, clauses->unroll_partial_factor) + : NULL_TREE; + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + if (clauses->ordered) { c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_ORDERED); @@ -5080,6 +5103,12 @@ gfc_trans_omp_cancel (gfc_code *code) return gfc_finish_block (&block); } +bool +loop_transform_p (gfc_exec_op op) +{ + return op == EXEC_OMP_UNROLL; +} + static tree gfc_trans_omp_cancellation_point (gfc_code *code) { @@ -5257,7 +5286,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, { gfc_se se; tree dovar, stmt, from, to, step, type, init, cond, incr, orig_decls; - tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses; + tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses, loop_transform_clauses; stmtblock_t block; stmtblock_t body; gfc_omp_clauses *clauses = code->ext.omp_clauses; @@ -5268,6 +5297,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, vec *saved_doacross_steps = doacross_steps; gfc_expr_list *tile = do_clauses ? do_clauses->tile_list : clauses->tile_list; gfc_code *orig_code = code; + locus top_loc = code->loc; /* Both collapsed and tiled loops are lowered the same way. In OpenACC, those clauses are not compatible, so prioritize the tile @@ -5285,7 +5315,25 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, if (collapse <= 0) collapse = 1; + if (pblock == NULL) + { + gfc_start_block (&block); + pblock = █ + } code = code->block->next; + gcc_assert (code->op == EXEC_DO || code->op == EXEC_OMP_UNROLL); + /* Loop transformation directives surrounding the associated loop of an "omp + do" (or similar directive) are represented as clauses on the "omp do". */ + loop_transform_clauses = NULL; + while (code->op == EXEC_OMP_UNROLL) + { + tree clauses = gfc_trans_omp_clauses (pblock, code->ext.omp_clauses, + code->loc); + loop_transform_clauses = chainon (loop_transform_clauses, clauses); + + code = code->block ? code->block->next : code->next; + } + gcc_assert (code->op != EXEC_OMP_UNROLL); gcc_assert (code->op == EXEC_DO); init = make_tree_vec (collapse); @@ -5293,18 +5341,21 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, incr = make_tree_vec (collapse); orig_decls = clauses->ordered ? make_tree_vec (collapse) : NULL_TREE; - if (pblock == NULL) - { - gfc_start_block (&block); - pblock = █ - } - /* simd schedule modifier is only useful for composite do simd and other constructs including that, where gfc_trans_omp_do is only called on the simd construct and DO's clauses are translated elsewhere. */ do_clauses->sched_simd = false; - omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc); + if (op == EXEC_OMP_UNROLL) + { + /* This is a loop transformation on a loop which is not associated with + any other directive. Use the directive location instead of the loop + location for the clauses. */ + omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, top_loc); + } + else + omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc); + omp_clauses = chainon (omp_clauses, loop_transform_clauses); for (i = 0; i < collapse; i++) { @@ -5558,7 +5609,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, } gcc_assert (local_dovar == dovar || c != NULL); } - if (local_dovar != dovar) + if (local_dovar != dovar && op != EXEC_OMP_UNROLL) { if (op != EXEC_OMP_SIMD || dovar_found == 1) tmp = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE); @@ -5644,6 +5695,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, case EXEC_OMP_LOOP: stmt = make_node (OMP_LOOP); break; case EXEC_OMP_TASKLOOP: stmt = make_node (OMP_TASKLOOP); break; case EXEC_OACC_LOOP: stmt = make_node (OACC_LOOP); break; + case EXEC_OMP_UNROLL: stmt = make_node (OMP_LOOP_TRANS); break; default: gcc_unreachable (); } @@ -7741,6 +7793,7 @@ gfc_trans_omp_directive (gfc_code *code) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TASKLOOP: + case EXEC_OMP_UNROLL: return gfc_trans_omp_do (code, code->op, NULL, code->ext.omp_clauses, NULL); case EXEC_OMP_DISTRIBUTE_PARALLEL_DO: diff --git a/gcc/fortran/trans.cc b/gcc/fortran/trans.cc index f7745add045..56ec59fe80e 100644 --- a/gcc/fortran/trans.cc +++ b/gcc/fortran/trans.cc @@ -2520,6 +2520,7 @@ trans_code (gfc_code * code, tree cond) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: res = gfc_trans_omp_directive (code); break; diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc index 300e9d7ed1e..24ef60059fe 100644 --- a/gcc/gimple-pretty-print.cc +++ b/gcc/gimple-pretty-print.cc @@ -1478,6 +1478,9 @@ dump_gimple_omp_for (pretty_printer *buffer, const gomp_for *gs, int spc, case GF_OMP_FOR_KIND_SIMD: kind = " simd"; break; + case GF_OMP_FOR_KIND_TRANSFORM_LOOP: + kind = " unroll"; + break; default: gcc_unreachable (); } @@ -1515,6 +1518,9 @@ dump_gimple_omp_for (pretty_printer *buffer, const gomp_for *gs, int spc, case GF_OMP_FOR_KIND_SIMD: pp_string (buffer, "#pragma omp simd"); break; + case GF_OMP_FOR_KIND_TRANSFORM_LOOP: + pp_string (buffer, "#pragma omp loop_transform"); + break; default: gcc_unreachable (); } diff --git a/gcc/gimple.h b/gcc/gimple.h index 081d18e425a..213cfc58abb 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -159,6 +159,7 @@ enum gf_mask { GF_OMP_FOR_KIND_TASKLOOP = 2, GF_OMP_FOR_KIND_OACC_LOOP = 4, GF_OMP_FOR_KIND_SIMD = 5, + GF_OMP_FOR_KIND_TRANSFORM_LOOP = 6, GF_OMP_FOR_COMBINED = 1 << 3, GF_OMP_FOR_COMBINED_INTO = 1 << 4, GF_OMP_TARGET_KIND_MASK = (1 << 5) - 1, diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index ade6e335da7..2c160686533 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -5949,6 +5949,7 @@ is_gimple_stmt (tree t) case OACC_CACHE: case OMP_PARALLEL: case OMP_FOR: + case OMP_LOOP_TRANS: case OMP_SIMD: case OMP_DISTRIBUTE: case OMP_LOOP: @@ -12101,6 +12102,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, } break; + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + case OMP_CLAUSE_UNROLL_PARTIAL: + break; case OMP_CLAUSE_NOHOST: default: gcc_unreachable (); @@ -13071,6 +13076,9 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_INCLUSIVE: case OMP_CLAUSE_EXCLUSIVE: + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + case OMP_CLAUSE_UNROLL_PARTIAL: break; case OMP_CLAUSE_NOHOST: @@ -13797,6 +13805,8 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_SIMD: ort = ORT_SIMD; break; + case OMP_LOOP_TRANS: + break; default: gcc_unreachable (); } @@ -14158,8 +14168,19 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) n->value &= ~GOVD_LASTPRIVATE_CONDITIONAL; } } - else - omp_add_variable (gimplify_omp_ctxp, decl, GOVD_PRIVATE | GOVD_SEEN); + else { + if (TREE_CODE(orig_for_stmt) == OMP_LOOP_TRANS) + { + /* This loop is not going to be associated with any + directive after its transformation in + pass-omp_transform_loops. It will be lowered there + and the loop iteration variable will be used in the + context. */ + omp_notice_variable(gimplify_omp_ctxp, decl, true); + } + else + omp_add_variable(gimplify_omp_ctxp, decl, GOVD_PRIVATE | GOVD_SEEN); + } /* If DECL is not a gimple register, create a temporary variable to act as an iteration counter. This is valid, since DECL cannot be @@ -14200,7 +14221,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) c2 = NULL_TREE; } } - else + else if (TREE_CODE (orig_for_stmt) != OMP_LOOP_TRANS) omp_add_variable (gimplify_omp_ctxp, var, GOVD_PRIVATE | GOVD_SEEN); } @@ -14481,6 +14502,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break; case OMP_TASKLOOP: kind = GF_OMP_FOR_KIND_TASKLOOP; break; case OACC_LOOP: kind = GF_OMP_FOR_KIND_OACC_LOOP; break; + case OMP_LOOP_TRANS: kind = GF_OMP_FOR_KIND_TRANSFORM_LOOP; break; default: gcc_unreachable (); } @@ -14665,6 +14687,13 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) gtask_clauses_ptr = &OMP_CLAUSE_CHAIN (c); } break; + /* Move loop transformations to inner loop */ + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + case OMP_CLAUSE_UNROLL_PARTIAL: + *gfor_clauses_ptr = c; + gfor_clauses_ptr = &OMP_CLAUSE_CHAIN (c); + break; default: gcc_unreachable (); } @@ -15105,6 +15134,10 @@ gimplify_omp_loop (tree *expr_p, gimple_seq *pre_p) } pc = &OMP_CLAUSE_CHAIN (*pc); break; + case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + break; default: gcc_unreachable (); } @@ -16886,6 +16919,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, case OMP_FOR: case OMP_DISTRIBUTE: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: ret = gimplify_omp_for (expr_p, pre_p); break; diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc index eefdcb54590..e29d695dcba 100644 --- a/gcc/omp-general.cc +++ b/gcc/omp-general.cc @@ -2253,6 +2253,20 @@ omp_declare_variant_remove_hook (struct cgraph_node *node, void *) } } +/* Return true if C is a clause that represents an OpenMP loop transformation + directive, false otherwise. */ + +bool +omp_loop_transform_clause_p (tree c) +{ + if (c == NULL) + return false; + + enum omp_clause_code code = OMP_CLAUSE_CODE (c); + return (code == OMP_CLAUSE_UNROLL_FULL || code == OMP_CLAUSE_UNROLL_PARTIAL + || code == OMP_CLAUSE_UNROLL_NONE); +} + /* Try to resolve declare variant, return the variant decl if it should be used instead of base, or base otherwise. */ diff --git a/gcc/omp-general.h b/gcc/omp-general.h index 92717db1628..8d6390ad6f6 100644 --- a/gcc/omp-general.h +++ b/gcc/omp-general.h @@ -113,6 +113,7 @@ extern int omp_context_selector_matches (tree); extern int omp_context_selector_set_compare (const char *, tree, tree); extern tree omp_get_context_selector (tree, const char *, const char *); extern tree omp_resolve_declare_variant (tree); +extern bool omp_loop_transform_clause_p (tree); extern tree oacc_launch_pack (unsigned code, tree device, unsigned op); extern tree oacc_replace_fn_attrib_attr (tree attribs, tree dims); extern void oacc_replace_fn_attrib (tree fn, tree dims); diff --git a/gcc/omp-transform-loops.cc b/gcc/omp-transform-loops.cc new file mode 100644 index 00000000000..d845d0e4798 --- /dev/null +++ b/gcc/omp-transform-loops.cc @@ -0,0 +1,1401 @@ +/* OMP loop transformation pass. Transforms loops according to + loop transformations directives such as "omp unroll". + + Copyright (C) 2023 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "pretty-print.h" +#include "diagnostic-core.h" +#include "backend.h" +#include "target.h" +#include "tree.h" +#include "tree-inline.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "tree-pass.h" +#include "gimple-walk.h" +#include "gimple-pretty-print.h" +#include "gimplify.h" +#include "ssa.h" +#include "tree-into-ssa.h" +#include "fold-const.h" +#include "print-tree.h" +#include "omp-general.h" + +/* Context information for walk_omp_for_loops. */ +struct walk_ctx +{ + /* The most recently visited gomp_for that has been transformed and + for which gimple_omp_for_set_combined_into_p returned true. */ + gomp_for *inner_combined_loop; + + /* The innermost bind enclosing the currently visited node. */ + gbind *bind; +}; + +static unsigned int walk_omp_for_loops (gimple_seq *, walk_ctx *); +static enum tree_code omp_adjust_neq_condition (tree v, tree step); + +static bool +non_rectangular_p (const gomp_for *omp_for) +{ + size_t collapse = gimple_omp_for_collapse (omp_for); + for (size_t i = 0; i < collapse; i++) + { + if (TREE_CODE (gimple_omp_for_final (omp_for, i)) == TREE_VEC + || TREE_CODE (gimple_omp_for_initial (omp_for, i)) == TREE_VEC) + return true; + } + + return false; +} + +/* Callback for subst_var. */ + +static tree +subst_var_in_op (tree *t, int *subtrees ATTRIBUTE_UNUSED, void *data) +{ + + auto *wi = (struct walk_stmt_info *)data; + auto from_to = (std::pair *)wi->info; + + if (*t == from_to->first) + { + *t = from_to->second; + wi->changed = true; + } + + return NULL_TREE; +} + +/* Substitute all occurrences of FROM in the operands of the GIMPLE statements + in SEQ by TO. */ + +static void +subst_var (gimple_seq *seq, tree from, tree to) +{ + gcc_assert (VAR_P (from)); + gcc_assert (VAR_P (to)); + + std::pair from_to (from, to); + struct walk_stmt_info wi; + memset (&wi, 0, sizeof (wi)); + wi.info = (void *)&from_to; + + walk_gimple_seq_mod (seq, NULL, subst_var_in_op, &wi); +} + +/* Return the type that should be used for computing the iteration count of a + loop with the given index VAR and upper/lower bound FINAL according to + OpenMP 5.1. */ + +tree +gomp_for_iter_count_type (tree var, tree final) +{ + tree var_type = TREE_TYPE (var); + + if (POINTER_TYPE_P (var_type)) + return ptrdiff_type_node; + + tree operand_type = TREE_TYPE (final); + if (TYPE_UNSIGNED (var_type) && !TYPE_UNSIGNED (operand_type)) + return signed_type_for (operand_type); + + return var_type; +} + +extern tree +gimple_assign_rhs_to_tree (gimple *stmt); + +/* Substitute all definitions from SEQ bottom-up into EXPR. This is used to + reconstruct a tree for a gimplified expression for determinig whether or not + the number of iterations of a loop is constant. */ + +tree +subst_defs (tree expr, gimple_seq seq) +{ + gimple_seq_node last = gimple_seq_last (seq); + gimple_seq_node first = gimple_seq_first (seq); + for (auto n = last; n != NULL; n = n != first ? n->prev : NULL) + { + if (!is_gimple_assign (n)) + continue; + + tree lhs = gimple_assign_lhs (n); + tree rhs = gimple_assign_rhs_to_tree (n); + std::pair from_to (lhs, rhs); + struct walk_stmt_info wi; + memset (&wi, 0, sizeof (wi)); + wi.info = (void *)&from_to; + walk_tree (&expr, subst_var_in_op, &wi, NULL); + expr = fold (expr); + } + + return expr; +} + +/* Return an expression for the number of iterations of the outermost loop of + OMP_FOR. */ + +tree +gomp_for_number_of_iterations (const gomp_for *omp_for, size_t level) +{ + gcc_assert (!non_rectangular_p (omp_for)); + + tree init = gimple_omp_for_initial (omp_for, level); + tree final = gimple_omp_for_final (omp_for, level); + tree_code cond = gimple_omp_for_cond (omp_for, level); + tree index = gimple_omp_for_index (omp_for, level); + tree type = gomp_for_iter_count_type (index, final); + tree step = TREE_OPERAND (gimple_omp_for_incr (omp_for, level), 1); + + init = subst_defs (init, gimple_omp_for_pre_body (omp_for)); + init = fold (init); + final = subst_defs (final, gimple_omp_for_pre_body (omp_for)); + final = fold (final); + + tree_code minus_code = MINUS_EXPR; + tree diff_type = type; + if (POINTER_TYPE_P (TREE_TYPE (final))) + { + minus_code = POINTER_DIFF_EXPR; + diff_type = ptrdiff_type_node; + } + + tree diff; + if (cond == GT_EXPR) + diff = fold_build2 (minus_code, diff_type, init, final); + else if (cond == LT_EXPR) + diff = fold_build2 (minus_code, diff_type, final, init); + else + gcc_unreachable (); + + diff = fold_build2 (CEIL_DIV_EXPR, type, diff, step); + diff = fold_build1 (ABS_EXPR, type, diff); + + return diff; +} + +/* Return true if the expression representing the number of iterations for + OMP_FOR is a constant expression, false otherwise. */ + +bool +gomp_for_constant_iterations_p (gomp_for *omp_for, + unsigned HOST_WIDE_INT *iterations) +{ + tree t = gomp_for_number_of_iterations (omp_for, 0); + if (!TREE_CONSTANT (t) + || !tree_fits_uhwi_p (t)) + return false; + + *iterations = tree_to_uhwi (t); + return true; +} + +/* Split a gomp_for that represents a collapsed loop-nest into single + loops. The result is a gomp_for of the same kind which is not collapsed + (i.e. gimple_omp_for_collapse (OMP_FOR) == 1) and which contains nested, + non-collapsed gomp_for loops whose kind is GF_OMP_FOR_KIND_TRANSFORM_LOOP + (i.e. they will be lowered into plain, non-omp loops by this pass) for each + of the loops of OMP_FOR. All loops whose depth is strictly less than + FROM_DEPTH are left collapsed. */ + +static gomp_for* +gomp_for_uncollapse (gomp_for *omp_for, int from_depth = 0) +{ + int collapse = gimple_omp_for_collapse (omp_for); + gcc_assert (from_depth < collapse); + + if (collapse <= 1) + return omp_for; + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, omp_for, + "Uncollapsing loop:\n %G\n", + static_cast (omp_for)); + + gimple_seq body = gimple_omp_body (omp_for); + gomp_for *level_omp_for = omp_for; + for (int level = collapse - 1; level >= from_depth; level--) + { + level_omp_for = gimple_build_omp_for (body, + GF_OMP_FOR_KIND_TRANSFORM_LOOP, + NULL, 1, NULL); + gimple_omp_for_set_cond (level_omp_for, 0, + gimple_omp_for_cond (omp_for, level)); + gimple_omp_for_set_initial (level_omp_for, 0, + gimple_omp_for_initial (omp_for, level)); + gimple_omp_for_set_final (level_omp_for, 0, + gimple_omp_for_final (omp_for, level)); + gimple_omp_for_set_incr (level_omp_for, 0, + gimple_omp_for_incr (omp_for, level)); + gimple_omp_for_set_index (level_omp_for, 0, + gimple_omp_for_index (omp_for, level)); + + body = level_omp_for; + } + + omp_for->collapse = from_depth; + + if (from_depth > 0) + { + gimple_omp_set_body (omp_for, body); + return omp_for; + } + + gimple_omp_for_set_clauses (level_omp_for, gimple_omp_for_clauses (omp_for)); + gimple_omp_for_set_pre_body (level_omp_for, gimple_omp_for_pre_body (omp_for)); + gimple_omp_for_set_combined_into_p (level_omp_for, + gimple_omp_for_combined_into_p (omp_for)); + gimple_omp_for_set_combined_p (level_omp_for, + gimple_omp_for_combined_p (omp_for)); + + return level_omp_for; +} + +static tree +build_loop_exit_cond (tree index, tree_code cond, tree final, gimple_seq *seq) +{ + tree exit_cond + = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, + fold_build2 (cond, boolean_type_node, index, final)); + tree res = create_tmp_var (boolean_type_node); + gimplify_assign (res, exit_cond, seq); + + return res; +} + +/* Returns a register that contains the final value of a loop as described by + FINAL. This is necessary for non-rectangular loops. */ + +static tree +build_loop_final (tree final, gimple_seq *seq) +{ + if (TREE_CODE (final) != TREE_VEC) /* rectangular loop-nest */ + return final; + + tree coeff = TREE_VEC_ELT (final, 0); + tree outer_var = TREE_VEC_ELT (final, 1); + tree constt = TREE_VEC_ELT (final, 2); + + tree type = TREE_TYPE (outer_var); + tree val = fold_build2 (MULT_EXPR, type, coeff, outer_var); + val = fold_build2 (PLUS_EXPR, type, val, constt); + + tree res = create_tmp_var (type); + gimplify_assign (res, val, seq); + + return res; +} + +/* Unroll the loop BODY UNROLL_FACTOR times, replacing the INDEX + variable by a local copy in each copy of the body that will be + incremented as specified by INCR. If BUILD_EXIT_CONDS is true, + insert a test of the loop exit condition given COND and FINAL + before each copy of the body that will exit the loop if the value + of the local index variable satisfies the loop exit condition. + + For example, the unrolling with BUILD_EXIT_CONDS == true turns + + for (i = 0; i < 3; i = i + 1) + { + BODY + } + + into + + for (i = 0; i < n; i = i + 1) + { + i.0 = i + if (!(i_0 < n)) + goto exit + BODY_COPY_1[i/i.0] i.e. index var i replaced by i.0 + if (!(i_1 < n)) + goto exit + i.1 = i.0 + 1 + BODY_COPY_2[i/i.1] + if (!(i_3 < n)) + goto exit + i.2 = i.2 + 1 + BODY_COPY_3[i/i.2] + exit: + } + */ +static gimple_seq +build_unroll_body (gimple_seq body, tree unroll_factor, tree index, tree incr, + bool build_exit_conds = false, tree final = NULL_TREE, + tree_code *cond = NULL) +{ + gcc_assert ((!build_exit_conds && !final && !cond) + || (build_exit_conds && final && cond)); + + gimple_seq new_body = NULL; + + push_gimplify_context (); + + if (build_exit_conds) + final = build_loop_final (final, &new_body); + + tree local_index = create_tmp_var (TREE_TYPE (index)); + subst_var (&body, index, local_index); + tree local_incr = unshare_expr (incr); + TREE_OPERAND (local_incr, 0) = local_index; + + tree exit_label = create_artificial_label (gimple_location (body)); + + unsigned HOST_WIDE_INT n = tree_to_uhwi (unroll_factor); + for (unsigned HOST_WIDE_INT i = 0; i < n; i++) + { + if (i == 0) + gimplify_assign (local_index, index, &new_body); + else + gimplify_assign (local_index, local_incr, &new_body); + + tree body_copy_label = create_artificial_label (gimple_location (body)); + + if (build_exit_conds) + { + tree exit_cond + = build_loop_exit_cond (local_index, *cond, final, &new_body); + gimple_seq_add_stmt ( + &new_body, + gimple_build_cond (EQ_EXPR, exit_cond, boolean_true_node, + exit_label, body_copy_label)); + } + + gimple_seq body_copy = copy_gimple_seq_and_replace_locals (body); + gimple_seq_add_stmt (&new_body, gimple_build_label (body_copy_label)); + gimple_seq_add_seq (&new_body, body_copy); + } + + + gbind *bind = gimple_build_bind (NULL, new_body, NULL); + pop_gimplify_context (bind); + + gimple_seq result = NULL; + gimple_seq_add_stmt (&result, bind); + gimple_seq_add_stmt (&result, gimple_build_label (exit_label)); + return result; +} + +static gimple_seq transform_gomp_for (gomp_for *, tree, walk_ctx *ctx); + +/* Execute the partial unrolling transformation for OMP_FOR with the given + UNROLL_FACTOR and return the resulting gimple bind. LOC is the location for + diagnostic messages. + + Example + -------- + -------- + + Original loop + ------------- + + #pragma omp for unroll_partial(3) + for (i = 0; i < 100; i = i + 1) + { + BODY + } + + gets, roughly, translated to + + { + #pragma omp for + for (i = 0; i < 100; i = i + 3) + { + i.0 = i + if i.0 > 100: + goto exit_label + BODY_COPY_1[i/i.0] i.e. index var replaced + i.1 = i + 1 + if i.1 > 100: + goto exit_label + BODY_COPY_2[i/1.1] + i.2 = i + 2 + if i.2 > 100: + goto exit_label + BODY_COPY_3[i/i.2] + + exit_label: + } + */ + +/* FIXME The value of the loop counter of the transformed loop is +currently unspecified. OpenMP 5.2 does not define what the value +should be. There is an open OpenMP spec issue ("Loop counter value +after transform: Misc 6.0: Loop transformations #3440") in the +non-public OpenMP spec repository. */ + +static gimple_seq +partial_unroll (gomp_for *omp_for, tree unroll_factor, + location_t loc, tree transformation_clauses, walk_ctx *ctx) +{ + gcc_assert (unroll_factor); + gcc_assert ( + OMP_CLAUSE_CODE (transformation_clauses) == OMP_CLAUSE_UNROLL_PARTIAL + || OMP_CLAUSE_CODE (transformation_clauses) == OMP_CLAUSE_UNROLL_NONE); + + /* Partial unrolling reduces the loop nest depth of a canonical loop nest to 1 + hence outer directives cannot require a greater collapse. */ + gcc_assert (gimple_omp_for_collapse (omp_for) <= 1); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, + dump_user_location_t::from_location_t (loc), + "Partially unrolling loop:\n %G\n", + static_cast (omp_for)); + + gomp_for *unrolled_for = as_a (copy_gimple_seq_and_replace_locals (omp_for)); + + tree final = gimple_omp_for_final (unrolled_for, 0); + tree incr = gimple_omp_for_incr (unrolled_for, 0); + tree index = gimple_omp_for_index (unrolled_for, 0); + gimple_seq body = gimple_omp_body (unrolled_for); + + tree_code cond = gimple_omp_for_cond (unrolled_for, 0); + tree step = TREE_OPERAND (incr, 1); + gimple_omp_set_body (unrolled_for, + build_unroll_body (body, unroll_factor, index, incr, + true, final, &cond)); + + gbind *result_bind = gimple_build_bind (NULL, NULL, NULL); + + push_gimplify_context (); + + tree scaled_step + = fold_build2 (MULT_EXPR, TREE_TYPE (step), + fold_convert (TREE_TYPE (step), unroll_factor), step); + + /* For combined constructs, step will be gimplified on the outer + gomp_for. */ + if (!gimple_omp_for_combined_into_p (omp_for) + && !TREE_CONSTANT (scaled_step)) + { + tree var = create_tmp_var (TREE_TYPE (step), ".omp_unroll_step"); + gimplify_assign (var, scaled_step, + gimple_omp_for_pre_body_ptr (unrolled_for)); + scaled_step = var; + } + TREE_OPERAND (incr, 1) = scaled_step; + gimple_omp_for_set_incr (unrolled_for, 0, incr); + + pop_gimplify_context (result_bind); + + if (gimple_omp_for_combined_into_p (omp_for)) + ctx->inner_combined_loop = unrolled_for; + + tree remaining_clauses = OMP_CLAUSE_CHAIN (transformation_clauses); + gimple_seq_add_stmt ( + gimple_bind_body_ptr (result_bind), + transform_gomp_for (unrolled_for, remaining_clauses, ctx)); + + return result_bind; +} + +static gimple_seq +full_unroll (gomp_for *omp_for, location_t loc, walk_ctx *ctx ATTRIBUTE_UNUSED) +{ + tree init = gimple_omp_for_initial (omp_for, 0); + unsigned HOST_WIDE_INT niter = 0; + if (!gomp_for_constant_iterations_p (omp_for, &niter)) + { + error_at (loc, "Cannot apply full unrolling to loop with " + "non-constant number of iterations"); + return omp_for; + } + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, + dump_user_location_t::from_location_t (loc), + "Fully unrolling loop with " + HOST_WIDE_INT_PRINT_UNSIGNED + " iterations :\n %G\n", niter, + static_cast (omp_for)); + + tree incr = gimple_omp_for_incr (omp_for, 0); + tree index = gimple_omp_for_index (omp_for, 0); + gimple_seq body = gimple_omp_body (omp_for); + + tree unroll_factor = build_int_cst (TREE_TYPE (init), niter); + + gimple_seq unrolled = NULL; + gimple_seq_add_seq (&unrolled, gimple_omp_for_pre_body (omp_for)); + push_gimplify_context (); + gimple_seq_add_seq (&unrolled, + build_unroll_body (body, unroll_factor, index, incr)); + + gbind *result_bind = gimple_build_bind (NULL, unrolled, NULL); + pop_gimplify_context (result_bind); + return result_bind; +} + +/* Decides if the OMP_FOR for which the user did not specify the type of + unrolling to apply in the 'unroll' directive represented by the TRANSFORM + clause should be fully unrolled. */ + +static bool +assign_unroll_full_clause_p (gomp_for *omp_for, tree transform) +{ + gcc_assert (OMP_CLAUSE_CODE (transform) == OMP_CLAUSE_UNROLL_NONE); + gcc_assert (OMP_CLAUSE_CHAIN (transform) == NULL); + + /* Full unrolling turns the loop into a non-loop and hence + the following transformations would fail. */ + if (TREE_CHAIN (transform) != NULL_TREE) + return false; + + unsigned HOST_WIDE_INT num_iters; + if (!gomp_for_constant_iterations_p (omp_for, &num_iters) + || num_iters + > (unsigned HOST_WIDE_INT)param_omp_unroll_full_max_iterations) + return false; + + if (dump_enabled_p ()) + { + auto loc = dump_user_location_t::from_location_t ( + OMP_CLAUSE_LOCATION (transform)); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc, + "assigned % clause to % with small " + "constant number of iterations\n"); + } + + return true; +} + +/* If the OMP_FOR for which the user did not specify the type of unrolling in + the 'unroll' directive in the TRANSFORM clause should be partially unrolled, + return the unroll factor, otherwise return null. */ + +static tree +assign_unroll_partial_clause_p (gomp_for *omp_for ATTRIBUTE_UNUSED, + tree transform) +{ + gcc_assert (OMP_CLAUSE_CODE (transform) == OMP_CLAUSE_UNROLL_NONE); + + if (param_omp_unroll_default_factor == 0) + return NULL; + + tree unroll_factor + = build_int_cst (integer_type_node, param_omp_unroll_default_factor); + + if (dump_enabled_p ()) + { + auto loc = dump_user_location_t::from_location_t ( + OMP_CLAUSE_LOCATION (transform)); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc, + "added % clause to % directive\n", + param_omp_unroll_default_factor); + } + + return unroll_factor; +} + +/* Generate the code for an OMP_FOR that represents the result of a + loop transformation which is not associated with any directive and + which will hence not be lowered in the omp-expansion. */ + +static gimple_seq +expand_transformed_loop (gomp_for *omp_for) +{ + gcc_assert (gimple_omp_for_kind (omp_for) + == GF_OMP_FOR_KIND_TRANSFORM_LOOP); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, omp_for, + "Expanding loop:\n %G\n", + static_cast (omp_for)); + + push_gimplify_context (); + + omp_for = gomp_for_uncollapse (omp_for); + + tree incr = gimple_omp_for_incr (omp_for, 0); + tree index = gimple_omp_for_index (omp_for, 0); + tree init = gimple_omp_for_initial (omp_for, 0); + tree final = gimple_omp_for_final (omp_for, 0); + tree_code cond = gimple_omp_for_cond (omp_for, 0); + gimple_seq body = gimple_omp_body (omp_for); + gimple_seq pre_body = gimple_omp_for_pre_body (omp_for); + + gimple_seq loop = NULL; + + tree exit_label = create_artificial_label (UNKNOWN_LOCATION); + tree cycle_label = create_artificial_label (UNKNOWN_LOCATION); + tree body_label = create_artificial_label (UNKNOWN_LOCATION); + + gimple_seq_add_seq (&loop, pre_body); + gimplify_assign (index, init, &loop); + tree final_var = final; + if (TREE_CODE (final) != VAR_DECL) + { + final_var = create_tmp_var (TREE_TYPE (final)); + gimplify_assign (final_var, final, &loop); + } + + gimple_seq_add_stmt (&loop, gimple_build_label (cycle_label)); + gimple_seq_add_stmt (&loop, gimple_build_cond (cond, index, final_var, + body_label, exit_label)); + gimple_seq_add_stmt (&loop, gimple_build_label (body_label)); + gimple_seq_add_seq (&loop, body); + gimplify_assign (index, incr, &loop); + gimple_seq_add_stmt (&loop, gimple_build_goto (cycle_label)); + gimple_seq_add_stmt (&loop, gimple_build_label (exit_label)); + + gbind *bind = gimple_build_bind (NULL, loop, NULL); + pop_gimplify_context (bind); + + return bind; +} + +static enum tree_code +omp_adjust_neq_condition (tree v, tree step) +{ + gcc_assert (TREE_CODE (step) == INTEGER_CST); + if (TREE_CODE (TREE_TYPE (v)) == INTEGER_TYPE) + { + if (integer_onep (step)) + return LT_EXPR; + else + { + gcc_assert (integer_minus_onep (step)); + return GT_EXPR; + } + } + else + { + tree unit = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (v))); + gcc_assert (TREE_CODE (unit) == INTEGER_CST); + if (tree_int_cst_equal (unit, step)) + return LT_EXPR; + else + { + gcc_assert (wi::neg (wi::to_widest (unit)) + == wi::to_widest (step)); + return GT_EXPR; + } + } +} + +/* Adjust *COND_CODE and *N2 so that the former is either LT_EXPR or GT_EXPR, + given that V is the loop index variable and STEP is loop step. + + This function has been derived from omp_adjust_for_condition. + In contrast to the original function it does not add 1 or + -1 to the the final value when converting <=,>= to <,> + for a pointer-type index variable. Instead, this function + adds or subtracts the type size in bytes. This is necessary + to determine the number of iterations correctly. */ + +void +omp_adjust_for_condition2 (location_t loc, enum tree_code *cond_code, tree *n2, + tree v, tree step) +{ + switch (*cond_code) + { + case LT_EXPR: + case GT_EXPR: + break; + + case NE_EXPR: + *cond_code = omp_adjust_neq_condition (v, step); + break; + + case LE_EXPR: + if (POINTER_TYPE_P (TREE_TYPE (*n2))) + { + tree unit = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (v))); + HOST_WIDE_INT type_unit = tree_to_shwi (unit); + + *n2 = fold_build_pointer_plus_hwi_loc (loc, *n2, type_unit); + } + else + *n2 = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (*n2), *n2, + build_int_cst (TREE_TYPE (*n2), 1)); + *cond_code = LT_EXPR; + break; + case GE_EXPR: + if (POINTER_TYPE_P (TREE_TYPE (*n2))) + { + tree unit = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (v))); + HOST_WIDE_INT type_unit = tree_to_shwi (unit); + *n2 = fold_build_pointer_plus_hwi_loc (loc, *n2, -1 * type_unit); + } + else + *n2 = fold_build2_loc (loc, MINUS_EXPR, TREE_TYPE (*n2), *n2, + build_int_cst (TREE_TYPE (*n2), 1)); + *cond_code = GT_EXPR; + break; + default: + gcc_unreachable (); + } +} + +/* Transform the condition of OMP_FOR to either LT_EXPR or GT_EXPR and adjust + the final value as necessary. */ + +static bool +canonicalize_conditions (gomp_for *omp_for) +{ + size_t collapse = gimple_omp_for_collapse (omp_for); + location_t loc = gimple_location (omp_for); + bool new_decls = false; + + gimple_seq *pre_body = gimple_omp_for_pre_body_ptr (omp_for); + for (size_t l = 0; l < collapse; l++) + { + enum tree_code cond = gimple_omp_for_cond (omp_for, l); + + if (cond == LT_EXPR || cond == GT_EXPR) + continue; + + tree incr = gimple_omp_for_incr (omp_for, l); + tree step = omp_get_for_step_from_incr (loc, incr); + tree index = gimple_omp_for_index (omp_for, l); + tree final = gimple_omp_for_final (omp_for, l); + tree orig_final = final; + /* If final refers to the index variable of an outer level, i.e. + the loop nest is non-rectangular, only convert NE_EXPR. This + is necessary for unrolling. Unrolling needs to multiply the + step by the unrolling factor, but non-constant step values + are impossible with NE_EXPR. */ + if (TREE_CODE (final) == TREE_VEC) + { + cond = omp_adjust_neq_condition (TREE_VEC_ELT (final, 1), + TREE_OPERAND (incr, 1)); + gimple_omp_for_set_cond (omp_for, l, cond); + continue; + } + + omp_adjust_for_condition2 (loc, &cond, &final, index, step); + + gimple_omp_for_set_cond (omp_for, l, cond); + if (final == orig_final) + continue; + + /* If this is a combined construct, gimplify the final on the + outer construct. */ + if (TREE_CODE (final) != INTEGER_CST + && !gimple_omp_for_combined_into_p (omp_for)) + { + tree new_final = create_tmp_var (TREE_TYPE (final)); + gimplify_assign (new_final, final, pre_body); + final = new_final; + new_decls = true; + } + + gimple_omp_for_set_final (omp_for, l, final); + } + + return new_decls; +} + +/* Combined distribute or taskloop constructs are represented by two + or more nested gomp_for constructs which are created during + gimplification. Loop transformations on the combined construct are + executed on the innermost gomp_for. This function adjusts the loop + header of an outer OMP_FOR loop to the changes made by the + transformations on the inner loop which is provided by the CTX. */ + +static gimple_seq +adjust_combined_loop (gomp_for *omp_for, walk_ctx *ctx) +{ + gcc_assert (gimple_omp_for_combined_p (omp_for)); + gcc_assert (ctx->inner_combined_loop); + + gomp_for *inner_omp_for = ctx->inner_combined_loop; + size_t collapse = gimple_omp_for_collapse (inner_omp_for); + + int kind = gimple_omp_for_kind (omp_for); + if (kind == GF_OMP_FOR_KIND_DISTRIBUTE || kind == GF_OMP_FOR_KIND_TASKLOOP) + { + for (size_t level = 0; level < collapse; ++level) + { + tree outer_incr = gimple_omp_for_incr (omp_for, level); + tree inner_incr = gimple_omp_for_incr (inner_omp_for, level); + gcc_assert (TREE_TYPE (inner_incr) == TREE_TYPE (outer_incr)); + + tree inner_final = gimple_omp_for_final (inner_omp_for, level); + enum tree_code inner_cond + = gimple_omp_for_cond (inner_omp_for, level); + gimple_omp_for_set_cond (omp_for, level, inner_cond); + + tree inner_step = TREE_OPERAND (inner_incr, 1); + /* If this omp_for is the outermost loop belonging to a + combined construct, gimplify the step into its + prebody. Otherwise, just gimplify the step on the inner + gomp_for and move the ungimplified step expression + here. */ + if (!gimple_omp_for_combined_into_p (omp_for) + && !TREE_CONSTANT (inner_step)) + { + push_gimplify_context (); + tree step = create_tmp_var (TREE_TYPE (inner_incr), + ".omp_combined_step"); + gimplify_assign (step, inner_step, + gimple_omp_for_pre_body_ptr (omp_for)); + pop_gimplify_context (ctx->bind); + TREE_OPERAND (outer_incr, 1) = step; + } + else + TREE_OPERAND (outer_incr, 1) = inner_step; + + if (!gimple_omp_for_combined_into_p (omp_for) + && !TREE_CONSTANT (inner_final)) + { + push_gimplify_context (); + tree final = create_tmp_var (TREE_TYPE (inner_final), + ".omp_combined_final"); + gimplify_assign (final, inner_final, + gimple_omp_for_pre_body_ptr (omp_for)); + pop_gimplify_context (ctx->bind); + gimple_omp_for_set_final (omp_for, level, final); + } + else + gimple_omp_for_set_final (omp_for, level, inner_final); + + /* Gimplify the step on the inner loop of the combined construct. */ + if (!TREE_CONSTANT (inner_step)) + { + push_gimplify_context (); + tree step = create_tmp_var (TREE_TYPE (inner_incr), + ".omp_combined_step"); + gimplify_assign (step, inner_step, + gimple_omp_for_pre_body_ptr (inner_omp_for)); + TREE_OPERAND (inner_incr, 1) = step; + pop_gimplify_context (ctx->bind); + + tree private_clause = build_omp_clause ( + gimple_location (omp_for), OMP_CLAUSE_PRIVATE); + OMP_CLAUSE_DECL (private_clause) = step; + tree *clauses = gimple_omp_for_clauses_ptr (inner_omp_for); + *clauses = chainon (*clauses, private_clause); + } + + /* Gimplify the final on the inner loop of the combined construct. */ + if (!TREE_CONSTANT (inner_final)) + { + push_gimplify_context (); + tree final = create_tmp_var (TREE_TYPE (inner_incr), + ".omp_combined_final"); + gimplify_assign (final, inner_final, + gimple_omp_for_pre_body_ptr (inner_omp_for)); + gimple_omp_for_set_final (inner_omp_for, level, final); + pop_gimplify_context (ctx->bind); + + tree private_clause = build_omp_clause ( + gimple_location (omp_for), OMP_CLAUSE_PRIVATE); + OMP_CLAUSE_DECL (private_clause) = final; + tree *clauses = gimple_omp_for_clauses_ptr (inner_omp_for); + *clauses = chainon (*clauses, private_clause); + } + } + } + + if (gimple_omp_for_combined_into_p (omp_for)) + ctx->inner_combined_loop = omp_for; + else + ctx->inner_combined_loop = NULL; + + return omp_for; +} + +/* Transform OMP_FOR recursively according to the clause chain + TRANSFORMATION. Return the resulting sequence of gimple statements. + + This function dispatches OMP_FOR to the handler function for the + TRANSFORMATION clause. The handler function is responsible for invoking this + function recursively for executing the remaining transformations. */ + +static gimple_seq +transform_gomp_for (gomp_for *omp_for, tree transformation, walk_ctx *ctx) +{ + if (!transformation) + { + if (gimple_omp_for_kind (omp_for) == GF_OMP_FOR_KIND_TRANSFORM_LOOP) + return expand_transformed_loop (omp_for); + + return omp_for; + } + + push_gimplify_context (); + + bool added_decls = canonicalize_conditions (omp_for); + + gimple_seq result = NULL; + location_t loc = OMP_CLAUSE_LOCATION (transformation); + auto dump_loc = dump_user_location_t::from_location_t (loc); + switch (OMP_CLAUSE_CODE (transformation)) + { + case OMP_CLAUSE_UNROLL_FULL: + gcc_assert (TREE_CHAIN (transformation) == NULL); + result = full_unroll (omp_for, loc, ctx); + break; + case OMP_CLAUSE_UNROLL_NONE: + gcc_assert (TREE_CHAIN (transformation) == NULL); + if (assign_unroll_full_clause_p (omp_for, transformation)) + { + result = full_unroll (omp_for, loc, ctx); + } + else if (tree unroll_factor + = assign_unroll_partial_clause_p (omp_for, transformation)) + { + result = partial_unroll (omp_for, unroll_factor, loc, + transformation, ctx); + } + else { + if (dump_enabled_p ()) + { + /* TODO Try to inform the unrolling pass that the user + wants to unroll this loop. This could relax some + restrictions there, e.g. on the code size? */ + dump_printf_loc ( + MSG_MISSED_OPTIMIZATION, dump_loc, + "not unrolling loop with % directive. Add " + "clause to specify unrolling type or invoke the " + "compiler with --param=omp-unroll-default-factor=n for some" + "constant integer n"); + } + result = transform_gomp_for (omp_for, NULL, ctx); + } + + break; + case OMP_CLAUSE_UNROLL_PARTIAL: + { + tree unroll_factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (transformation); + if (!unroll_factor) + { + // TODO Use target architecture dependent constants? + unsigned factor = param_omp_unroll_default_factor > 0 + ? param_omp_unroll_default_factor + : 5; + unroll_factor = build_int_cst (integer_type_node, factor); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, dump_loc, + "% clause without unrolling " + "factor turned into % clause\n", + factor); + } + result = partial_unroll (omp_for, unroll_factor, loc, transformation, + ctx); + } + break; + default: + gcc_unreachable (); + } + + if (added_decls && gimple_code (result) != GIMPLE_BIND) + result = gimple_build_bind (NULL, result, NULL); + pop_gimplify_context (added_decls ? result : NULL); /* for decls from canonicalize_loops */ + + return result; +} + +/* Remove all loop transformation clauses from the clauses of OMP_FOR and + return a new tree chain containing just those clauses. + + The clauses correspond to transformation *directives* associated with the + OMP_FOR's loop. The returned clauses are ordered from the innermost + directive to the outermost, i.e. in the order in which the transformations + should execute. + + Example: + -------- + -------- + + The loop + + #pragma omp for nowait + #pragma omp unroll partial(5) + #pragma omp tile sizes(2,2) + LOOP + + is represented as + + #pragma omp for nowait unroll_partial(5) tile_sizes(2,2) + LOOP + + Gimplification may add clauses after the transformation clauses added + by the front ends. This function will leave only the "nowait" clause on + OMP_FOR and return the clauses "tile_sizes(2,2) unroll_partial(5)". */ + +static tree +gomp_for_remove_transformation_clauses (gomp_for *omp_for) +{ + tree *clauses = gimple_omp_for_clauses_ptr (omp_for); + tree trans_clauses = NULL; + tree last_other_clause = NULL; + + for (tree c = gimple_omp_for_clauses (omp_for); c != NULL_TREE;) + { + tree chain_tail = OMP_CLAUSE_CHAIN (c); + if (omp_loop_transform_clause_p (c)) + { + if (last_other_clause) + OMP_CLAUSE_CHAIN (last_other_clause) = chain_tail; + else + *clauses = OMP_CLAUSE_CHAIN (c); + + OMP_CLAUSE_CHAIN (c) = NULL; + trans_clauses = chainon (trans_clauses, c); + } + else + { + /* There should be no other clauses between loop transformations ... */ + gcc_assert (!trans_clauses || !last_other_clause + || TREE_CHAIN (last_other_clause) == c); + /* ... and hence stop if transformations were found before the + non-transformation clause C. */ + if (trans_clauses) + break; + last_other_clause = c; + } + + c = chain_tail; + } + + return nreverse (trans_clauses); +} + +static void +print_optimized_unroll_partial_msg (tree c) +{ + gcc_assert (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_UNROLL_PARTIAL); + location_t loc = OMP_CLAUSE_LOCATION (c); + dump_user_location_t dump_loc; + dump_loc = dump_user_location_t::from_location_t (loc); + + tree unroll_factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, dump_loc, + "replaced consecutive % directives by " + "%\n", tree_to_uhwi (unroll_factor)); +} + +/* Optimize CLAUSES by removing and merging redundant clauses. Return the + optimized clause chain. */ + +static tree +optimize_transformation_clauses (tree clauses) +{ + /* The last unroll_partial clause seen in clauses, if any, + or the last merged unroll partial clause. */ + tree unroll_partial = NULL; + /* The last clause was not a unroll_partial clause, if any. + unroll_full and unroll_none are not relevant because + they appear only at the end of a chain. */ + tree last_non_unroll = NULL; + /* Indicates that at least two unroll_partial clauses have been merged + since last_non_unroll was seen. */ + bool merged_unroll_partial = false; + + for (tree c = clauses; c != NULL_TREE; c = OMP_CLAUSE_CHAIN (c)) + { + enum omp_clause_code code = OMP_CLAUSE_CODE (c); + + switch (code) + { + case OMP_CLAUSE_UNROLL_NONE: + /* 'unroll' without a clause cannot be followed by any + transformations because its result does not have canonical loop + nest form. */ + gcc_assert (OMP_CLAUSE_CHAIN (c) == NULL); + unroll_partial = NULL; + merged_unroll_partial = false; + break; + case OMP_CLAUSE_UNROLL_FULL: + /* 'unroll full' cannot be followed by any transformations because + its result does not have canonical loop nest form. */ + gcc_assert (OMP_CLAUSE_CHAIN (c) == NULL); + + /* Previous 'unroll partial' directives are useless. */ + if (unroll_partial) + { + if (last_non_unroll) + OMP_CLAUSE_CHAIN (last_non_unroll) = c; + else + clauses = c; + + if (dump_enabled_p ()) + { + location_t loc = OMP_CLAUSE_LOCATION (c); + dump_user_location_t dump_loc; + dump_loc = dump_user_location_t::from_location_t (loc); + + dump_printf_loc ( + MSG_OPTIMIZED_LOCATIONS, dump_loc, + "removed useless % directives " + "preceding 'omp unroll full'\n"); + } + } + unroll_partial = NULL; + merged_unroll_partial = false; + break; + case OMP_CLAUSE_UNROLL_PARTIAL: + { + /* Merge a sequence of consecutive 'unroll partial' directives. + Note that it impossible for 'unroll full' or 'unroll' to + appear inbetween the 'unroll partial' clauses because they + remove the loop-nest. */ + if (unroll_partial) + { + tree factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (unroll_partial); + tree c_factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c); + if (factor && c_factor) + factor = fold_build2 (MULT_EXPR, TREE_TYPE (factor), factor, + c_factor); + else if (!factor && c_factor) + factor = c_factor; + + gcc_assert (!factor || TREE_CODE (factor) == INTEGER_CST); + + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (unroll_partial) = factor; + OMP_CLAUSE_CHAIN (unroll_partial) = OMP_CLAUSE_CHAIN (c); + OMP_CLAUSE_LOCATION (unroll_partial) = OMP_CLAUSE_LOCATION (c); + merged_unroll_partial = true; + } + else + unroll_partial = c; + } + break; + default: + gcc_unreachable (); + } + } + + if (merged_unroll_partial && dump_enabled_p ()) + print_optimized_unroll_partial_msg (unroll_partial); + + return clauses; +} + +/* Visit the current statement in GSI_P in the walk_omp_for_loops walk and + execute all loop transformations found on it. */ + +void +process_omp_for (gomp_for *omp_for, gimple_seq *containing_seq, walk_ctx *ctx) +{ + auto gsi_p = gsi_for_stmt (omp_for, containing_seq); + tree transform_clauses = gomp_for_remove_transformation_clauses (omp_for); + + /* Do not attempt to transform broken code which might violate the + assumptions of the loop transformation implementations. + + Transformation clauses must be dropped first because following + passes do not handle them. */ + if (seen_error ()) + return; + + transform_clauses = optimize_transformation_clauses (transform_clauses); + + gimple *transformed = omp_for; + if (gimple_omp_for_combined_p (omp_for) + && ctx->inner_combined_loop) + transformed = adjust_combined_loop (omp_for, ctx); + else + transformed = transform_gomp_for (omp_for, transform_clauses, ctx); + + if (transformed == omp_for) + return; + + gsi_replace_with_seq (&gsi_p, transformed, true); + + if (!dump_enabled_p () || !(dump_flags & TDF_DETAILS)) + return; + + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, transformed, + "Transformed loop: %G\n\n", transformed); +} + +/* Traverse SEQ in depth-first order and apply the loop transformation + found on gomp_for statements. */ + +static unsigned int +walk_omp_for_loops (gimple_seq *seq, walk_ctx *ctx) +{ + gimple_stmt_iterator gsi; + for (gsi = gsi_start (*seq); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + switch (gimple_code (stmt)) + { + case GIMPLE_OMP_CRITICAL: + case GIMPLE_OMP_MASTER: + case GIMPLE_OMP_MASKED: + case GIMPLE_OMP_TASKGROUP: + case GIMPLE_OMP_ORDERED: + case GIMPLE_OMP_SCAN: + case GIMPLE_OMP_SECTION: + case GIMPLE_OMP_PARALLEL: + case GIMPLE_OMP_TASK: + case GIMPLE_OMP_SCOPE: + case GIMPLE_OMP_SECTIONS: + case GIMPLE_OMP_SINGLE: + case GIMPLE_OMP_TARGET: + case GIMPLE_OMP_TEAMS: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_omp_body_ptr (stmt), ctx); + ctx->bind = bind; + break; + } + case GIMPLE_OMP_FOR: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_omp_for_pre_body_ptr (stmt), ctx); + walk_omp_for_loops (gimple_omp_body_ptr (stmt), ctx); + ctx->bind = bind; + process_omp_for (as_a (stmt), seq, ctx); + break; + } + case GIMPLE_BIND: + { + gbind *bind = as_a (stmt); + ctx->bind = bind; + walk_omp_for_loops (gimple_bind_body_ptr (bind), ctx); + ctx->bind = bind; + break; + } + case GIMPLE_TRY: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_try_eval_ptr (as_a (stmt)), + ctx); + walk_omp_for_loops (gimple_try_cleanup_ptr (as_a (stmt)), + ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_CATCH: + { + gbind *bind = ctx->bind; + walk_omp_for_loops ( + gimple_catch_handler_ptr (as_a (stmt)), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_EH_FILTER: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_eh_filter_failure_ptr (stmt), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_EH_ELSE: + { + gbind *bind = ctx->bind; + geh_else *eh_else_stmt = as_a (stmt); + walk_omp_for_loops (gimple_eh_else_n_body_ptr (eh_else_stmt), ctx); + walk_omp_for_loops (gimple_eh_else_e_body_ptr (eh_else_stmt), ctx); + ctx->bind = bind; + break; + } + break; + + case GIMPLE_WITH_CLEANUP_EXPR: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_wce_cleanup_ptr (stmt), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_TRANSACTION: + { + gbind *bind = ctx->bind; + auto trans = as_a (stmt); + walk_omp_for_loops (gimple_transaction_body_ptr (trans), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_ASSUME: + break; + + default: + gcc_assert (!gimple_has_substatements (stmt)); + continue; + } + } + + return true; +} + +static unsigned int +execute_omp_transform_loops () +{ + gimple_seq body = gimple_body (current_function_decl); + walk_ctx ctx; + ctx.inner_combined_loop = NULL; + ctx.bind = NULL; + walk_omp_for_loops (&body, &ctx); + + return 0; +} + +namespace +{ + +const pass_data pass_data_omp_transform_loops = { + GIMPLE_PASS, /* type */ + "omp_transform_loops", /* name */ + OPTGROUP_OMP, /* optinfo_flags */ + TV_NONE, /* tv_id */ + PROP_gimple_any, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0, /* todo_flags_finish */ +}; + +class pass_omp_transform_loops : public gimple_opt_pass +{ +public: + pass_omp_transform_loops (gcc::context *ctxt) + : gimple_opt_pass (pass_data_omp_transform_loops, ctxt) + { + } + + /* opt_pass methods: */ + virtual unsigned int + execute (function *) + { + return execute_omp_transform_loops (); + } + virtual bool + gate (function *) + { + return flag_openmp || flag_openmp_simd; + } + +}; // class pass_omp_transform_loops + +} // anon namespace + +gimple_opt_pass * +make_pass_omp_transform_loops (gcc::context *ctxt) +{ + return new pass_omp_transform_loops (ctxt); +} diff --git a/gcc/params.opt b/gcc/params.opt index 41d8bef245e..cf5e09bf9e0 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -820,6 +820,15 @@ Enum(openacc_privatization) String(quiet) Value(OPENACC_PRIVATIZATION_QUIET) EnumValue Enum(openacc_privatization) String(noisy) Value(OPENACC_PRIVATIZATION_NOISY) +-param=omp-unroll-full-max-iterations= +Common Joined UInteger Var(param_omp_unroll_full_max_iterations) Init(5) Param Optimization +The maximum number of iterations of a loop for which an 'omp unroll' directive on the loop without a +clause will be turned into an 'omp unroll full'. + +-param=omp-unroll-default-factor= +Common Joined UInteger Var(param_omp_unroll_default_factor) Init(0) Param Optimization +The unroll factor that will be used for loops that have an 'omp unroll partial' directive without an explicit unroll factor. + -param=parloops-chunk-size= Common Joined UInteger Var(param_parloops_chunk_size) Param Optimization Chunk size of omp schedule for loops parallelized by parloops. diff --git a/gcc/passes.def b/gcc/passes.def index c9a8f19747b..5a5f3616cf8 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_diagnose_omp_blocks); NEXT_PASS (pass_diagnose_tm_blocks); NEXT_PASS (pass_omp_oacc_kernels_decompose); + NEXT_PASS (pass_omp_transform_loops); NEXT_PASS (pass_lower_omp); NEXT_PASS (pass_lower_cf); NEXT_PASS (pass_lower_tm); diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 new file mode 100644 index 00000000000..4cfac4c5e26 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 @@ -0,0 +1,277 @@ +subroutine test1 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end do +end subroutine test3 + +subroutine test4 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end do +end subroutine test4 + +subroutine test5 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test5 + +subroutine test6 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test6 + +subroutine test7 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test7 + +subroutine test8 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll +end subroutine test8 + +subroutine test9 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test9 + +subroutine test10 + implicit none + integer :: i + + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test10 + +subroutine test11 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test11 + +subroutine test12 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + call dummy(i) ! { dg-error {Unexpected CALL statement at \(1\)} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test12 + +subroutine test13 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + call dummy(i) + end do +end subroutine test13 + +subroutine test14 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test14 + +subroutine test15 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test15 + +subroutine test16 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test16 + +subroutine test17 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(2) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test17 + +subroutine test18 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(0) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test18 + +subroutine test19 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(-10) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test19 + +subroutine test20 + implicit none + integer :: i + + !$omp do + !$omp unroll partial + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test20 + +subroutine test21 + implicit none + integer :: i + + !$omp unroll partial ! { dg-error {\!\$OMP UNROLL invalid around DO CONCURRENT loop at \(1\)} } + do concurrent (i = 1:100) + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do + !$omp end unroll +end subroutine test21 + +subroutine test22 + implicit none + integer :: i + + !$omp do + !$omp unroll partial + do concurrent (i = 1:100) ! { dg-error {\!\$OMP DO cannot be a DO CONCURRENT loop at \(1\)} } + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do + !$omp end unroll +end subroutine test22 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 new file mode 100644 index 00000000000..2c4a45d3054 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 @@ -0,0 +1,7 @@ +subroutine test(i) + ! TODO The checking that produces this message comes too late. Not important, but would be nice to have. + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} "" { xfail *-*-* } } + call dummy0 ! { dg-error {Unexpected CALL statement at \(1\)} } +end subroutine test ! { dg-error {Unexpected END statement at \(1\)} } + +! { dg-error "Unexpected end of file" "" { target "*-*-*" } 0 } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 new file mode 100644 index 00000000000..3f0d5981e9b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 @@ -0,0 +1,75 @@ +subroutine test1(i) + implicit none + integer :: i + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +subroutine test2(i) + implicit none + integer :: i + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test2 + +subroutine test3(i) + implicit none + integer :: i + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll full + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test3 + +subroutine test4(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,10 + call dummy(i) + end do +end subroutine test4 + +subroutine test5(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test5 + +subroutine test6(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test6 + +subroutine test7(i) + implicit none + integer :: i + !$omp loop ! { dg-error {missing canonical loop nest after \!\$OMP LOOP at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test7 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 new file mode 100644 index 00000000000..0d8f3f5a2c0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 @@ -0,0 +1,29 @@ +subroutine test1 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO WHILE or DO without loop control at \(1\)} } + do while (i < 10) + call dummy(i) + i = i + 1 + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO WHILE or DO without loop control at \(1\)} } + do + call dummy(i) + i = i + 1 + if (i >= 10) exit + end do +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO CONCURRENT loop at \(1\)} } + do concurrent (i=1:10) + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 new file mode 100644 index 00000000000..8496f9eefe0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + !$omp unroll full + do i = 1,10 + call dummy(i) + end do +end subroutine test2 + +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_none" 1 "original" } } +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_full" 1 "original" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 new file mode 100644 index 00000000000..0d233c9ab6f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 @@ -0,0 +1,17 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll full + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +! Loop should be removed with 10 copies of the body remaining + +! { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } +! { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 new file mode 100644 index 00000000000..fcccdb0bcf8 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should not be unrolled, but the internal representation should be lowered + +! { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 new file mode 100644 index 00000000000..ee82b4d150c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll partial ! { dg-optimized {'partial' clause without unrolling factor turned into 'partial\(5\)' clause} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 5 times and the internal representation should be lowered. + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 5 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 new file mode 100644 index 00000000000..237e6b83087 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 @@ -0,0 +1,19 @@ +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll partial ! { dg-optimized {'partial' clause without unrolling factor turned into 'partial\(10\)' clause} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 10 times and the internal representation should be lowered. + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 new file mode 100644 index 00000000000..8feaf7dc4d3 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 @@ -0,0 +1,62 @@ +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i,j + !$omp parallel do + !$omp unroll partial(10) + do i = 1,100 + !$omp parallel do + do j = 1,100 + call dummy(i,j) + end do + end do + + !$omp taskloop + !$omp unroll partial(10) + do i = 1,100 + !$omp parallel do + do j = 1,100 + call dummy(i,j) + end do + end do + +end subroutine test1 + +! For the "parallel do", there should be 11 "omp for" loops, 10 for the inner loop, 1 for outer, +! for the "taskloop", there should be 10 "omp for" loops for the unrolled loop +! { dg-final { scan-tree-dump-times {#pragma omp for} 21 "omp_transform_loops" } } +! ... and two outer taskloops plus the one taskloops +! { dg-final { scan-tree-dump-times {#pragma omp taskloop} 3 "omp_transform_loops" } } + + +subroutine test2 + implicit none + integer :: i,j + do i = 1,100 + !$omp teams distribute + !$omp unroll partial(10) + do j = 1,100 + call dummy(i,j) + end do + end do + + do i = 1,100 + !$omp target teams distribute + !$omp unroll partial(10) + do j = 1,100 + call dummy(i,j) + end do + end do +end subroutine test2 + +! { dg-final { scan-tree-dump-times {#pragma omp distribute} 2 "omp_transform_loops" } } + +! After unrolling there should be 10 copies of each loop body for each loop-nest +! { dg-final { scan-tree-dump-times "dummy" 40 "omp_transform_loops" } } + +! { dg-final { scan-tree-dump-not {#pragma omp loop_transform} "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(10\)} 1 "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp distribute private\(j\) unroll_partial\(10\)} 2 "original" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 new file mode 100644 index 00000000000..9b91e5c5f98 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp parallel do collapse(1) + !$omp unroll partial(4) ! { dg-optimized {replaced consecutive 'omp unroll' directives by 'omp unroll auto\(24\)'} } + !$omp unroll partial(3) + !$omp unroll partial(2) + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 1 * 2 * 3 * 4 = 24 times + +! { dg-final { scan-tree-dump {#pragma omp for nowait collapse\(1\) unroll_partial\(4\) unroll_partial\(3\) unroll_partial\(2\) unroll_partial\(1\)} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 24 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {#pragma omp for} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 new file mode 100644 index 00000000000..849d4e77984 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll full ! { dg-optimized {removed useless 'omp unroll auto' directives preceding 'omp unroll full'} } + !$omp unroll partial(3) + !$omp unroll partial(2) + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_full unroll_partial\(3\) unroll_partial\(2\) unroll_partial\(1\)} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp unroll" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 100 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 new file mode 100644 index 00000000000..079c0fdd75b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 @@ -0,0 +1,20 @@ +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do i = 1,5 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll + do i = 1,6 + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 new file mode 100644 index 00000000000..4893ba46e4e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 @@ -0,0 +1,21 @@ +! { dg-additional-options "--param=omp-unroll-full-max-iterations=20" } +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do i = 1,20 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll + do i = 1,21 + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 new file mode 100644 index 00000000000..60f25d3abe6 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 @@ -0,0 +1,23 @@ +! { dg-additional-options "--param=omp-unroll-full-max-iterations=10" } +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {added 'partial\(10\)' clause to 'omp unroll' directive} } + do i = 1,20 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll ! { dg-optimized {added 'partial\(10\)' clause to 'omp unroll' directive} } + do i = 1,21 + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 new file mode 100644 index 00000000000..f22debbb78f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 @@ -0,0 +1,244 @@ +! { dg-options "-fno-openmp -fopenmp-simd" } + +subroutine test1 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end do +end subroutine test3 + +subroutine test4 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end do +end subroutine test4 + +subroutine test5 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test5 + +subroutine test6 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test6 + +subroutine test7 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll +end subroutine test7 + +subroutine test8 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test8 + +subroutine test9 + implicit none + integer :: i + + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test9 + +subroutine test10 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test10 + +subroutine test11 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + call dummy(i) ! { dg-error {Unexpected CALL statement at \(1\)} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test11 + +subroutine test12 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + call dummy(i) + end do +end subroutine test12 + +subroutine test13 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test13 + +subroutine test14 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test14 + +subroutine test15 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test15 + +subroutine test16 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(2) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test16 + +subroutine test17 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(0) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test17 + +subroutine test18 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(-10) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test18 + +subroutine test19 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test19 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 new file mode 100644 index 00000000000..faaa37c5d7e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 @@ -0,0 +1,57 @@ +! { dg-do run } +! { dg-options "-O2 -fopenmp-simd" } +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + !$omp simd + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + !$omp simd + !$omp unroll partial(2) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program + +! { dg-final { scan-tree-dump {omp loop_transform} "original" } } +! { dg-final { scan-tree-dump-not {omp loop_transform} "omp_transform_loops" } } diff --git a/gcc/tree-core.h b/gcc/tree-core.h index fd2be57b78c..e563408877e 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -525,6 +525,15 @@ enum omp_clause_code { /* OpenACC clause: nohost. */ OMP_CLAUSE_NOHOST, + + /* Internal representation for an "omp unroll full" directive. */ + OMP_CLAUSE_UNROLL_FULL, + + /* Internal representation for an "omp unroll" directive without a clause. */ + OMP_CLAUSE_UNROLL_NONE, + + /* Internal representation for an "omp unroll partial" directive. */ + OMP_CLAUSE_UNROLL_PARTIAL, }; #undef DEFTREESTRUCT diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 6cdaed7d4b2..813176a912f 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -425,6 +425,7 @@ extern gimple_opt_pass *make_pass_lower_switch_O0 (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector_ssa (gcc::context *ctxt); extern gimple_opt_pass *make_pass_omp_oacc_kernels_decompose (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_omp_transform_loops (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_omp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_diagnose_omp_blocks (gcc::context *ctxt); extern gimple_opt_pass *make_pass_expand_omp (gcc::context *ctxt); diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 7947f9647a1..588a992bcf3 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -505,6 +505,22 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) case OMP_CLAUSE_EXCLUSIVE: name = "exclusive"; goto print_remap; + case OMP_CLAUSE_UNROLL_FULL: + pp_string (pp, "unroll_full"); + break; + case OMP_CLAUSE_UNROLL_NONE: + pp_string (pp, "unroll_none"); + break; + case OMP_CLAUSE_UNROLL_PARTIAL: + pp_string (pp, "unroll_partial"); + if (OMP_CLAUSE_UNROLL_PARTIAL_EXPR (clause)) + { + pp_left_paren (pp); + dump_generic_node (pp, OMP_CLAUSE_UNROLL_PARTIAL_EXPR (clause), spc, flags, + false); + pp_right_paren (pp); + } + break; case OMP_CLAUSE__LOOPTEMP_: name = "_looptemp_"; goto print_remap; @@ -3581,6 +3597,10 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, pp_string (pp, "#pragma omp distribute"); goto dump_omp_loop; + case OMP_LOOP_TRANS: + pp_string (pp, "#pragma omp loop_transform"); + goto dump_omp_loop; + case OMP_TASKLOOP: pp_string (pp, "#pragma omp taskloop"); goto dump_omp_loop; diff --git a/gcc/tree.cc b/gcc/tree.cc index 207293c48cb..53e44367977 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -326,6 +326,9 @@ unsigned const char omp_clause_num_ops[] = 0, /* OMP_CLAUSE_IF_PRESENT */ 0, /* OMP_CLAUSE_FINALIZE */ 0, /* OMP_CLAUSE_NOHOST */ + 0, /* OMP_CLAUSE_UNROLL_FULL */ + 0, /* OMP_CLAUSE_UNROLL_NONE */ + 1 /* OMP_CLAUSE_UNROLL_PARTIAL */ }; const char * const omp_clause_code_name[] = @@ -417,6 +420,9 @@ const char * const omp_clause_code_name[] = "if_present", "finalize", "nohost", + "unroll_full", + "unroll_none", + "unroll_partial" }; /* Unless specific to OpenACC, we tend to internally maintain OpenMP-centric diff --git a/gcc/tree.def b/gcc/tree.def index e639a039db9..a47e4b8dbda 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1166,6 +1166,12 @@ DEFTREECODE (OMP_TASK, "omp_task", tcc_statement, 2) unspecified by the standards. */ DEFTREECODE (OMP_FOR, "omp_for", tcc_statement, 7) +/* OpenMP - A loop nest to which a loop transformation such as #pragma omp + unroll should be applied, but which is not associated with another directive + such as #pragma omp for. The kind of loop transformations to be applied are + internally represented by clauses. Operands like for OMP_FOR. */ +DEFTREECODE (OMP_LOOP_TRANS, "omp_loop_trans", tcc_statement, 7) + /* OpenMP - #pragma omp simd [clause1 ... clauseN] Operands like for OMP_FOR. */ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 7) diff --git a/gcc/tree.h b/gcc/tree.h index abcdb5638d4..f33f815b712 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1787,6 +1787,9 @@ class auto_suppress_location_wrappers #define OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_USE_DEVICE_PTR)->base.public_flag) +#define OMP_CLAUSE_UNROLL_PARTIAL_EXPR(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_UNROLL_PARTIAL), 0) + #define OMP_CLAUSE_PROC_BIND_KIND(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PROC_BIND)->omp_clause.subcode.proc_bind_kind) diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 new file mode 100644 index 00000000000..f07aab898fa --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 @@ -0,0 +1,52 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-do run } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + !$omp do + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + !$omp parallel do reduction(+:sum) + !$omp unroll partial(2) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 new file mode 100644 index 00000000000..2ce44d4d044 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 @@ -0,0 +1,88 @@ +! { dg-additional-options "-fdump-tree-original -g" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 1,10,3 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum2() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = -20,1,3 + sum = sum + 1 + end do + end function compute_sum2 + + + integer function compute_sum3() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 30,1,-3 + sum = sum + 1 + end do + end function compute_sum3 + + + integer function compute_sum4() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 50,-60,-10 + sum = sum + 1 + end do + end function compute_sum4 + +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 4) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 8) then + call abort + end if + + result = compute_sum3 () + write (*,*) result + if (result .ne. 10) then + call abort + end if + + result = compute_sum4 () + write (*,*) result + if (result .ne. 12) then + call abort + end if + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 new file mode 100644 index 00000000000..55e5cc568a5 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 @@ -0,0 +1,59 @@ +! Test lowering of the internal representation of "omp unroll" loops +! which are not unrolled. + +! { dg-additional-options "-O0" } +! { dg-additional-options "--param=omp-unroll-full-max-iterations=0" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll + do i = 0,50 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum3 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + do i = 0,n,step + sum = sum + 1 + end do + end function compute_sum3 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + if (result .ne. 51) then + call abort + end if + + result = compute_sum3 (1, 100) + if (result .ne. 101) then + call abort + end if + + result = compute_sum3 (2, 100) + if (result .ne. 51) then + call abort + end if + + result = compute_sum3 (-2, -100) + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 new file mode 100644 index 00000000000..52a214f1049 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 @@ -0,0 +1,72 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll partial(2) + do i = 1,50 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum3 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum3 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum3 (1, 100) + write (*,*) result + if (result .ne. 100) then + call abort + end if + + result = compute_sum3 (1, 9) + write (*,*) result + if (result .ne. 9) then + call abort + end if + + result = compute_sum3 (2, 96) + write (*,*) result + if (result .ne. 48) then + call abort + end if + + result = compute_sum3 (-2, -98) + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum3 (-2, -100) + write (*,*) result + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 new file mode 100644 index 00000000000..d6a4e739675 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 @@ -0,0 +1,55 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum4 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp do + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum4 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum4 (1, 100) + write (*,*) result + if (result .ne. 100) then + call abort + end if + + result = compute_sum4 (1, 9) + write (*,*) result + if (result .ne. 9) then + call abort + end if + + result = compute_sum4 (2, 96) + write (*,*) result + if (result .ne. 48) then + call abort + end if + + result = compute_sum4 (-2, -98) + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum4 (-2, -100) + write (*,*) result + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 new file mode 100644 index 00000000000..1df8ce8d5bb --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 @@ -0,0 +1,105 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum4 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum4 + + integer function compute_sum5 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + !$omp unroll partial(5) ! { dg-optimized {replaced consecutive 'omp unroll' directives by 'omp unroll auto\(50\)'} } + !$omp unroll partial(10) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum5 + + integer function compute_sum6 (step,n) result(sum) + implicit none + integer :: i, j, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + do i = 1,n,step + !$omp unroll full ! { dg-optimized {removed useless 'omp unroll auto' directives preceding 'omp unroll full'} } + !$omp unroll partial(10) + do j = 1, 1000 + sum = sum + 1 + end do + end do + end function compute_sum6 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum4 (1, 100) + if (result .ne. 100) then + call abort + end if + + result = compute_sum4 (1, 9) + if (result .ne. 9) then + call abort + end if + + result = compute_sum4 (2, 96) + if (result .ne. 48) then + call abort + end if + + result = compute_sum4 (-2, -98) + if (result .ne. 50) then + call abort + end if + + result = compute_sum4 (-2, -100) + if (result .ne. 51) then + call abort + end if + + result = compute_sum5 (1, 100) + if (result .ne. 100) then + call abort + end if + + result = compute_sum5 (1, 9) + if (result .ne. 9) then + call abort + end if + + result = compute_sum5 (2, 96) + if (result .ne. 48) then + call abort + end if + + result = compute_sum5 (-2, -98) + if (result .ne. 50) then + call abort + end if + + result = compute_sum5 (-2, -100) + if (result .ne. 51) then + call abort + end if + + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 new file mode 100644 index 00000000000..d25f18002ae --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 @@ -0,0 +1,198 @@ +! { dg-additional-options "-O0 -cpp" } +! { dg-do run } + +#ifndef UNROLL_FACTOR +#define UNROLL_FACTOR 1 +#endif +module test_functions +contains + subroutine copy (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp parallel do + !$omp unroll partial(UNROLL_FACTOR) + do i = 1, 100 + array1(i) = array2(i) + end do + end subroutine + + subroutine copy2 (array1, array2) + implicit none + + integer :: array1(100) + integer :: array2(100) + integer :: i + + !$omp parallel do + !$omp unroll partial(UNROLL_FACTOR) + do i = 0,99 + array1(i+1) = array2(i+1) + end do + end subroutine copy2 + + subroutine copy3 (array1, array2) + implicit none + + integer :: array1(100) + integer :: array2(100) + integer :: i + + !$omp parallel do lastprivate(i) + !$omp unroll partial(UNROLL_FACTOR) + do i = -49,50 + if (i < 0) then + array1((-1)*i) = array2((-1)*i) + else + array1(50+i) = array2(50+i) + endif + end do + end subroutine copy3 + + subroutine copy4 (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = 2, 200, 2 + array1(i/2) = array2(i/2) + end do + end subroutine copy4 + + subroutine copy5 (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = 200, 2, -2 + array1(i/2) = array2(i/2) + end do + end subroutine + + subroutine copy6 (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = lower, upper, step + array1 (i) = array2(i) + end do + end subroutine + + subroutine prepare (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + + array1 = 2 + array2 = 0 + end subroutine + + subroutine check_equal (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + do i=1,100 + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do + end subroutine + + subroutine check_equal_at_steps (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i + + do i=lower, upper, step + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do + end subroutine + + subroutine check_unchanged_at_non_steps (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i, j + + do i=lower, upper,step + do j=i,i+step-1 + if (array2(j) /= 0) then + write (*,*) i + call abort + end if + end do + end do + end subroutine +end module test_functions + +program test + use test_functions + implicit none + + integer :: array1(100), array2(100) + + call prepare (array1, array2) + call copy (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy2 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy3 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy4 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy5 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy6 (array1, array2, 1, 100, 5) + call check_equal_at_steps (array1, array2, 1, 100, 5) + call check_unchanged_at_non_steps (array1, array2, 1, 100, 5) + + call prepare (array1, array2) + call copy6 (array1, array2, 1, 50, 5) + call check_equal_at_steps (array1, array2, 1, 50, 5) + call check_unchanged_at_non_steps (array1, array2, 1, 50, 5) + + call prepare (array1, array2) + call copy6 (array1, array2, 3, 18, 7) + call check_equal_at_steps (array1, array2, 3 , 18, 7) + call check_unchanged_at_non_steps (array1, array2, 3, 18, 7) +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 new file mode 100644 index 00000000000..02328464c0d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that divides the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 5 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 new file mode 100644 index 00000000000..60866ef33fd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that does not divide the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 3 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 new file mode 100644 index 00000000000..6d8a2ef7bc0 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that is larger than the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 113 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 new file mode 100644 index 00000000000..40506025aa3 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 @@ -0,0 +1,38 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + subroutine copy (array1, array2, step, n) + implicit none + + integer :: array1(n) + integer :: array2(n) + integer :: i, step, n + + call omp_set_num_threads (4) + !$omp parallel do shared(array1) shared(array2) schedule(static, 4) + !$omp unroll partial(2) + do i = 1,n + array1(i) = array2(i) + end do + end subroutine +end module test_functions + +program test + use test_functions + implicit none + + integer :: array1(100), array2(100) + integer :: i + + array1 = 2 + call copy(array1, array2, 1, 100) + do i=1,100 + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 new file mode 100644 index 00000000000..5fb64ddd6fd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 @@ -0,0 +1,33 @@ +! { dg-options "-fno-openmp -fopenmp-simd" } +! { dg-additional-options "-fdump-tree-original" } +! { dg-do run } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + !$omp simd + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function compute_sum +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program