From patchwork Sun Oct 1 20:10:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sandra Loosemore X-Patchwork-Id: 147154 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2a8e:b0:403:3b70:6f57 with SMTP id in14csp1038693vqb; Sun, 1 Oct 2023 13:14:27 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGecv6BjyoARwtGauy/2MJZOy9CljIuBn2q/YOq2wguoWqX1G+ZHkv4hAPsOd8dM2O6XnHA X-Received: by 2002:a17:906:2189:b0:9b2:7584:80dc with SMTP id 9-20020a170906218900b009b2758480dcmr8432924eju.20.1696191267495; Sun, 01 Oct 2023 13:14:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696191267; cv=none; d=google.com; s=arc-20160816; b=fvtPQPl5FYkH5BSEprG/o2UhbHhkQrlBU2np6JdWsiS2HIBtITSuSli3ZgO7bVFV1q aLhTWZmOmS/ZBeEs2pkH72h75mVhuRWFSimynnB7WN7IiUCm3/RIzwAPfy06i8u3F7jX HB11e46c6TvInQGlt5e7k/u6Jw5m0e6WNmFGOPLgu83f4ahxDW0VtKxeMZDQHtmxyb5R RSQ4bD4VpXub26LkxFFBATQxTa9f1wBCDnkdpNTRrScYSujtFoBw/Q/kc95neQCzQkm3 E0MiEoFq3W5gTnqAneas0hR5NO/Kk0KEVKvPonL9EI4w99xiK90dLJUTob8YSARr80hy UwNQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:ironport-sdr:dmarc-filter:delivered-to; bh=zNeD2DRIfW9n2fUZQA7ZwmNccU8BT9J0dHKLIBFdV/k=; fh=OCleZEGPw6EXwGK7DgwfSjSqRFJqj4InDXFwXNGe58Y=; b=HsKXkLxj8Scip3dhX3IATjs8TmtyPE2bxBS5Ins6W7ZpkiK5Iz2qyLVkrt/1LNJ3ti kjqyjrHRVnhfMktxxt2s8FhWNRGcbVF8rTzDyjfi194gOvEs0NRob3ahdNYDOggd64uL QrYle92mWIcphuGw/zcrYlzHuQPn0ifIGy0btaLsyR24m43+40tqJcmv2AFt9KNRsmeo 48C2OrH1W4Iv1yEz5NaRc71MtZDuN7hc3cJWcpdikNIcMPjKYrGoFx+vgKizuo0d1wyT CyR9c29k9jxPwo4S76Nb7LTzd9lcOK1Ed1e7ZVv3Nmr5mrZ3kRAhI+QSAT6Ejjxaf5NE w4GA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id k9-20020a170906680900b0099d0a0914b1si19348764ejr.203.2023.10.01.13.14.26 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 01 Oct 2023 13:14:27 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3A6CD3882024 for ; Sun, 1 Oct 2023 20:12:30 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id 0317F3882675 for ; Sun, 1 Oct 2023 20:11:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0317F3882675 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-CSE-ConnectionGUID: uR03XYbQRJ2QigsLB1Hz/g== X-CSE-MsgGUID: czdsFHTbQTSn8E7OQbWQBA== X-IronPort-AV: E=Sophos;i="6.03,191,1694764800"; d="scan'208";a="18296436" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 01 Oct 2023 12:11:22 -0800 IronPort-SDR: oIRcCV1hitbdAYcjYhrT9ENtp+D+bhm9Xo2zFTgeJPxuJjdEU28ntvEWLbDZoZWFsenPn7n1Ep r0sS6MlWe1BmzDrsqNyTIfZ3rDSnOToBe3DLBUphR5O1UntLSYgYRJezoQGP184wdsZAos3Ik2 9GeWklJP6nI1T49QQBrbuqcdBuO0HWdUguy+4i+EwKcQvfuUVNAWN1x2Fn/JMuF9qs8RMjIoRr 9zj7lTqt6O6aTSqPyAjq/Be3eKFpotlZ6SZnTBHz43LGdbpYk5qDwHYBjW5dt8dKMABJZMgqZT bPk= From: Sandra Loosemore To: CC: , Subject: [WIP 3/4] OpenMP: Fortran front-end support for loop transforms. Date: Sun, 1 Oct 2023 14:10:20 -0600 Message-ID: <20231001201021.785572-4-sandra@codesourcery.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231001201021.785572-1-sandra@codesourcery.com> References: <20231001201021.785572-1-sandra@codesourcery.com> MIME-Version: 1.0 X-ClientProxiedBy: svr-orw-mbx-11.mgc.mentorg.com (147.34.90.211) To svr-orw-mbx-13.mgc.mentorg.com (147.34.90.213) X-Spam-Status: No, score=-9.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1778585454858154312 X-GMAIL-MSGID: 1778585454858154312 From: Frederik Harwath gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_clauses): Print unroll clauses. (show_omp_node): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. (show_code_node): Likewise. * gfortran.h (enum gfc_statement): Add ST_OMP_UNROLL, ST_OMP_END_UNROLL, ST_OMP_TILE, and ST_OMP_END_TILE. (struct gfc_omp_clauses): Add fields for tile and unroll. (enum gfc_exec_op): Add EXEC_OMP_UNROLL and EXEC_OMP_TILE. (loop_transform_p): Declare. (gfc_expr_list_len): Declare. * match.h (gfc_match_omp_tile): Declare. (gfc_match_omp_unroll): Declare. * openmp.cc (gfc_free_omp_clauses): Free tile_sizes field. (match_tile_sizes): New. (enum omp_mask2): Add OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE. (gfc_match_omp_clauses): Handle OMP_CLAUSE_UNROLL_FULL and OMP_CLAUSE_UNROLL_PARTIAL syntax. (OMP_UNROLL_CLAUSES): Define. (OMP_TILE_CLAUSES): Define. (gfc_match_omp_tile): New. (gfc_match_omp_unroll): New. (find_nested_loop_in_chain): Handle loop transforms. (find_nested_loop_or_transform_in_chain): New. (find_nested_loop_or_transform_in_block): New. (diagnose_intervening_code_errors_1): Handle loop transforms. (restructure_intervening_code): Handle loop transforms. (is_outer_iteration_variable): Adjust to avoid fencepost error. (check_nested_loop_in_chain): Handle loop transforms. (expr_uses_intervening_var): Add assertion. (is_intervening_var): Add assertion. (expr_is_invariant): Adjust to avoid fencepost error. (omp_unroll_removes_loop_nest): New. (resolve_nested_loop_transforms): New. (resolve_omp_unroll): New. (resolve_nested_loops): New, split from... (resolve_omp_do) ...here. (resolve_omp_tile): New. (omp_code_to_statement): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. (resolve_oacc_nested_loops): Adjust assertion. (gfc_resolve_omp_directive): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. * parse.cc (decode_omp_directive): Handle tile/unroll directives. (case_exec_markers): Handle ST_OMP_TILE and ST_OMP_UNROLL. (gfc_ascii_statement): Handle tile/unroll directives. (parse_omp_do): Handle ST_OMP_TILE and ST_OMP_UNROLL. (parse_executable): Handle ST_OMP_TILE and ST_OMP_UNROLL. * resolve.cc (gfc_resolve_blocks): HANDLE EXEC_OMP_TILE and EXEC_OMP_UNROLL. (gfc_resolve_code): Likewise. * st.cc (gfc_free_statement): Handle ST_OMP_TILE and ST_OMP_UNROLL. * trans-openmp.cc (gfc_trans_omp_clauses): Handle tile/unroll directives. (loop_transform_p): New. (gfc_expr_list_len): New. (computer_transformed_depth): New. (gfc_trans_omp_do): Handle loop transformations. (gfc_trans_omp_directive): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. * trans.cc (trans_code): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/collapse1.f90: Adjust error messages. * gfortran.dg/gomp/loop-transforms/inner-loops.f90: New. * gfortran.dg/gomp/loop-transforms/tile-1.f90: New. * gfortran.dg/gomp/loop-transforms/tile-1a.f90: New. * gfortran.dg/gomp/loop-transforms/tile-2.f90: New. * gfortran.dg/gomp/loop-transforms/tile-3.f90: New. * gfortran.dg/gomp/loop-transforms/tile-4.f90: New. * gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90: New. * gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90: New. * gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90: New. * gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-10.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-11.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-12.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-2.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-3.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-4.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-5.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-6.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-7.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-8.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-9.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90: New. * gfortran.dg/gomp/pure-1.f90: Move unroll/tile tests here from... * gfortran.dg/gomp/pure-2.f90: ...here. libgomp/ChangeLog: * testsuite/libgomp.fortran/imperfect-transform-1.f90: New. * testsuite/libgomp.fortran/imperfect-transform-2.f90: New. * testsuite/libgomp.fortran/loop-transforms/inner-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/nested-fn.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-2.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-2.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-3.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-4.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-5.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-6.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-7.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-8.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90: New. * testsuite/libgomp.fortran/target-imperfect-transform-1.f90: New. * testsuite/libgomp.fortran/target-imperfect-transform-2.f90: New. Co-Authored-By: Sandra Loosemore --- gcc/fortran/dump-parse-tree.cc | 28 + gcc/fortran/gfortran.h | 12 +- gcc/fortran/match.h | 2 + gcc/fortran/openmp.cc | 730 ++++++++++++++---- gcc/fortran/parse.cc | 48 ++ gcc/fortran/resolve.cc | 6 + gcc/fortran/st.cc | 2 + gcc/fortran/trans-openmp.cc | 182 ++++- gcc/fortran/trans.cc | 2 + gcc/testsuite/gfortran.dg/gomp/collapse1.f90 | 6 +- .../gomp/loop-transforms/inner-loops.f90 | 124 +++ .../gomp/loop-transforms/tile-1.f90 | 163 ++++ .../gomp/loop-transforms/tile-1a.f90 | 10 + .../gomp/loop-transforms/tile-2.f90 | 80 ++ .../gomp/loop-transforms/tile-3.f90 | 18 + .../gomp/loop-transforms/tile-4.f90 | 95 +++ .../loop-transforms/tile-imperfect-nest.f90 | 93 +++ .../loop-transforms/tile-inner-loops-1.f90 | 16 + .../loop-transforms/tile-inner-loops-2.f90 | 23 + .../loop-transforms/tile-inner-loops-3.f90 | 22 + .../loop-transforms/tile-inner-loops-3a.f90 | 31 + .../loop-transforms/tile-inner-loops-4.f90 | 30 + .../loop-transforms/tile-inner-loops-4a.f90 | 26 + .../loop-transforms/tile-inner-loops-5.f90 | 123 +++ .../tile-non-rectangular-1.f90 | 71 ++ .../tile-non-rectangular-2.f90 | 12 + .../gomp/loop-transforms/tile-unroll-1.f90 | 57 ++ .../gomp/loop-transforms/unroll-1.f90 | 277 +++++++ .../gomp/loop-transforms/unroll-10.f90 | 7 + .../gomp/loop-transforms/unroll-11.f90 | 75 ++ .../gomp/loop-transforms/unroll-12.f90 | 29 + .../gomp/loop-transforms/unroll-2.f90 | 22 + .../gomp/loop-transforms/unroll-3.f90 | 17 + .../gomp/loop-transforms/unroll-4.f90 | 18 + .../gomp/loop-transforms/unroll-5.f90 | 18 + .../gomp/loop-transforms/unroll-6.f90 | 19 + .../gomp/loop-transforms/unroll-7.f90 | 62 ++ .../gomp/loop-transforms/unroll-8.f90 | 22 + .../gomp/loop-transforms/unroll-9.f90 | 18 + .../loop-transforms/unroll-inner-loop.f90 | 57 ++ .../loop-transforms/unroll-no-clause-1.f90 | 20 + .../loop-transforms/unroll-no-clause-2.f90 | 21 + .../loop-transforms/unroll-no-clause-3.f90 | 23 + .../loop-transforms/unroll-non-rect-1.f90 | 31 + .../gomp/loop-transforms/unroll-simd-1.f90 | 244 ++++++ .../gomp/loop-transforms/unroll-simd-2.f90 | 57 ++ .../gomp/loop-transforms/unroll-tile-1.f90 | 37 + .../gomp/loop-transforms/unroll-tile-2.f90 | 41 + .../loop-transforms/unroll-tile-inner-1.f90 | 25 + gcc/testsuite/gfortran.dg/gomp/pure-1.f90 | 26 + gcc/testsuite/gfortran.dg/gomp/pure-2.f90 | 25 - .../libgomp.fortran/imperfect-transform-1.f90 | 70 ++ .../libgomp.fortran/imperfect-transform-2.f90 | 70 ++ .../loop-transforms/inner-1.f90 | 77 ++ .../loop-transforms/nested-fn.f90 | 19 + .../loop-transforms/tile-1.f90 | 71 ++ .../loop-transforms/tile-2.f90 | 117 +++ .../loop-transforms/tile-unroll-1.f90 | 112 +++ .../loop-transforms/tile-unroll-2.f90 | 71 ++ .../loop-transforms/tile-unroll-3.f90 | 77 ++ .../loop-transforms/tile-unroll-4.f90 | 75 ++ .../loop-transforms/unroll-1.f90 | 54 ++ .../loop-transforms/unroll-2.f90 | 88 +++ .../loop-transforms/unroll-3.f90 | 59 ++ .../loop-transforms/unroll-4.f90 | 72 ++ .../loop-transforms/unroll-5.f90 | 55 ++ .../loop-transforms/unroll-6.f90 | 105 +++ .../loop-transforms/unroll-7.f90 | 198 +++++ .../loop-transforms/unroll-7a.f90 | 7 + .../loop-transforms/unroll-7b.f90 | 7 + .../loop-transforms/unroll-7c.f90 | 7 + .../loop-transforms/unroll-8.f90 | 38 + .../loop-transforms/unroll-simd-1.f90 | 34 + .../loop-transforms/unroll-tile-1.f90 | 112 +++ .../loop-transforms/unroll-tile-2.f90 | 71 ++ .../target-imperfect-transform-1.f90 | 73 ++ .../target-imperfect-transform-2.f90 | 73 ++ 77 files changed, 4818 insertions(+), 197 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/imperfect-transform-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/imperfect-transform-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/nested-fn.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/target-imperfect-transform-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/target-imperfect-transform-2.f90 diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc index 68122e3e6fd..859f3f36609 100644 --- a/gcc/fortran/dump-parse-tree.cc +++ b/gcc/fortran/dump-parse-tree.cc @@ -2108,6 +2108,26 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses) } if (omp_clauses->assume) show_omp_assumes (omp_clauses->assume); + if (omp_clauses->unroll_full) + fputs (" FULL", dumpfile); + if (omp_clauses->unroll_partial) + { + fputs (" PARTIAL", dumpfile); + if (omp_clauses->unroll_partial_factor > 0) + fprintf (dumpfile, "(%u)", omp_clauses->unroll_partial_factor); + } + if (omp_clauses->tile_sizes) + { + gfc_expr_list *sizes; + fputs (" TILE SIZES(", dumpfile); + for (sizes = omp_clauses->tile_sizes; sizes; sizes = sizes->next) + { + show_expr (sizes->expr); + if (sizes->next) + fputs (", ", dumpfile); + } + fputc (')', dumpfile); + } } /* Show a single OpenMP or OpenACC directive node and everything underneath it @@ -2220,6 +2240,8 @@ show_omp_node (int level, gfc_code *c) name = "TEAMS DISTRIBUTE PARALLEL DO SIMD"; break; case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: name = "TEAMS DISTRIBUTE SIMD"; break; case EXEC_OMP_TEAMS_LOOP: name = "TEAMS LOOP"; break; + case EXEC_OMP_TILE: name = "TILE"; break; + case EXEC_OMP_UNROLL: name = "UNROLL"; break; case EXEC_OMP_WORKSHARE: name = "WORKSHARE"; break; default: gcc_unreachable (); @@ -2296,6 +2318,8 @@ show_omp_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: omp_clauses = c->ext.omp_clauses; break; @@ -2357,6 +2381,8 @@ show_omp_node (int level, gfc_code *c) d = d->block; } } + else if (c->op == EXEC_OMP_UNROLL || c->op == EXEC_OMP_TILE) + show_code (level + 1, c->block != NULL ? c->block->next : c->next); else show_code (level + 1, c->block->next); if (c->op == EXEC_OMP_ATOMIC) @@ -3537,6 +3563,8 @@ show_code_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: show_omp_node (level, c); break; diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 6caf7765ac6..bf81d1ce009 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -321,7 +321,9 @@ enum gfc_statement ST_OMP_ALLOCATE, ST_OMP_ALLOCATE_EXEC, ST_OMP_ALLOCATORS, ST_OMP_END_ALLOCATORS, /* Note: gfc_match_omp_nothing returns ST_NONE. */ - ST_OMP_NOTHING, ST_NONE + ST_OMP_NOTHING, ST_NONE, + ST_OMP_UNROLL, ST_OMP_END_UNROLL, + ST_OMP_TILE, ST_OMP_END_TILE }; /* Types of interfaces that we can have. Assignment interfaces are @@ -1564,6 +1566,7 @@ typedef struct gfc_omp_clauses struct gfc_expr *dist_chunk_size; struct gfc_expr *message; struct gfc_omp_assumptions *assume; + struct gfc_expr_list *tile_sizes; const char *critical_name; enum gfc_omp_default_sharing default_sharing; enum gfc_omp_atomic_op atomic_op; @@ -1577,6 +1580,8 @@ typedef struct gfc_omp_clauses unsigned grainsize_strict:1, num_tasks_strict:1, compare:1, weak:1; unsigned non_rectangular:1, order_concurrent:1; unsigned contains_teams_construct:1, target_first_st_is_teams:1; + unsigned unroll_full:1, unroll_none:1, unroll_partial:1; + unsigned unroll_partial_factor; ENUM_BITFIELD (gfc_omp_sched_kind) sched_kind:3; ENUM_BITFIELD (gfc_omp_device_type) device_type:2; ENUM_BITFIELD (gfc_omp_memorder) memorder:3; @@ -3011,6 +3016,7 @@ enum gfc_exec_op EXEC_OMP_TARGET_TEAMS_LOOP, EXEC_OMP_MASKED, EXEC_OMP_PARALLEL_MASKED, EXEC_OMP_PARALLEL_MASKED_TASKLOOP, EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD, EXEC_OMP_MASKED_TASKLOOP, EXEC_OMP_MASKED_TASKLOOP_SIMD, EXEC_OMP_SCOPE, + EXEC_OMP_UNROLL, EXEC_OMP_TILE, EXEC_OMP_ERROR, EXEC_OMP_ALLOCATE, EXEC_OMP_ALLOCATORS }; @@ -3927,6 +3933,10 @@ void gfc_generate_module_code (gfc_namespace *); /* trans-intrinsic.cc */ bool gfc_inline_intrinsic_function_p (gfc_expr *); +/* trans-openmp.cc */ +bool loop_transform_p (gfc_exec_op op); +int gfc_expr_list_len (gfc_expr_list *); + /* bbt.cc */ typedef int (*compare_fn) (void *, void *); void gfc_insert_bbt (void *, void *, compare_fn); diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h index 7d72725ed3c..d7156b9f308 100644 --- a/gcc/fortran/match.h +++ b/gcc/fortran/match.h @@ -228,6 +228,8 @@ match gfc_match_omp_teams_distribute_parallel_do_simd (void); match gfc_match_omp_teams_distribute_simd (void); match gfc_match_omp_teams_loop (void); match gfc_match_omp_threadprivate (void); +match gfc_match_omp_tile (void); +match gfc_match_omp_unroll (void); match gfc_match_omp_workshare (void); match gfc_match_omp_end_critical (void); match gfc_match_omp_end_nowait (void); diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index 6b9c5e81a37..6dbdd0d5685 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -193,6 +193,7 @@ gfc_free_omp_clauses (gfc_omp_clauses *c) i == OMP_LIST_USES_ALLOCATORS); gfc_free_expr_list (c->wait_list); gfc_free_expr_list (c->tile_list); + gfc_free_expr_list (c->tile_sizes); free (CONST_CAST (char *, c->critical_name)); if (c->assume) { @@ -989,6 +990,76 @@ cleanup: return MATCH_ERROR; } +static match +match_tile_sizes (gfc_expr_list **list) +{ + gfc_expr_list *head, *tail, *p; + locus old_loc; + gfc_expr *expr; + match m; + + head = tail = NULL; + + old_loc = gfc_current_locus; + + m = gfc_match_char ('('); + if (m != MATCH_YES) + goto syntax; + + for (;;) + { + m = gfc_match_expr (&expr); + if (m == MATCH_YES) + { + p = gfc_get_expr_list (); + if (head == NULL) + head = tail = p; + else + { + tail->next = p; + tail = tail->next; + } + int size = 0; + if (m == MATCH_YES) + { + if (gfc_extract_int (expr, &size, 1)) + goto cleanup; + else if (size < 1) + { + gfc_error_now ("tile size not constant " + "positive integer at %C"); + goto cleanup; + } + tail->expr = expr; + } + goto next_item; + } + if (m == MATCH_ERROR) + goto cleanup; + goto syntax; + + next_item: + if (gfc_match_char (')') == MATCH_YES) + break; + if (gfc_match_char (',') != MATCH_YES) + goto syntax; + } + + while (*list) + list = &(*list)->next; + + *list = head; + return MATCH_YES; + +syntax: + gfc_error ("Syntax error in 'tile sizes' list at %C"); + +cleanup: + gfc_free_expr_list (head); + gfc_current_locus = old_loc; + return MATCH_ERROR; +} + /* OpenMP clauses. */ enum omp_mask1 { @@ -1063,6 +1134,10 @@ enum omp_mask1 /* More OpenMP clauses and OpenACC 2.0+ specific clauses. */ enum omp_mask2 { + OMP_CLAUSE_UNROLL_FULL, /* OpenMP 5.1. */ + OMP_CLAUSE_UNROLL_NONE, /* OpenMP 5.1. */ + OMP_CLAUSE_UNROLL_PARTIAL, /* OpenMP 5.1. */ + OMP_CLAUSE_TILE, /* OpenMP 5.1. */ OMP_CLAUSE_ASYNC, OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS, @@ -2667,6 +2742,15 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, && gfc_match_motion_var_list ("from (", &c->lists[OMP_LIST_FROM], &head) == MATCH_YES) continue; + if ((mask & OMP_CLAUSE_UNROLL_FULL) + && (m = gfc_match_dupl_check (!c->unroll_full, "full")) + != MATCH_NO) + { + if (m == MATCH_ERROR) + goto error; + c->unroll_full = needs_space = true; + continue; + } break; case 'g': if ((mask & OMP_CLAUSE_GANG) @@ -3326,6 +3410,32 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, } break; case 'p': + if (mask & OMP_CLAUSE_UNROLL_PARTIAL) + { + if ((m = gfc_match_dupl_check (!c->unroll_partial, "partial")) + != MATCH_NO) + { + int unroll_factor; + if (m == MATCH_ERROR) + goto error; + + c->unroll_partial = true; + + gfc_expr *cexpr = NULL; + m = gfc_match (" ( %e )", &cexpr); + if (m == MATCH_NO) + ; + else if (m == MATCH_YES + && !gfc_extract_int (cexpr, &unroll_factor, -1) + && unroll_factor > 0) + c->unroll_partial_factor = unroll_factor; + else + gfc_error_now ("PARTIAL clause argument not constant " + "positive integer at %C"); + gfc_free_expr (cexpr); + continue; + } + } if ((mask & OMP_CLAUSE_COPY) && gfc_match ("pcopy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], @@ -4446,6 +4556,10 @@ cleanup: (omp_mask (OMP_CLAUSE_AT) | OMP_CLAUSE_MESSAGE | OMP_CLAUSE_SEVERITY) #define OMP_WORKSHARE_CLAUSES \ omp_mask (OMP_CLAUSE_NOWAIT) +#define OMP_UNROLL_CLAUSES \ + (omp_mask (OMP_CLAUSE_UNROLL_FULL) | OMP_CLAUSE_UNROLL_PARTIAL) +#define OMP_TILE_CLAUSES \ + (omp_mask (OMP_CLAUSE_TILE)) #define OMP_ALLOCATORS_CLAUSES \ omp_mask (OMP_CLAUSE_ALLOCATE) @@ -6654,6 +6768,30 @@ gfc_match_omp_teams_distribute_simd (void) | OMP_SIMD_CLAUSES); } +match +gfc_match_omp_tile (void) +{ + gfc_omp_clauses *c = gfc_get_omp_clauses(); + new_st.op = EXEC_OMP_TILE; + new_st.ext.omp_clauses = c; + + return match_tile_sizes (&c->tile_sizes); +} + +match +gfc_match_omp_unroll (void) +{ + match m = match_omp (EXEC_OMP_UNROLL, OMP_UNROLL_CLAUSES); + + /* Add an internal clause as a marker to indicate that this "unroll" + directive had no clause. */ + if (new_st.ext.omp_clauses + && !new_st.ext.omp_clauses->unroll_full + && !new_st.ext.omp_clauses->unroll_partial) + new_st.ext.omp_clauses->unroll_none = true; + + return m; +} match gfc_match_omp_workshare (void) @@ -9602,6 +9740,11 @@ find_nested_loop_in_chain (gfc_code *chain) { if (code->op == EXEC_DO) return code; + else if (loop_transform_p (code->op) && code->block) + { + code = code->block; + continue; + } else if (code->op == EXEC_BLOCK) { gfc_code *c = find_nested_loop_in_block (code); @@ -9624,6 +9767,63 @@ find_nested_loop_in_block (gfc_code *block) return find_nested_loop_in_chain (ns->code); } +/* Forward declaration for mutually recursive functions. */ +static gfc_code * +find_next_loop_or_transform_in_block (gfc_code *block, gfc_code **imperfectp); + +/* Like find_nested_loop_in_chain, but also stop when a loop transform is + found and check for intervening code too. Return the first nested + DO loop or loop transform in CHAIN, and set *IMPERFECTP to the first + intervening code statement if one is found. */ +static gfc_code * +find_next_loop_or_transform_in_chain (gfc_code *chain, gfc_code **imperfectp) +{ + gfc_code *code; + gfc_code *result = NULL; + + if (!chain) + return NULL; + + for (code = chain; code; code = code->next) + { + /* DO WHILE and DO CONCURRENT are errors, but we need to catch them + here to ensure the right error is diagnosed elsewhere. */ + if (!result + && (code->op == EXEC_DO + || code->op == EXEC_DO_WHILE + || code->op == EXEC_DO_CONCURRENT + || loop_transform_p (code->op))) + result = code; + else if (!result && code->op == EXEC_BLOCK) + { + result = find_next_loop_or_transform_in_block (code, imperfectp); + /* If no loop in the block, the block itself is intervening code. */ + if (!result && !*imperfectp) + *imperfectp = code; + } + else if (code->op == EXEC_NOP || code->op == EXEC_CONTINUE) + continue; + else if (!*imperfectp) + *imperfectp = code; + if (result && *imperfectp) + break; + } + return result; +} + +/* Like find_nested_loop_in_block, but also checks for intervening code. + Return the first nested DO loop in BLOCK, or NULL if there + isn't one. Sets *IMPERFECTP to the first piece of intervening code. */ +static gfc_code * +find_next_loop_or_transform_in_block (gfc_code *block, gfc_code **imperfectp) +{ + gfc_namespace *ns; + gcc_assert (block->op == EXEC_BLOCK); + ns = block->ext.block.ns; + gcc_assert (ns); + return find_next_loop_or_transform_in_chain (ns->code, imperfectp); +} + void gfc_resolve_omp_do_blocks (gfc_code *code, gfc_namespace *ns) { @@ -10059,6 +10259,9 @@ diagnose_intervening_code_errors_1 (gfc_code *chain, gfc_namespace* ns = code->ext.block.ns; diagnose_intervening_code_errors_1 (ns->code, state); } + else if (loop_transform_p (code->op) && code->block) + /* Recurse on loop transformations. */ + diagnose_intervening_code_errors_1 (code->block->next, state); else /* Treat the whole statement as a unit. */ { @@ -10125,19 +10328,32 @@ restructure_intervening_code (gfc_code **chainp, gfc_code *outer_loop, for (code = *chainp; code; code = code->next, chainp = &((*chainp)->next)) { - if (code->op == EXEC_DO) + if (code->op == EXEC_DO || loop_transform_p (code->op)) { - /* Cut CODE free from its chain, leaving the ends dangling. */ + gfc_code *c = code; + + /* Treat a series of loop transforms as a unit, same as a single + EXEC_DO. CODE is the first and C is the last in the chain. */ + while (loop_transform_p (c->op) && !c->block) + c = c->next; + + gcc_assert (c); + gcc_assert (c->op == EXEC_DO + || (loop_transform_p (c->op) && c->block)); + + /* Cut the transforms and the loop they apply to free from the + chain, leaving the ends dangling. */ *chainp = NULL; - tail = code->next; - code->next = NULL; + tail = c->next; + c->next = NULL; - if (count == 1) - innermost_loop = code; + if (count == 1 && c->op == EXEC_DO) + innermost_loop = c; else innermost_loop - = restructure_intervening_code (&(code->block->next), - code, count - 1); + = restructure_intervening_code (&(c->block->next), c, + (loop_transform_p (c->op) + ? count : count - 1)); break; } else if (code->op == EXEC_BLOCK @@ -10190,7 +10406,7 @@ restructure_intervening_code (gfc_code **chainp, gfc_code *outer_loop, /* For loops, finally splice CODE into OUTER_LOOP. We already handled relinking EXEC_BLOCK above. */ - if (code->op == EXEC_DO && outer_loop) + if ((code->op == EXEC_DO || loop_transform_p (code->op)) && outer_loop) outer_loop->block->next = code; return innermost_loop; @@ -10204,13 +10420,13 @@ is_outer_iteration_variable (gfc_code *code, int depth, gfc_symbol *var) int i; gfc_code *do_code = code; - for (i = 1; i < depth; i++) + for (i = 0; i < depth; i++) { - do_code = find_nested_loop_in_chain (do_code->block->next); - gcc_assert (do_code); + gcc_assert (do_code && do_code->op == EXEC_DO); gfc_symbol *ivar = do_code->ext.iterator->var->symtree->n.sym; if (var == ivar) return true; + do_code = find_nested_loop_in_chain (do_code->block->next); } return false; } @@ -10232,6 +10448,11 @@ check_nested_loop_in_chain (gfc_code *chain, gfc_expr *expr, gfc_symbol *sym, { if (code->op == EXEC_DO) return code; + else if (loop_transform_p (code->op) && code->block) + { + code = code->block; + continue; + } else if (code->op == EXEC_BLOCK) { gfc_code *c = check_nested_loop_in_block (code, expr, sym, bad); @@ -10299,6 +10520,7 @@ expr_uses_intervening_var (gfc_code *code, int depth, gfc_expr *expr) for (i = 0; i < depth; i++) { bool bad = false; + gcc_assert (do_code && do_code->op == EXEC_DO); do_code = check_nested_loop_in_chain (do_code->block->next, expr, NULL, &bad); if (bad) @@ -10318,6 +10540,7 @@ is_intervening_var (gfc_code *code, int depth, gfc_symbol *sym) for (i = 0; i < depth; i++) { bool bad = false; + gcc_assert (do_code && do_code->op == EXEC_DO); do_code = check_nested_loop_in_chain (do_code->block->next, NULL, sym, &bad); if (bad) @@ -10334,13 +10557,13 @@ expr_is_invariant (gfc_code *code, int depth, gfc_expr *expr) int i; gfc_code *do_code = code; - for (i = 1; i < depth; i++) + for (i = 0; i < depth; i++) { - do_code = find_nested_loop_in_chain (do_code->block->next); - gcc_assert (do_code); + gcc_assert (do_code && do_code->op == EXEC_DO); gfc_symbol *ivar = do_code->ext.iterator->var->symtree->n.sym; if (gfc_find_sym_in_expr (ivar, expr)) return false; + do_code = find_nested_loop_in_chain (do_code->block->next); } return true; } @@ -10408,135 +10631,131 @@ bound_expr_is_canonical (gfc_code *code, int depth, gfc_expr *expr, return false; } -static void -resolve_omp_do (gfc_code *code) +static bool +omp_unroll_removes_loop_nest (gfc_code *code) { - gfc_code *do_code, *next; - int list, i, count; - gfc_omp_namelist *n; - gfc_symbol *dovar; - const char *name; - bool is_simd = false; - bool errorp = false; - bool perfect_nesting_errorp = false; + gcc_checking_assert (code->op == EXEC_OMP_UNROLL); + if (!code->ext.omp_clauses) + return true; - switch (code->op) + if (code->ext.omp_clauses->unroll_none) { - case EXEC_OMP_DISTRIBUTE: name = "!$OMP DISTRIBUTE"; break; - case EXEC_OMP_DISTRIBUTE_PARALLEL_DO: - name = "!$OMP DISTRIBUTE PARALLEL DO"; - break; - case EXEC_OMP_DISTRIBUTE_PARALLEL_DO_SIMD: - name = "!$OMP DISTRIBUTE PARALLEL DO SIMD"; - is_simd = true; - break; - case EXEC_OMP_DISTRIBUTE_SIMD: - name = "!$OMP DISTRIBUTE SIMD"; - is_simd = true; - break; - case EXEC_OMP_DO: name = "!$OMP DO"; break; - case EXEC_OMP_DO_SIMD: name = "!$OMP DO SIMD"; is_simd = true; break; - case EXEC_OMP_LOOP: name = "!$OMP LOOP"; break; - case EXEC_OMP_PARALLEL_DO: name = "!$OMP PARALLEL DO"; break; - case EXEC_OMP_PARALLEL_DO_SIMD: - name = "!$OMP PARALLEL DO SIMD"; - is_simd = true; - break; - case EXEC_OMP_PARALLEL_LOOP: name = "!$OMP PARALLEL LOOP"; break; - case EXEC_OMP_PARALLEL_MASKED_TASKLOOP: - name = "!$OMP PARALLEL MASKED TASKLOOP"; - break; - case EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD: - name = "!$OMP PARALLEL MASKED TASKLOOP SIMD"; - is_simd = true; - break; - case EXEC_OMP_PARALLEL_MASTER_TASKLOOP: - name = "!$OMP PARALLEL MASTER TASKLOOP"; - break; - case EXEC_OMP_PARALLEL_MASTER_TASKLOOP_SIMD: - name = "!$OMP PARALLEL MASTER TASKLOOP SIMD"; - is_simd = true; - break; - case EXEC_OMP_MASKED_TASKLOOP: name = "!$OMP MASKED TASKLOOP"; break; - case EXEC_OMP_MASKED_TASKLOOP_SIMD: - name = "!$OMP MASKED TASKLOOP SIMD"; - is_simd = true; - break; - case EXEC_OMP_MASTER_TASKLOOP: name = "!$OMP MASTER TASKLOOP"; break; - case EXEC_OMP_MASTER_TASKLOOP_SIMD: - name = "!$OMP MASTER TASKLOOP SIMD"; - is_simd = true; - break; - case EXEC_OMP_SIMD: name = "!$OMP SIMD"; is_simd = true; break; - case EXEC_OMP_TARGET_PARALLEL_DO: name = "!$OMP TARGET PARALLEL DO"; break; - case EXEC_OMP_TARGET_PARALLEL_DO_SIMD: - name = "!$OMP TARGET PARALLEL DO SIMD"; - is_simd = true; - break; - case EXEC_OMP_TARGET_PARALLEL_LOOP: - name = "!$OMP TARGET PARALLEL LOOP"; - break; - case EXEC_OMP_TARGET_SIMD: - name = "!$OMP TARGET SIMD"; - is_simd = true; - break; - case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE: - name = "!$OMP TARGET TEAMS DISTRIBUTE"; - break; - case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO: - name = "!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO"; - break; - case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: - name = "!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO SIMD"; - is_simd = true; - break; - case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD: - name = "!$OMP TARGET TEAMS DISTRIBUTE SIMD"; - is_simd = true; - break; - case EXEC_OMP_TARGET_TEAMS_LOOP: name = "!$OMP TARGET TEAMS LOOP"; break; - case EXEC_OMP_TASKLOOP: name = "!$OMP TASKLOOP"; break; - case EXEC_OMP_TASKLOOP_SIMD: - name = "!$OMP TASKLOOP SIMD"; - is_simd = true; - break; - case EXEC_OMP_TEAMS_DISTRIBUTE: name = "!$OMP TEAMS DISTRIBUTE"; break; - case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO: - name = "!$OMP TEAMS DISTRIBUTE PARALLEL DO"; - break; - case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: - name = "!$OMP TEAMS DISTRIBUTE PARALLEL DO SIMD"; - is_simd = true; - break; - case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: - name = "!$OMP TEAMS DISTRIBUTE SIMD"; - is_simd = true; - break; - case EXEC_OMP_TEAMS_LOOP: name = "!$OMP TEAMS LOOP"; break; - default: gcc_unreachable (); + gfc_warning (0, "!$OMP UNROLL without PARTIAL clause at %L turns loop " + "into a non-loop", + &code->loc); + return true; } + if (code->ext.omp_clauses->unroll_full) + { + gfc_warning (0, "!$OMP UNROLL with FULL clause at %L turns loop into a " + "non-loop", + &code->loc); + return true; + } + return false; +} - if (code->ext.omp_clauses) - resolve_omp_clauses (code, code->ext.omp_clauses, NULL); +static gfc_code * +resolve_nested_loop_transforms (gfc_code *code, const char *name, + int required_depth, locus *loc) +{ + if (!code) + return code; - do_code = code->block->next; - if (code->ext.omp_clauses->orderedc) - count = code->ext.omp_clauses->orderedc; - else + bool error = false; + while (loop_transform_p (code->op)) { - count = code->ext.omp_clauses->collapse; - if (count <= 0) - count = 1; + if (!error && code->op == EXEC_OMP_UNROLL) + { + if (omp_unroll_removes_loop_nest (code)) + { + gfc_error ("missing canonical loop nest after %s at %L", name, + loc); + error = true; + } + else if (required_depth > 1) + { + gfc_error ("loop nest depth after !$OMP UNROLL at %L is insufficient " + "for outer %s", &code->loc, name); + error = true; + } + } + else if (!error && code->op == EXEC_OMP_TILE + && required_depth > gfc_expr_list_len (code->ext.omp_clauses->tile_sizes)) + { + gfc_error ("loop nest depth after !$OMP TILE at %L is insufficient " + "for outer %s", &code->loc, name); + error = true; + } + + if (code->block) + code = code->block->next; + else + code = code->next; } + gcc_checking_assert (!loop_transform_p (code->op)); - /* While the spec defines the loop nest depth independently of the COLLAPSE - clause, in practice the middle end only pays attention to the COLLAPSE - depth and treats any further inner loops as the final-loop-body. So - here we also check canonical loop nest form only for the number of - outer loops specified by the COLLAPSE clause too. */ - for (i = 1; i <= count; i++) + return code; +} + +static void +resolve_omp_unroll (gfc_code *code) +{ + const char *descr = "!$OMP UNROLL"; + locus *loc = &code->loc; + + if (!code->block || code->block->op == EXEC_DO) + return; + + code = resolve_nested_loop_transforms (code->block->next, descr, 1, + &code->loc); + + if (code->op == EXEC_DO) + return; + + if (code->op == EXEC_DO_WHILE) + { + gfc_error ("%s invalid around DO WHILE or DO without loop " + "control at %L", descr, loc); + return; + } + + if (code->op == EXEC_DO_CONCURRENT) { + gfc_error ("%s invalid around DO CONCURRENT loop at %L", + descr, loc); + return; + } + + gfc_error ("missing canonical loop nest after %s at %L", + descr, loc); +} + +/* Shared helper function for resolve_omp_do and resolve_omp_tile: + check that we have NUM_LOOPS nested loops at DO_CODE. CODE and NAME + are for the outer OMP construct, used for error checking. + Note that DO_CODE should be an EXEC_DO, with all the outer loop + transformations stripped off already. */ + +static void +resolve_nested_loops (gfc_code *code, const char *name, gfc_code *do_code, + int num_loops, bool is_simd, bool is_tile) +{ + bool errorp = false; + bool perfect_nesting_errorp = false; + bool is_nested_tile = false; + gfc_omp_namelist *n; + gfc_code *next; + int list; + bool any_imperfect = false; + gfc_code *outer_do_code = do_code; + + for (int i = 0; i < num_loops; i++) + { + gfc_symbol *dovar; gfc_symbol *start_var = NULL, *end_var = NULL; + gfc_code *imperfect = NULL; + /* Parse errors are not recoverable. */ if (do_code->op == EXEC_DO_WHILE) { @@ -10550,7 +10769,16 @@ resolve_omp_do (gfc_code *code) &do_code->loc); return; } + if (do_code->op != EXEC_DO) + { + gfc_error ("%s must be DO loop at %L", name, + &do_code->loc); + break; + } + gcc_assert (do_code->op == EXEC_DO); + if (!gfc_resolve_expr (do_code->ext.iterator->var)) + break; if (do_code->ext.iterator->var->ts.type != BT_INTEGER) { gfc_error ("%s iteration variable must be of type integer at %L", @@ -10584,20 +10812,20 @@ resolve_omp_do (gfc_code *code) "LINEAR at %L", name, &do_code->loc); errorp = true; } - if (is_outer_iteration_variable (code, i, dovar)) + if (is_outer_iteration_variable (outer_do_code, i, dovar)) { gfc_error ("%s iteration variable used in more than one loop at %L", name, &do_code->loc); errorp = true; } - else if (is_intervening_var (code, i, dovar)) + else if (is_intervening_var (outer_do_code, i, dovar)) { gfc_error ("%s iteration variable at %L is bound in " "intervening code", name, &do_code->loc); errorp = true; } - else if (!bound_expr_is_canonical (code, i, + else if (!bound_expr_is_canonical (outer_do_code, i, do_code->ext.iterator->start, &start_var)) { @@ -10605,7 +10833,7 @@ resolve_omp_do (gfc_code *code) name, &do_code->loc); errorp = true; } - else if (expr_uses_intervening_var (code, i, + else if (expr_uses_intervening_var (outer_do_code, i, do_code->ext.iterator->start)) { gfc_error ("%s loop start expression at %L uses variable bound in " @@ -10613,7 +10841,7 @@ resolve_omp_do (gfc_code *code) name, &do_code->loc); errorp = true; } - else if (!bound_expr_is_canonical (code, i, + else if (!bound_expr_is_canonical (outer_do_code, i, do_code->ext.iterator->end, &end_var)) { @@ -10621,7 +10849,7 @@ resolve_omp_do (gfc_code *code) name, &do_code->loc); errorp = true; } - else if (expr_uses_intervening_var (code, i, + else if (expr_uses_intervening_var (outer_do_code, i, do_code->ext.iterator->end)) { gfc_error ("%s loop end expression at %L uses variable bound in " @@ -10635,13 +10863,14 @@ resolve_omp_do (gfc_code *code) "iteration variables at %L", name, &do_code->loc); errorp = true; } - else if (!expr_is_invariant (code, i, do_code->ext.iterator->step)) + else if (!expr_is_invariant (outer_do_code, i, + do_code->ext.iterator->step)) { gfc_error ("%s loop increment not in canonical form at %L", name, &do_code->loc); errorp = true; } - else if (expr_uses_intervening_var (code, i, + else if (expr_uses_intervening_var (outer_do_code, i, do_code->ext.iterator->step)) { gfc_error ("%s loop increment expression at %L uses variable " @@ -10654,21 +10883,24 @@ resolve_omp_do (gfc_code *code) /* Only parse loop body into nested loop and intervening code if there are supposed to be more loops in the nest to collapse. */ - if (i == count) + if (i == num_loops - 1) break; - next = find_nested_loop_in_chain (do_code->block->next); + next = find_next_loop_or_transform_in_chain (do_code->block->next, + &imperfect); if (!next) { /* Parse error, can't recover from this. */ - gfc_error ("not enough DO loops for collapsed %s (level %d) at %L", - name, i, &code->loc); + gfc_error ("not enough DO loops for %s (level %d) at %L", + name, i + 1, &code->loc); return; } - else if (next != do_code->block->next || next->next) + else if (imperfect) /* Imperfectly nested loop found. */ { + any_imperfect = true; + /* Only diagnose violation of imperfect nesting constraints once. */ if (!perfect_nesting_errorp) { @@ -10686,7 +10918,19 @@ resolve_omp_do (gfc_code *code) name, &code->loc); perfect_nesting_errorp = true; } - /* FIXME: Also diagnose for TILE directives. */ + else if (is_tile) + { + gfc_error ("%s inner loops must be perfectly nested at %L", + name, &code->loc); + perfect_nesting_errorp = true; + } + else if (is_nested_tile) + { + gfc_error ("%s inner loops must be perfectly nested with " + "nested !$OMP TILE at %L", + name, &code->loc); + perfect_nesting_errorp = true; + } if (perfect_nesting_errorp) errorp = true; } @@ -10694,6 +10938,32 @@ resolve_omp_do (gfc_code *code) name, next)) errorp = true; } + + /* Check for presence of nested TILE directive, used for next level + of the imperfect loop error checking above. Then resolve all the + transforms at this level. */ + if (!is_tile && !is_nested_tile && !perfect_nesting_errorp) + for (gfc_code *c = next; c && loop_transform_p (c->op); ) + { + if (c->op == EXEC_OMP_TILE) + { + is_nested_tile = true; + break; + } + if (c->block) + c = c->block->next; + else + c = c->next; + } + next = resolve_nested_loop_transforms (next, name, num_loops - i - 1, + &code->loc); + if (!next) + { + gfc_error ("not enough DO loops for %s at %L", + name, &code->loc); + return; + } + do_code = next; } @@ -10701,9 +10971,162 @@ resolve_omp_do (gfc_code *code) if (errorp) return; - restructure_intervening_code (&(code->block->next), code, count); + /* Only restructure intervening code if we found some. Note that + restructure_intervening_code assumes CODE is a DO loop instead of a + top-level TILE directive, which should have been rejected already if + if contains intervening code. */ + if (is_tile) + gcc_assert (!any_imperfect); + else if (any_imperfect) + { + gcc_assert (code->block); + restructure_intervening_code (&(code->block->next), code, num_loops); + } +} + +static void +resolve_omp_do (gfc_code *code) +{ + gfc_code *do_code; + int count; + const char *name; + bool is_simd = false; + + switch (code->op) + { + case EXEC_OMP_DISTRIBUTE: name = "!$OMP DISTRIBUTE"; break; + case EXEC_OMP_DISTRIBUTE_PARALLEL_DO: + name = "!$OMP DISTRIBUTE PARALLEL DO"; + break; + case EXEC_OMP_DISTRIBUTE_PARALLEL_DO_SIMD: + name = "!$OMP DISTRIBUTE PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_DISTRIBUTE_SIMD: + name = "!$OMP DISTRIBUTE SIMD"; + is_simd = true; + break; + case EXEC_OMP_DO: name = "!$OMP DO"; break; + case EXEC_OMP_DO_SIMD: name = "!$OMP DO SIMD"; is_simd = true; break; + case EXEC_OMP_LOOP: name = "!$OMP LOOP"; break; + case EXEC_OMP_PARALLEL_DO: name = "!$OMP PARALLEL DO"; break; + case EXEC_OMP_PARALLEL_DO_SIMD: + name = "!$OMP PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_PARALLEL_LOOP: name = "!$OMP PARALLEL LOOP"; break; + case EXEC_OMP_PARALLEL_MASKED_TASKLOOP: + name = "!$OMP PARALLEL MASKED TASKLOOP"; + break; + case EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD: + name = "!$OMP PARALLEL MASKED TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_PARALLEL_MASTER_TASKLOOP: + name = "!$OMP PARALLEL MASTER TASKLOOP"; + break; + case EXEC_OMP_PARALLEL_MASTER_TASKLOOP_SIMD: + name = "!$OMP PARALLEL MASTER TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_MASKED_TASKLOOP: name = "!$OMP MASKED TASKLOOP"; break; + case EXEC_OMP_MASKED_TASKLOOP_SIMD: + name = "!$OMP MASKED TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_MASTER_TASKLOOP: name = "!$OMP MASTER TASKLOOP"; break; + case EXEC_OMP_MASTER_TASKLOOP_SIMD: + name = "!$OMP MASTER TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_SIMD: name = "!$OMP SIMD"; is_simd = true; break; + case EXEC_OMP_TARGET_PARALLEL_DO: name = "!$OMP TARGET PARALLEL DO"; break; + case EXEC_OMP_TARGET_PARALLEL_DO_SIMD: + name = "!$OMP TARGET PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_TARGET_PARALLEL_LOOP: + name = "!$OMP TARGET PARALLEL LOOP"; + break; + case EXEC_OMP_TARGET_SIMD: + name = "!$OMP TARGET SIMD"; + is_simd = true; + break; + case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE: + name = "!$OMP TARGET TEAMS DISTRIBUTE"; + break; + case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO: + name = "!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO"; + break; + case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: + name = "!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD: + name = "!$OMP TARGET TEAMS DISTRIBUTE SIMD"; + is_simd = true; + break; + case EXEC_OMP_TARGET_TEAMS_LOOP: name = "!$OMP TARGET TEAMS LOOP"; break; + case EXEC_OMP_TASKLOOP: name = "!$OMP TASKLOOP"; break; + case EXEC_OMP_TASKLOOP_SIMD: + name = "!$OMP TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_TEAMS_DISTRIBUTE: name = "!$OMP TEAMS DISTRIBUTE"; break; + case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO: + name = "!$OMP TEAMS DISTRIBUTE PARALLEL DO"; + break; + case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: + name = "!$OMP TEAMS DISTRIBUTE PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: + name = "!$OMP TEAMS DISTRIBUTE SIMD"; + is_simd = true; + break; + case EXEC_OMP_TEAMS_LOOP: name = "!$OMP TEAMS LOOP"; break; + case EXEC_OMP_UNROLL: name = "!$OMP UNROLL"; break; + case EXEC_OMP_TILE: name = "!$OMP TILE"; break; + default: gcc_unreachable (); + } + + if (code->ext.omp_clauses) + resolve_omp_clauses (code, code->ext.omp_clauses, NULL); + + if (code->ext.omp_clauses->orderedc) + count = code->ext.omp_clauses->orderedc; + else + { + count = code->ext.omp_clauses->collapse; + if (count <= 0) + count = 1; + } + + /* While the spec defines the loop nest depth independently of the COLLAPSE + clause, in practice the middle end only pays attention to the COLLAPSE + depth and treats any further inner loops as the final-loop-body. So + here we also check canonical loop nest form only for the number of + outer loops specified by the COLLAPSE clause too. */ + do_code = resolve_nested_loop_transforms (code->block->next, name, count, + &code->loc); + resolve_nested_loops (code, name, do_code, count, is_simd, false); } +static void +resolve_omp_tile (gfc_code *code) +{ + gfc_code *do_code; + const char *name = "!$OMP TILE"; + + unsigned num_loops = 0; + gcc_assert (code->ext.omp_clauses->tile_sizes); + for (gfc_expr_list *el = code->ext.omp_clauses->tile_sizes; el; + el = el->next) + num_loops++; + + do_code = resolve_nested_loop_transforms (code, name, num_loops, &code->loc); + resolve_nested_loops (code, name, do_code, num_loops, false, true); +} static gfc_statement omp_code_to_statement (gfc_code *code) @@ -10852,6 +11275,10 @@ omp_code_to_statement (gfc_code *code) return ST_OMP_PARALLEL_LOOP; case EXEC_OMP_DEPOBJ: return ST_OMP_DEPOBJ; + case EXEC_OMP_TILE: + return ST_OMP_TILE; + case EXEC_OMP_UNROLL: + return ST_OMP_UNROLL; default: gcc_unreachable (); } @@ -10950,6 +11377,7 @@ resolve_oacc_nested_loops (gfc_code *code, gfc_code* do_code, int collapse, &do_code->loc); break; } + gcc_assert (do_code->op != EXEC_OMP_UNROLL); gcc_assert (do_code->op == EXEC_DO); if (do_code->ext.iterator->var->ts.type != BT_INTEGER) gfc_error ("!$ACC LOOP iteration variable must be of type integer at %L", @@ -11316,6 +11744,12 @@ gfc_resolve_omp_directive (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_TEAMS_LOOP: resolve_omp_do (code); break; + case EXEC_OMP_TILE: + resolve_omp_tile (code); + break; + case EXEC_OMP_UNROLL: + resolve_omp_unroll (code); + break; case EXEC_OMP_TARGET: resolve_omp_target (code); gcc_fallthrough (); diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc index 58386805ffe..5ea613fc6db 100644 --- a/gcc/fortran/parse.cc +++ b/gcc/fortran/parse.cc @@ -1151,6 +1151,8 @@ decode_omp_directive (void) ST_OMP_END_TEAMS_DISTRIBUTE); matcho ("end teams loop", gfc_match_omp_eos_error, ST_OMP_END_TEAMS_LOOP); matcho ("end teams", gfc_match_omp_eos_error, ST_OMP_END_TEAMS); + matchs ("end unroll", gfc_match_omp_eos_error, ST_OMP_END_UNROLL); + matchs ("end tile", gfc_match_omp_eos_error, ST_OMP_END_TILE); matcho ("end workshare", gfc_match_omp_end_nowait, ST_OMP_END_WORKSHARE); break; @@ -1278,6 +1280,10 @@ decode_omp_directive (void) matcho ("teams", gfc_match_omp_teams, ST_OMP_TEAMS); matchdo ("threadprivate", gfc_match_omp_threadprivate, ST_OMP_THREADPRIVATE); + matchs ("tile sizes", gfc_match_omp_tile, ST_OMP_TILE); + break; + case 'u': + matchs ("unroll", gfc_match_omp_unroll, ST_OMP_UNROLL); break; case 'w': matcho ("workshare", gfc_match_omp_workshare, ST_OMP_WORKSHARE); @@ -1910,6 +1916,7 @@ next_statement (void) case ST_OMP_LOOP: case ST_OMP_PARALLEL_LOOP: case ST_OMP_TEAMS_LOOP: \ case ST_OMP_TARGET_PARALLEL_LOOP: case ST_OMP_TARGET_TEAMS_LOOP: \ case ST_OMP_ALLOCATE_EXEC: case ST_OMP_ALLOCATORS: case ST_OMP_ASSUME: \ + case ST_OMP_TILE: case ST_OMP_UNROLL: \ case ST_CRITICAL: \ case ST_OACC_PARALLEL_LOOP: case ST_OACC_PARALLEL: case ST_OACC_KERNELS: \ case ST_OACC_DATA: case ST_OACC_HOST_DATA: case ST_OACC_LOOP: \ @@ -2282,6 +2289,9 @@ gfc_ascii_statement (gfc_statement st, bool strip_sentinel) case ST_END_UNION: p = "END UNION"; break; + case ST_OMP_END_UNROLL: + p = "!$OMP END UNROLL"; + break; case ST_END_MAP: p = "END MAP"; break; @@ -2962,6 +2972,12 @@ gfc_ascii_statement (gfc_statement st, bool strip_sentinel) case ST_OMP_THREADPRIVATE: p = "!$OMP THREADPRIVATE"; break; + case ST_OMP_TILE: + p = "!$OMP TILE"; + break; + case ST_OMP_UNROLL: + p = "!$OMP UNROLL"; + break; case ST_OMP_WORKSHARE: p = "!$OMP WORKSHARE"; break; @@ -5384,6 +5400,7 @@ parse_omp_do (gfc_statement omp_st) gfc_statement st; gfc_code *cp, *np; gfc_state_data s; + int num_unroll = 0; accept_statement (omp_st); @@ -5400,6 +5417,17 @@ parse_omp_do (gfc_statement omp_st) unexpected_eof (); else if (st == ST_DO) break; + else if (st == ST_OMP_UNROLL) + { + accept_statement (st); + num_unroll++; + continue; + } + else if (st == ST_OMP_TILE) + { + accept_statement (st); + continue; + } else unexpected_statement (st); } @@ -5511,8 +5539,26 @@ parse_omp_do (gfc_statement omp_st) case ST_OMP_TEAMS_LOOP: omp_end_st = ST_OMP_END_TEAMS_LOOP; break; + case ST_OMP_TILE: + omp_end_st = ST_OMP_END_TILE; + break; + case ST_OMP_UNROLL: + omp_end_st = ST_OMP_END_UNROLL; + break; default: gcc_unreachable (); } + + for (; num_unroll > 0; num_unroll--) + { + if (st == ST_OMP_END_UNROLL) + { + gfc_clear_new_st (); + gfc_commit_symbols (); + gfc_warning_check (); + st = next_statement (); + } + } + if (st == omp_end_st) { if (new_st.op == EXEC_OMP_END_NOWAIT) @@ -6296,6 +6342,8 @@ parse_executable (gfc_statement st) case ST_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case ST_OMP_TEAMS_DISTRIBUTE_SIMD: case ST_OMP_TEAMS_LOOP: + case ST_OMP_TILE: + case ST_OMP_UNROLL: st = parse_omp_do (st); if (st == ST_IMPLIED_ENDDO) return st; diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index 861f69ac20f..4f5d6decc42 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -11129,6 +11129,8 @@ gfc_resolve_blocks (gfc_code *b, gfc_namespace *ns) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_LOOP: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: break; @@ -12296,6 +12298,8 @@ gfc_resolve_code (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TARGET_SIMD: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: gfc_resolve_omp_do_blocks (code, ns); break; case EXEC_SELECT_TYPE: @@ -12794,6 +12798,8 @@ start: case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_resolve_omp_directive (code, ns); break; diff --git a/gcc/fortran/st.cc b/gcc/fortran/st.cc index b6d87c40207..8b083de7308 100644 --- a/gcc/fortran/st.cc +++ b/gcc/fortran/st.cc @@ -279,6 +279,8 @@ gfc_free_statement (gfc_code *p) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_free_omp_clauses (p->ext.omp_clauses); break; diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index 06c5a123973..9c1eb3e6a9c 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -4112,6 +4112,51 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, omp_clauses = gfc_trans_add_clause (c, omp_clauses); } + if (clauses->unroll_full) + { + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->unroll_none) + { + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->unroll_partial) + { + c = build_omp_clause (gfc_get_location (&where), + OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) + = (clauses->unroll_partial_factor + ? build_int_cst (integer_type_node, clauses->unroll_partial_factor) + : NULL_TREE); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->tile_sizes) + { + vec *tvec; + gfc_expr_list *el; + + vec_alloc (tvec, 4); + + for (el = clauses->tile_sizes; el; el = el->next) + vec_safe_push (tvec, gfc_convert_expr_to_tree (block, el->expr)); + + c = build_omp_clause (gfc_get_location (&where), + OMP_CLAUSE_TILE); + OMP_CLAUSE_TILE_SIZES (c) = build_tree_list_vec (tvec); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + + tvec->truncate (0); + } + if (clauses->ordered) { c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_ORDERED); @@ -5308,6 +5353,12 @@ gfc_trans_omp_cancel (gfc_code *code) return gfc_finish_block (&block); } +bool +loop_transform_p (gfc_exec_op op) +{ + return op == EXEC_OMP_UNROLL || op == EXEC_OMP_TILE; +} + static tree gfc_trans_omp_cancellation_point (gfc_code *code) { @@ -5479,13 +5530,46 @@ gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n, return true; } +int +gfc_expr_list_len (gfc_expr_list *list) +{ + unsigned len = 0; + for (; list; list = list->next) + len++; + + return len; +} + +/* Traverse the loops with nesting depth at most + COLLAPSE from CODE and determine the largest + loop nest depth required by the loop transformations + found on the loops. */ +int compute_transformed_depth (gfc_code *code, int collapse) +{ + int new_collapse = collapse; + for (int i = 0; i < new_collapse; i++) + { + gcc_assert (code->op == EXEC_DO || loop_transform_p (code->op)); + while (loop_transform_p (code->op)) + { + int tile_depth + = gfc_expr_list_len (code->ext.omp_clauses->tile_sizes); + new_collapse = MAX (new_collapse, i + tile_depth); + code = code->block ? code->block->next : code->next; + } + code = code->block->next; + } + + return new_collapse; +} + static tree gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, gfc_omp_clauses *do_clauses, tree par_clauses) { gfc_se se; tree dovar, stmt, from, to, step, type, init, cond, incr, orig_decls; - tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses; + tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses, loop_transform_clauses; stmtblock_t block; stmtblock_t body; gfc_omp_clauses *clauses = code->ext.omp_clauses; @@ -5494,45 +5578,80 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, dovar_init *di; unsigned ix; vec *saved_doacross_steps = doacross_steps; - gfc_expr_list *tile = do_clauses ? do_clauses->tile_list : clauses->tile_list; gfc_code *orig_code = code; + locus top_loc = code->loc; + gfc_expr_list *oacc_tile + = do_clauses ? do_clauses->tile_list : clauses->tile_list; + gfc_expr_list *omp_tile + = do_clauses ? do_clauses->tile_sizes : clauses->tile_sizes; + gcc_assert (!omp_tile || op == EXEC_OMP_TILE); + gcc_assert (!(oacc_tile && omp_tile)); + + if (pblock == NULL) + { + gfc_start_block (&block); + pblock = █ + } + code = code->block->next; + gcc_assert (code->op == EXEC_DO || loop_transform_p (code->op)); + /* Loop transformation directives surrounding the associated loop of an "omp + do" (or similar directive) are represented as clauses on the "omp do". */ + loop_transform_clauses = NULL; + int omp_tile_depth = gfc_expr_list_len (omp_tile); + tree clauses_tail = NULL; + while (loop_transform_p (code->op)) + { + tree clauses = gfc_trans_omp_clauses (pblock, code->ext.omp_clauses, + code->loc); + /* There might be several "!$omp tile" transformations surrounding the + loop. Use the innermost one which must have the largest tiling depth. + If an inner directive has a smaller tiling depth than an outer + directive, an error will be emitted in pass-omp_transform_loops. */ + omp_tile_depth = gfc_expr_list_len (code->ext.omp_clauses->tile_sizes); + + if (!loop_transform_clauses) + { + loop_transform_clauses = clauses; + clauses_tail = tree_last (clauses); + } + else + clauses_tail = chainon (clauses_tail, clauses); + + code = code->block ? code->block->next : code->next; + } + gcc_checking_assert (!loop_transform_p (code->op)); + gcc_assert (code->op == EXEC_DO); /* Both collapsed and tiled loops are lowered the same way. In OpenACC, those clauses are not compatible, so prioritize the tile clause, if present. */ - if (tile) - { - collapse = 0; - for (gfc_expr_list *el = tile; el; el = el->next) - collapse++; - } + if (oacc_tile) + collapse = gfc_expr_list_len (oacc_tile); doacross_steps = NULL; if (clauses->orderedc) collapse = clauses->orderedc; if (collapse <= 0) collapse = 1; + collapse = MAX (collapse, omp_tile_depth); + gfc_code *first_loop = loop_transform_p (orig_code->op) ? + orig_code : orig_code->block->next; + int transform_depth = compute_transformed_depth (first_loop, collapse); - code = code->block->next; - gcc_assert (code->op == EXEC_DO); - + collapse = transform_depth; init = make_tree_vec (collapse); cond = make_tree_vec (collapse); incr = make_tree_vec (collapse); orig_decls = clauses->ordered ? make_tree_vec (collapse) : NULL_TREE; - if (pblock == NULL) - { - gfc_start_block (&block); - pblock = █ - } - /* simd schedule modifier is only useful for composite do simd and other constructs including that, where gfc_trans_omp_do is only called on the simd construct and DO's clauses are translated elsewhere. */ do_clauses->sched_simd = false; - omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc); + omp_clauses = NULL; + omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, top_loc); + omp_clauses = chainon (omp_clauses, loop_transform_clauses); for (i = 0; i < collapse; i++) { @@ -5784,7 +5903,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, } gcc_assert (local_dovar == dovar || c != NULL); } - if (local_dovar != dovar) + if (local_dovar != dovar && op != EXEC_OMP_UNROLL) { if (op != EXEC_OMP_SIMD || dovar_found == 1) tmp = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE); @@ -5802,7 +5921,26 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, } if (i + 1 < collapse) - code = code->block->next; + { + code = code->block->next; + + loop_transform_clauses = NULL; + clauses_tail = omp_clauses; + while (loop_transform_p (code->op)) + { + loop_transform_clauses = gfc_trans_omp_clauses ( + pblock, code->ext.omp_clauses, code->loc); + for (tree c = loop_transform_clauses; c; + c = OMP_CLAUSE_CHAIN (c)) + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, i + 1); + + clauses_tail = chainon (clauses_tail, loop_transform_clauses); + clauses_tail = tree_last (loop_transform_clauses); + + code = code->block ? code->block->next : code->next; + } + } } if (pblock != &block) @@ -5873,6 +6011,8 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, case EXEC_OMP_LOOP: stmt = make_node (OMP_LOOP); break; case EXEC_OMP_TASKLOOP: stmt = make_node (OMP_TASKLOOP); break; case EXEC_OACC_LOOP: stmt = make_node (OACC_LOOP); break; + case EXEC_OMP_TILE: stmt = make_node (OMP_LOOP_TRANS); break; + case EXEC_OMP_UNROLL: stmt = make_node (OMP_LOOP_TRANS); break; default: gcc_unreachable (); } @@ -7979,6 +8119,8 @@ gfc_trans_omp_directive (gfc_code *code) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TASKLOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: return gfc_trans_omp_do (code, code->op, NULL, code->ext.omp_clauses, NULL); case EXEC_OMP_DISTRIBUTE_PARALLEL_DO: diff --git a/gcc/fortran/trans.cc b/gcc/fortran/trans.cc index e2e1b694012..95b724e1e0d 100644 --- a/gcc/fortran/trans.cc +++ b/gcc/fortran/trans.cc @@ -2607,6 +2607,8 @@ trans_code (gfc_code * code, tree cond) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: res = gfc_trans_omp_directive (code); break; diff --git a/gcc/testsuite/gfortran.dg/gomp/collapse1.f90 b/gcc/testsuite/gfortran.dg/gomp/collapse1.f90 index 613f06f6ea9..b155e0fcb5b 100644 --- a/gcc/testsuite/gfortran.dg/gomp/collapse1.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/collapse1.f90 @@ -9,7 +9,7 @@ subroutine collapse1 !$omp threadprivate (thr) l = .false. a(:, :, :) = 0 - !$omp parallel do collapse(4) schedule(static, 4) ! { dg-error "not enough DO loops for collapsed" } + !$omp parallel do collapse(4) schedule(static, 4) ! { dg-error "not enough DO loops for" } do i = 1, 3 do j = 4, 6 do k = 5, 7 @@ -33,9 +33,9 @@ subroutine collapse1 end do k = 4 end do - !$omp parallel do collapse(2) ! { dg-error "not enough DO loops" } + !$omp parallel do collapse(2) do i = 1, 3 - do + do ! { dg-error "cannot be a DO WHILE or DO without loop control" } end do end do !$omp parallel do collapse(2) diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 new file mode 100644 index 00000000000..fa2e2f17c6b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 @@ -0,0 +1,124 @@ +subroutine test1 + !$omp parallel do collapse(2) + do i=0,100 + !$omp unroll partial(2) + do j=-300,100 + call dummy (j) + end do + end do +end subroutine test1 + +subroutine test2 + !$omp parallel do collapse(3) + do i=0,100 + !$omp unroll partial(2) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do j=-300,100 + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test2 + +subroutine test3 +!$omp parallel do collapse(3) +do i=0,100 + do j=-300,100 + !$omp unroll partial(2) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test3 + +subroutine test4 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do j=-300,100 + !$omp unroll partial(2) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test4 + +subroutine test5 + !$omp parallel do collapse(3) + !$omp tile sizes(3,2) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do i=0,100 + do j=-300,100 + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test5 + +subroutine test6 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3,2) + do j=-300,100 + !$omp unroll partial(2) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test6 + +subroutine test7 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3,3) + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test7 + +subroutine test8 +!$omp parallel do collapse(1) +do i=0,100 + !$omp tile sizes(3,3) + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test8 + +subroutine test9 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3,3,3) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test9 + +subroutine test10 +!$omp parallel do +do i=0,100 + !$omp tile sizes(3,3,3) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test10 + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 new file mode 100644 index 00000000000..8284dc8193d --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 @@ -0,0 +1,163 @@ +subroutine test + implicit none + integer :: i, j, k + + !$omp tile sizes(1) + do i = 1,100 + call dummy(i) + end do + + !$omp tile sizes(1) + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(2+3) + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(-21) ! { dg-error {tile size not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(0) ! { dg-error {tile size not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(i) ! { dg-error {Constant expression required at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes( ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(2 ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes() ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(2,) ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(,2) ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(,i) ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(i,) ! { dg-error {Constant expression required at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + end do + end do + !$end omp tile + + !$omp tile sizes(1,2) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(i) + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested at \(1\)} } + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested at \(1\)} } + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + call dummy(j) + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested at \(1\)} } + do i = 1,100 + call dummy(i) + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested at \(1\)} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 new file mode 100644 index 00000000000..441d89b61e9 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 @@ -0,0 +1,10 @@ + +subroutine test + !$omp tile sizes(1,2,1) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(i) + end do + end do + !$end omp tile +end subroutine test diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 new file mode 100644 index 00000000000..d14af08c27a --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 @@ -0,0 +1,80 @@ +subroutine test1 + implicit none + integer :: i, j, k + + !$omp tile sizes (1,2) + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + + !$omp tile sizes (8) + !$omp tile sizes (1,2) + !$omp tile sizes (1,2,3) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test1 + +subroutine test2 + implicit none + integer :: i, j, k + + !$omp taskloop collapse(2) + !$omp tile sizes (3,4) + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$omp end taskloop + + !$omp taskloop simd + !$omp tile sizes (8) + !$omp tile sizes (1,2) + !$omp tile sizes (1,2,3) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$omp end taskloop simd +end subroutine test2 + +subroutine test3 + implicit none + integer :: i, j, k + + !$omp taskloop collapse(3) + !$omp tile sizes (1,2) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TASKLOOP} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$omp end taskloop +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 new file mode 100644 index 00000000000..308e3b3e4d0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 @@ -0,0 +1,18 @@ +subroutine test + implicit none + integer :: i, j, k + + !$omp parallel do collapse(2) ordered(2) ! { dg-error {'ordered' invalid in conjunction with 'omp tile'} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$end omp target + +end subroutine test diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 new file mode 100644 index 00000000000..b2dca0bbec6 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 @@ -0,0 +1,95 @@ + +subroutine test1 + implicit none + integer :: i, j, k + + !$omp tile sizes (1,2) + !$omp tile sizes (1) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test1 + +subroutine test2 + implicit none + integer :: i, j, k + + !$omp tile sizes (1,2) + !$omp tile sizes (1) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test2 + +subroutine test3 + implicit none + integer :: i, j, k + + !$omp target teams distribute + !$omp tile sizes (1,2) + !$omp tile sizes (1) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test3 + +subroutine test4 + implicit none + integer :: i, j, k + + !$omp target teams distribute collapse(2) + !$omp tile sizes (8) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TARGET TEAMS DISTRIBUTE} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test4 + +subroutine test5 + implicit none + integer :: i, j, k + + !$omp parallel do collapse(2) ordered(2) + !$omp tile sizes (8) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$end omp tile + !$end omp target + +end subroutine test5 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 new file mode 100644 index 00000000000..e9cf88f4def --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 @@ -0,0 +1,93 @@ +subroutine test0 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8, 1) + do j = 1,n + !$omp unroll partial(10) + do k = 1, n + if (k == 1) then + inner = 0 + endif + end do + end do + end do +end subroutine test0 + +subroutine test0m + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + do i = 1,m + !$omp tile sizes (8, 1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested} } + do j = 1,n + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end subroutine test0m + +subroutine test1 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8, 1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested} } + do j = 1,n + !$omp unroll partial(10) + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end subroutine test1 + + +subroutine test2 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8, 1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested} } + do j = 1,n + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end subroutine test2 + +subroutine test3 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + do i = 1,m + !$omp tile sizes (8, 1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested} } + do j = 1,n + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 new file mode 100644 index 00000000000..6474b9da1e2 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 @@ -0,0 +1,16 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test1 + !$omp parallel do collapse(2) + do i=0,100 + !$omp tile sizes(4) + do j=-300,100 + call dummy (j) + end do + end do +end subroutine test1 + +! Collapse of the gimple_omp_for should be unaffacted by the transformation +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait collapse\(2\) tile sizes\(4\).1\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait collapse\(2\) private\(j.0\) private\(j\)\n +for \(i = 0; i < 101; i = i \+ 1\)\n +for \(.omp_tile_index.\d = -300; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ 4\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 new file mode 100644 index 00000000000..0d462debd72 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 @@ -0,0 +1,23 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test2 + !$omp parallel do + !$omp tile sizes(3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2 + +! One gimple_omp_for should cover the outer two loops, another the inner two loops +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3, 3\)@0\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3\)@0\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } +! Collapse after the transformations should be 1 +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait\n +for \(.omp_tile_index.\d = 0; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ \d\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 new file mode 100644 index 00000000000..3ce87ad8a4b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! gimple_omp_for collapse should be extended to cover all loops affected by the transformations (i.e. 4) +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3, 3, 3\)@0 tile sizes\(3, 3\)@2\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } +! Collapse after the transformations should be 1 +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait private\(l.0\) private\(k\)\n +for \(.omp_tile_index.\d = 0; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ \d\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 new file mode 100644 index 00000000000..2c06d2094ba --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 @@ -0,0 +1,31 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test + +! gimple_omp_for collapse should be extended to cover all loops affected by the transformations (i.e. 4) +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3, 3\)@0 tile sizes\(3, 3\)@2\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } + +! The loops should be lowered after the tiling transformations +! { dg-final { scan-tree-dump-not {\#pragma omp} "omp_transform_loops" } } + +! Third level is tiled first by the inner construct. The resulting floor loop is tiled by the outer construct. +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.1} 2 "omp_transform_loops" } } + +! All other levels are tiled once +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.2} 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.3} 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.4} 1 "omp_transform_loops" } } + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 new file mode 100644 index 00000000000..355d977fe35 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 @@ -0,0 +1,30 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp parallel do + !$omp tile sizes(3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! The outer gimple_omp_for should not cover the loop with the tile transformation +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3\)@0\n +for \(i = 0; i <= 100; i = i \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3\)@0\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } + + +! After transformations, the outer loop should be a floor loop created +! by the tiling and the outer construct type and non-transformation +! clauses should be unaffected by the tiling +! { dg-final { scan-tree-dump {\#pragma omp for nowait\n +for \(.omp_tile_index.\d = 0; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ 3\)} "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp} 2 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp parallel} 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp for} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 new file mode 100644 index 00000000000..0c83da660f5 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 @@ -0,0 +1,26 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp tile sizes(3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! There should be separate gimple_omp_for constructs for the tile constructs because the tiling depth +! of the outer construct does not reach the level of the inner construct +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3\)@0\n +for \(i = 0; i <= 100; i = i \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3\)@0\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } + + +! The loops should be lowered after the tiling transformations +! { dg-final { scan-tree-dump-not {\#pragma omp} "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index} 3 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 new file mode 100644 index 00000000000..670e14caa12 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 @@ -0,0 +1,123 @@ +subroutine test1a + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test1a + +subroutine test2a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test2a + +subroutine test3a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3a + +subroutine test4a + !$omp parallel do + !$omp tile sizes(3,3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test4a + +subroutine test1b + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test1b + +subroutine test2b + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test2b + +subroutine test3b + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3b + +subroutine test4b + !$omp parallel do + !$omp tile sizes(3,3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test4b diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 new file mode 100644 index 00000000000..169c2b10e54 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 @@ -0,0 +1,71 @@ +subroutine test1 + !$omp tile sizes(1) + do i = 1,100 + do j = 1,i + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test1 + +subroutine test2 + !$omp tile sizes(1,2) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,i + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test2 + +subroutine test3 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,i + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test3 + +subroutine test4 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,100 + do k = 1,i + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test4 + +subroutine test5 + !$omp tile sizes(1,2) + do i = 1,100 + do j = 1,100 + do k = 1,j + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test5 + +subroutine test6 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,100 + do k = 1,j + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test6 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 new file mode 100644 index 00000000000..f0f3e046511 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 @@ -0,0 +1,12 @@ +subroutine test + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,100 + do k = 1,i + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 new file mode 100644 index 00000000000..27920701b36 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 @@ -0,0 +1,57 @@ +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do collapse(2) + !$omp tile sizes (8,8) + !$omp unroll partial(2) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} "" { target *-*-*} .-1 } + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + + !$omp tile sizes (8,8) + !$omp unroll partial(2) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + + !$omp tile sizes (8) + !$omp unroll partial(1) + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + + !$omp parallel do collapse(2) ! { dg-error {missing canonical loop nest after \!\$OMP PARALLEL DO at \(1\)} } + !$omp tile sizes (8,8) ! { dg-error {missing canonical loop nest after \!\$OMP TILE at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end function mult diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 new file mode 100644 index 00000000000..4cfac4c5e26 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 @@ -0,0 +1,277 @@ +subroutine test1 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end do +end subroutine test3 + +subroutine test4 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end do +end subroutine test4 + +subroutine test5 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test5 + +subroutine test6 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test6 + +subroutine test7 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test7 + +subroutine test8 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll +end subroutine test8 + +subroutine test9 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test9 + +subroutine test10 + implicit none + integer :: i + + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test10 + +subroutine test11 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test11 + +subroutine test12 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + call dummy(i) ! { dg-error {Unexpected CALL statement at \(1\)} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test12 + +subroutine test13 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + call dummy(i) + end do +end subroutine test13 + +subroutine test14 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test14 + +subroutine test15 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test15 + +subroutine test16 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test16 + +subroutine test17 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(2) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test17 + +subroutine test18 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(0) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test18 + +subroutine test19 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(-10) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test19 + +subroutine test20 + implicit none + integer :: i + + !$omp do + !$omp unroll partial + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test20 + +subroutine test21 + implicit none + integer :: i + + !$omp unroll partial ! { dg-error {\!\$OMP UNROLL invalid around DO CONCURRENT loop at \(1\)} } + do concurrent (i = 1:100) + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do + !$omp end unroll +end subroutine test21 + +subroutine test22 + implicit none + integer :: i + + !$omp do + !$omp unroll partial + do concurrent (i = 1:100) ! { dg-error {\!\$OMP DO cannot be a DO CONCURRENT loop at \(1\)} } + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do + !$omp end unroll +end subroutine test22 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 new file mode 100644 index 00000000000..2c4a45d3054 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 @@ -0,0 +1,7 @@ +subroutine test(i) + ! TODO The checking that produces this message comes too late. Not important, but would be nice to have. + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} "" { xfail *-*-* } } + call dummy0 ! { dg-error {Unexpected CALL statement at \(1\)} } +end subroutine test ! { dg-error {Unexpected END statement at \(1\)} } + +! { dg-error "Unexpected end of file" "" { target "*-*-*" } 0 } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 new file mode 100644 index 00000000000..3f0d5981e9b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 @@ -0,0 +1,75 @@ +subroutine test1(i) + implicit none + integer :: i + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +subroutine test2(i) + implicit none + integer :: i + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test2 + +subroutine test3(i) + implicit none + integer :: i + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll full + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test3 + +subroutine test4(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,10 + call dummy(i) + end do +end subroutine test4 + +subroutine test5(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test5 + +subroutine test6(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test6 + +subroutine test7(i) + implicit none + integer :: i + !$omp loop ! { dg-error {missing canonical loop nest after \!\$OMP LOOP at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test7 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 new file mode 100644 index 00000000000..0d8f3f5a2c0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 @@ -0,0 +1,29 @@ +subroutine test1 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO WHILE or DO without loop control at \(1\)} } + do while (i < 10) + call dummy(i) + i = i + 1 + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO WHILE or DO without loop control at \(1\)} } + do + call dummy(i) + i = i + 1 + if (i >= 10) exit + end do +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO CONCURRENT loop at \(1\)} } + do concurrent (i=1:10) + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 new file mode 100644 index 00000000000..8496f9eefe0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + !$omp unroll full + do i = 1,10 + call dummy(i) + end do +end subroutine test2 + +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_none" 1 "original" } } +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_full" 1 "original" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 new file mode 100644 index 00000000000..0d233c9ab6f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 @@ -0,0 +1,17 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll full + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +! Loop should be removed with 10 copies of the body remaining + +! { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } +! { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 new file mode 100644 index 00000000000..fcccdb0bcf8 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should not be unrolled, but the internal representation should be lowered + +! { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 new file mode 100644 index 00000000000..ee82b4d150c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll partial ! { dg-optimized {'partial' clause without unrolling factor turned into 'partial\(5\)' clause} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 5 times and the internal representation should be lowered. + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 5 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 new file mode 100644 index 00000000000..237e6b83087 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 @@ -0,0 +1,19 @@ +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll partial ! { dg-optimized {'partial' clause without unrolling factor turned into 'partial\(10\)' clause} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 10 times and the internal representation should be lowered. + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 new file mode 100644 index 00000000000..8feaf7dc4d3 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 @@ -0,0 +1,62 @@ +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i,j + !$omp parallel do + !$omp unroll partial(10) + do i = 1,100 + !$omp parallel do + do j = 1,100 + call dummy(i,j) + end do + end do + + !$omp taskloop + !$omp unroll partial(10) + do i = 1,100 + !$omp parallel do + do j = 1,100 + call dummy(i,j) + end do + end do + +end subroutine test1 + +! For the "parallel do", there should be 11 "omp for" loops, 10 for the inner loop, 1 for outer, +! for the "taskloop", there should be 10 "omp for" loops for the unrolled loop +! { dg-final { scan-tree-dump-times {#pragma omp for} 21 "omp_transform_loops" } } +! ... and two outer taskloops plus the one taskloops +! { dg-final { scan-tree-dump-times {#pragma omp taskloop} 3 "omp_transform_loops" } } + + +subroutine test2 + implicit none + integer :: i,j + do i = 1,100 + !$omp teams distribute + !$omp unroll partial(10) + do j = 1,100 + call dummy(i,j) + end do + end do + + do i = 1,100 + !$omp target teams distribute + !$omp unroll partial(10) + do j = 1,100 + call dummy(i,j) + end do + end do +end subroutine test2 + +! { dg-final { scan-tree-dump-times {#pragma omp distribute} 2 "omp_transform_loops" } } + +! After unrolling there should be 10 copies of each loop body for each loop-nest +! { dg-final { scan-tree-dump-times "dummy" 40 "omp_transform_loops" } } + +! { dg-final { scan-tree-dump-not {#pragma omp loop_transform} "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(10\)} 1 "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp distribute private\(j\) unroll_partial\(10\)} 2 "original" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 new file mode 100644 index 00000000000..dab3f0fb5cf --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp parallel do collapse(1) + !$omp unroll partial(4) ! { dg-optimized {replaced consecutive 'omp unroll' directives by 'omp unroll partial\(24\)'} } + !$omp unroll partial(3) + !$omp unroll partial(2) + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 1 * 2 * 3 * 4 = 24 times + +! { dg-final { scan-tree-dump {#pragma omp for nowait collapse\(1\) unroll_partial\(4\).0 unroll_partial\(3\).0 unroll_partial\(2\).0 unroll_partial\(1\)} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 24 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {#pragma omp for} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 new file mode 100644 index 00000000000..91e13ff1b37 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll full ! { dg-optimized {removed useless 'omp unroll partial' directives preceding 'omp unroll full'} } + !$omp unroll partial(3) + !$omp unroll partial(2) + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_full.0 unroll_partial\(3\).0 unroll_partial\(2\).0 unroll_partial\(1\).0} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp unroll" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 100 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 new file mode 100644 index 00000000000..efcc691185d --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 @@ -0,0 +1,57 @@ +subroutine test1a + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test1a + +subroutine test1b + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test1b + +subroutine test2a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2a + +subroutine test2b + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2b diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 new file mode 100644 index 00000000000..079c0fdd75b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 @@ -0,0 +1,20 @@ +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do i = 1,5 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll + do i = 1,6 + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 new file mode 100644 index 00000000000..4893ba46e4e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 @@ -0,0 +1,21 @@ +! { dg-additional-options "--param=omp-unroll-full-max-iterations=20" } +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do i = 1,20 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll + do i = 1,21 + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 new file mode 100644 index 00000000000..60f25d3abe6 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 @@ -0,0 +1,23 @@ +! { dg-additional-options "--param=omp-unroll-full-max-iterations=10" } +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {added 'partial\(10\)' clause to 'omp unroll' directive} } + do i = 1,20 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll ! { dg-optimized {added 'partial\(10\)' clause to 'omp unroll' directive} } + do i = 1,21 + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 new file mode 100644 index 00000000000..3da99158cc0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 @@ -0,0 +1,31 @@ +subroutine test + implicit none + + integer :: i, j, k + !$omp target parallel do collapse(2) ! { dg-error {invalid OpenMP non-rectangular loop step; '\(2 - 1\) \* 1' is not a multiple of loop 2 step '5'} } + do i = -300, 100 + !$omp unroll partial + do j = i,i*2 + call dummy (i) + end do + end do + + !$omp target parallel do collapse(3) ! { dg-error {invalid OpenMP non-rectangular loop step; '\(2 - 1\) \* 1' is not a multiple of loop 3 step '5'} } + do i = -300, 100 + do j = 1,10 + !$omp unroll partial + do k = j,j*2 + 1 + call dummy (i) + end do + end do + end do + + !$omp unroll full + do i = -3, 5 + do j = 1,10 + do k = j,j*2 + 1 + call dummy (i) + end do + end do + end do +end subroutine diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 new file mode 100644 index 00000000000..f22debbb78f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 @@ -0,0 +1,244 @@ +! { dg-options "-fno-openmp -fopenmp-simd" } + +subroutine test1 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end do +end subroutine test3 + +subroutine test4 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end do +end subroutine test4 + +subroutine test5 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test5 + +subroutine test6 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test6 + +subroutine test7 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll +end subroutine test7 + +subroutine test8 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test8 + +subroutine test9 + implicit none + integer :: i + + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test9 + +subroutine test10 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test10 + +subroutine test11 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + call dummy(i) ! { dg-error {Unexpected CALL statement at \(1\)} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test11 + +subroutine test12 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + call dummy(i) + end do +end subroutine test12 + +subroutine test13 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test13 + +subroutine test14 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test14 + +subroutine test15 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test15 + +subroutine test16 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(2) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test16 + +subroutine test17 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(0) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test17 + +subroutine test18 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(-10) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test18 + +subroutine test19 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test19 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 new file mode 100644 index 00000000000..faaa37c5d7e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 @@ -0,0 +1,57 @@ +! { dg-do run } +! { dg-options "-O2 -fopenmp-simd" } +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + !$omp simd + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + !$omp simd + !$omp unroll partial(2) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program + +! { dg-final { scan-tree-dump {omp loop_transform} "original" } } +! { dg-final { scan-tree-dump-not {omp loop_transform} "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 new file mode 100644 index 00000000000..20617e25105 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 @@ -0,0 +1,37 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do + !$omp unroll partial(1) + !$omp tile sizes (8,8) + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end function mult + +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(1\)@0 tile sizes\(8, 8\)@0} 1 "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } + +! Tiling adds two floor and two tile loops. + +! Number of conditional statements after tiling: +! 5 +! = 2 (lowering of 2 tile loops) +! + 1 (partial tile handling in 2 tile loops) +! + 1 (lowering of non-associated floor loop) + +! The unrolling with unroll factor 1 currently gets executed (TODO could/should be skipped?) + +! { dg-final { scan-tree-dump-times {if \([A-Za-z0-9_.]+ < } 5 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 new file mode 100644 index 00000000000..c1e7f356a87 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 @@ -0,0 +1,41 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + c = 0 + + !$omp target + !$omp parallel do + !$omp unroll partial(2) + !$omp tile sizes (8,8,4) + do i = 1,m + do j = 1,n + do k = 1, n + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + !$omp end target +end function mult + +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(2\)@0 tile sizes\(8, 8, 4\)@0} 1 "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } + +! Check the number of loops + +! Tiling adds three tile and three floor loops. +! The outermost floor loop is associated with the "!$omp parallel do" +! and hence it isn't lowered in the transformation pass. +! Number of conditional statements after tiling: +! 8 +! = 2 (inner floor loop lowering) +! + 3 (partial tile handling in 3 tile loops) +! + 3 (lowering of 3 tile loops) +! +! Unrolling creates 2 copies of the tiled loop nest. + +! { dg-final { scan-tree-dump-times {if \([A-Za-z0-9_.]+ < } 16 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 new file mode 100644 index 00000000000..bc7a890df17 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 @@ -0,0 +1,25 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do collapse(2) + !$omp tile sizes (8,8) + do i = 1,m + do j = 1,n + inner = 0 + !$omp unroll partial(10) + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end function mult + +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_partial" 1 "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/pure-1.f90 b/gcc/testsuite/gfortran.dg/gomp/pure-1.f90 index 598e455d2e9..eadf34a022a 100644 --- a/gcc/testsuite/gfortran.dg/gomp/pure-1.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/pure-1.f90 @@ -86,3 +86,29 @@ pure integer function func_simd(n) end do func_simd = r end + +!pure integer function func_unroll(n) +integer function func_unroll(n) + implicit none + integer, value :: n + integer :: j, r + r = 0 + !$omp unroll partial(2) + do j = 1, n + r = r + j + end do + func_unroll = r +end + +!pure integer function func_tile(n) +integer function func_tile(n) + implicit none + integer, value :: n + integer :: j, r + r = 0 + !$omp tile sizes(2) + do j = 1, n + r = r + j + end do + func_tile = r +end diff --git a/gcc/testsuite/gfortran.dg/gomp/pure-2.f90 b/gcc/testsuite/gfortran.dg/gomp/pure-2.f90 index 1e3cf8c9416..35503c6a284 100644 --- a/gcc/testsuite/gfortran.dg/gomp/pure-2.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/pure-2.f90 @@ -46,28 +46,3 @@ logical function func_reverse(n) end do end -!pure integer function func_unroll(n) -integer function func_unroll(n) - implicit none - integer, value :: n - integer :: j, r - r = 0 - !$omp unroll partial(2) ! { dg-error "Unclassifiable OpenMP directive" } - do j = 1, n - r = r + j - end do - func_unroll = r -end - -!pure integer function func_tile(n) -integer function func_tile(n) - implicit none - integer, value :: n - integer :: j, r - r = 0 - !$omp tile sizes(2) ! { dg-error "Unclassifiable OpenMP directive" } - do j = 1, n - r = r + j - end do - func_tile = r -end diff --git a/libgomp/testsuite/libgomp.fortran/imperfect-transform-1.f90 b/libgomp/testsuite/libgomp.fortran/imperfect-transform-1.f90 new file mode 100644 index 00000000000..aa956707414 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/imperfect-transform-1.f90 @@ -0,0 +1,70 @@ +! { dg-do run } + +! Like imperfect1.f90, but also includes loop transforms. + +program foo + integer, save :: f1count(3), f2count(3) + + f1count(1) = 0 + f1count(2) = 0 + f1count(3) = 0 + f2count(1) = 0 + f2count(2) = 0 + f2count(3) = 0 + + call s1 (3, 4, 5) + + ! All intervening code at the same depth must be executed the same + ! number of times. + if (f1count(1) /= f2count(1)) error stop 101 + if (f1count(2) /= f2count(2)) error stop 102 + if (f1count(3) /= f2count(3)) error stop 103 + + ! Intervening code must be executed at least as many times as the loop + ! that encloses it. + if (f1count(1) < 3) error stop 111 + if (f1count(2) < 3 * 4) error stop 112 + + ! Intervening code must not be executed more times than the number + ! of logical iterations. + if (f1count(1) > 3 * 4 * 5) error stop 121 + if (f1count(2) > 3 * 4 * 5) error stop 122 + + ! Check that the innermost loop body is executed exactly the number + ! of logical iterations expected. + if (f1count(3) /= 3 * 4 * 5) error stop 131 + +contains + +subroutine f1 (depth, iter) + integer :: depth, iter + f1count(depth) = f1count(depth) + 1 +end subroutine + +subroutine f2 (depth, iter) + integer :: depth, iter + f2count(depth) = f2count(depth) + 1 +end subroutine + +subroutine s1 (a1, a2, a3) + integer :: a1, a2, a3 + integer :: i, j, k + + !$omp do collapse(3) + do i = 1, a1 + call f1 (1, i) + do j = 1, a2 + call f1 (2, j) + !$omp unroll partial + do k = 1, a3 + call f1 (3, k) + call f2 (3, k) + end do + call f2 (2, j) + end do + call f2 (1, i) + end do + +end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.fortran/imperfect-transform-2.f90 b/libgomp/testsuite/libgomp.fortran/imperfect-transform-2.f90 new file mode 100644 index 00000000000..be199ab9218 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/imperfect-transform-2.f90 @@ -0,0 +1,70 @@ +! { dg-do run } + +! Like imperfect1.f90, but also includes loop transforms. + +program foo + integer, save :: f1count(3), f2count(3) + + f1count(1) = 0 + f1count(2) = 0 + f1count(3) = 0 + f2count(1) = 0 + f2count(2) = 0 + f2count(3) = 0 + + call s1 (3, 4, 5) + + ! All intervening code at the same depth must be executed the same + ! number of times. + if (f1count(1) /= f2count(1)) error stop 101 + if (f1count(2) /= f2count(2)) error stop 102 + if (f1count(3) /= f2count(3)) error stop 103 + + ! Intervening code must be executed at least as many times as the loop + ! that encloses it. + if (f1count(1) < 3) error stop 111 + if (f1count(2) < 3 * 4) error stop 112 + + ! Intervening code must not be executed more times than the number + ! of logical iterations. + if (f1count(1) > 3 * 4 * 5) error stop 121 + if (f1count(2) > 3 * 4 * 5) error stop 122 + + ! Check that the innermost loop body is executed exactly the number + ! of logical iterations expected. + if (f1count(3) /= 3 * 4 * 5) error stop 131 + +contains + +subroutine f1 (depth, iter) + integer :: depth, iter + f1count(depth) = f1count(depth) + 1 +end subroutine + +subroutine f2 (depth, iter) + integer :: depth, iter + f2count(depth) = f2count(depth) + 1 +end subroutine + +subroutine s1 (a1, a2, a3) + integer :: a1, a2, a3 + integer :: i, j, k + + !$omp do collapse(3) + do i = 1, a1 + call f1 (1, i) + do j = 1, a2 + call f1 (2, j) + !$omp tile sizes(5) + do k = 1, a3 + call f1 (3, k) + call f2 (3, k) + end do + call f2 (2, j) + end do + call f2 (1, i) + end do + +end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 new file mode 100644 index 00000000000..1db97feb34d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 @@ -0,0 +1,77 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + !$omp target parallel do collapse(2) private(inner) map(to:a,b) map(from:c) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8) + do j = 1,n + !$omp unroll partial(10) + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + if (k == n) then + c(j, i) = inner + endif + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/nested-fn.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/nested-fn.f90 new file mode 100644 index 00000000000..dc70c9228fd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/nested-fn.f90 @@ -0,0 +1,19 @@ +! { dg-do run } + +program foo + integer :: count +contains + +subroutine s1 () + integer :: i, count + + count = 0 + + !$omp target parallel do + !$omp unroll partial + do i = 1, 100 + end do + +end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 new file mode 100644 index 00000000000..bb48c31224e --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 @@ -0,0 +1,71 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 new file mode 100644 index 00000000000..a7cb5e7635d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 @@ -0,0 +1,117 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-do run } + +module test_functions + contains + integer function compute_sum1() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp tile sizes(2) + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp tile sizes(16) + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum3() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp tile sizes(100) + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum4() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + !$omp tile sizes(6,10) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum5() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp parallel do collapse(2) reduction(+:sum) + !$omp tile sizes(6,10) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum3 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum4 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum5 () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 new file mode 100644 index 00000000000..2f2f014ead9 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 @@ -0,0 +1,112 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(10) + !$omp tile sizes(1, 3) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult + + function mult2 (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes(1,2) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult2 + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + ! c = mult (a, b) + + ! call print_matrix (a) + ! call print_matrix (b) + ! call print_matrix (c) + + ! do i = 1,n + ! do j = 1,m + ! if (b(i,j) .ne. c(i,j)) call abort () + ! end do + ! end do + + + c = mult2 (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 new file mode 100644 index 00000000000..1b5b623b838 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 @@ -0,0 +1,71 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function copy (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes (1,5) + do i = 1,10 + do j = 1,n + c(j,i) = c(j,i) + a(j, i) + end do + end do + end function copy + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = 1 + end do + end do + + c = copy (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (c(i,j) .ne. a(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 new file mode 100644 index 00000000000..518968f1335 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 @@ -0,0 +1,77 @@ +module matrix + implicit none + integer :: n = 4 + integer :: m = 4 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + ! omp do private(inner) + do i = 1,m + !$omp unroll partial(4) + !$omp tile sizes (5) + do j = 1,n + do k = 1, n + write (*,*) "i", i, "j", j, "k", k + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + if (k == n) then + c(j, i) = inner + endif + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 new file mode 100644 index 00000000000..807135df5e8 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 @@ -0,0 +1,75 @@ +module matrix + implicit none + integer :: n = 4 + integer :: m = 4 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,m + do j = 1,n + c(j, i) = 0 + end do + end do + + !$omp parallel do + do i = 1,m + !$omp tile sizes (5,2) + do j = 1,n + do k = 1, n + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 new file mode 100644 index 00000000000..b91ea275577 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 @@ -0,0 +1,54 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-do run } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp parallel do reduction(+:sum) + !$omp unroll partial(2) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 new file mode 100644 index 00000000000..2ce44d4d044 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 @@ -0,0 +1,88 @@ +! { dg-additional-options "-fdump-tree-original -g" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 1,10,3 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum2() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = -20,1,3 + sum = sum + 1 + end do + end function compute_sum2 + + + integer function compute_sum3() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 30,1,-3 + sum = sum + 1 + end do + end function compute_sum3 + + + integer function compute_sum4() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 50,-60,-10 + sum = sum + 1 + end do + end function compute_sum4 + +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 4) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 8) then + call abort + end if + + result = compute_sum3 () + write (*,*) result + if (result .ne. 10) then + call abort + end if + + result = compute_sum4 () + write (*,*) result + if (result .ne. 12) then + call abort + end if + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 new file mode 100644 index 00000000000..55e5cc568a5 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 @@ -0,0 +1,59 @@ +! Test lowering of the internal representation of "omp unroll" loops +! which are not unrolled. + +! { dg-additional-options "-O0" } +! { dg-additional-options "--param=omp-unroll-full-max-iterations=0" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll + do i = 0,50 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum3 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + do i = 0,n,step + sum = sum + 1 + end do + end function compute_sum3 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + if (result .ne. 51) then + call abort + end if + + result = compute_sum3 (1, 100) + if (result .ne. 101) then + call abort + end if + + result = compute_sum3 (2, 100) + if (result .ne. 51) then + call abort + end if + + result = compute_sum3 (-2, -100) + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 new file mode 100644 index 00000000000..52a214f1049 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 @@ -0,0 +1,72 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll partial(2) + do i = 1,50 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum3 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum3 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum3 (1, 100) + write (*,*) result + if (result .ne. 100) then + call abort + end if + + result = compute_sum3 (1, 9) + write (*,*) result + if (result .ne. 9) then + call abort + end if + + result = compute_sum3 (2, 96) + write (*,*) result + if (result .ne. 48) then + call abort + end if + + result = compute_sum3 (-2, -98) + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum3 (-2, -100) + write (*,*) result + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 new file mode 100644 index 00000000000..d6a4e739675 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 @@ -0,0 +1,55 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum4 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp do + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum4 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum4 (1, 100) + write (*,*) result + if (result .ne. 100) then + call abort + end if + + result = compute_sum4 (1, 9) + write (*,*) result + if (result .ne. 9) then + call abort + end if + + result = compute_sum4 (2, 96) + write (*,*) result + if (result .ne. 48) then + call abort + end if + + result = compute_sum4 (-2, -98) + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum4 (-2, -100) + write (*,*) result + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 new file mode 100644 index 00000000000..b953ce31b5b --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 @@ -0,0 +1,105 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum4 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum4 + + integer function compute_sum5 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + !$omp unroll partial(5) ! { dg-optimized {replaced consecutive 'omp unroll' directives by 'omp unroll partial\(50\)'} } + !$omp unroll partial(10) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum5 + + integer function compute_sum6 (step,n) result(sum) + implicit none + integer :: i, j, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + do i = 1,n,step + !$omp unroll full ! { dg-optimized {removed useless 'omp unroll partial' directives preceding 'omp unroll full'} } + !$omp unroll partial(10) + do j = 1, 1000 + sum = sum + 1 + end do + end do + end function compute_sum6 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum4 (1, 100) + if (result .ne. 100) then + call abort + end if + + result = compute_sum4 (1, 9) + if (result .ne. 9) then + call abort + end if + + result = compute_sum4 (2, 96) + if (result .ne. 48) then + call abort + end if + + result = compute_sum4 (-2, -98) + if (result .ne. 50) then + call abort + end if + + result = compute_sum4 (-2, -100) + if (result .ne. 51) then + call abort + end if + + result = compute_sum5 (1, 100) + if (result .ne. 100) then + call abort + end if + + result = compute_sum5 (1, 9) + if (result .ne. 9) then + call abort + end if + + result = compute_sum5 (2, 96) + if (result .ne. 48) then + call abort + end if + + result = compute_sum5 (-2, -98) + if (result .ne. 50) then + call abort + end if + + result = compute_sum5 (-2, -100) + if (result .ne. 51) then + call abort + end if + + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 new file mode 100644 index 00000000000..d25f18002ae --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 @@ -0,0 +1,198 @@ +! { dg-additional-options "-O0 -cpp" } +! { dg-do run } + +#ifndef UNROLL_FACTOR +#define UNROLL_FACTOR 1 +#endif +module test_functions +contains + subroutine copy (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp parallel do + !$omp unroll partial(UNROLL_FACTOR) + do i = 1, 100 + array1(i) = array2(i) + end do + end subroutine + + subroutine copy2 (array1, array2) + implicit none + + integer :: array1(100) + integer :: array2(100) + integer :: i + + !$omp parallel do + !$omp unroll partial(UNROLL_FACTOR) + do i = 0,99 + array1(i+1) = array2(i+1) + end do + end subroutine copy2 + + subroutine copy3 (array1, array2) + implicit none + + integer :: array1(100) + integer :: array2(100) + integer :: i + + !$omp parallel do lastprivate(i) + !$omp unroll partial(UNROLL_FACTOR) + do i = -49,50 + if (i < 0) then + array1((-1)*i) = array2((-1)*i) + else + array1(50+i) = array2(50+i) + endif + end do + end subroutine copy3 + + subroutine copy4 (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = 2, 200, 2 + array1(i/2) = array2(i/2) + end do + end subroutine copy4 + + subroutine copy5 (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = 200, 2, -2 + array1(i/2) = array2(i/2) + end do + end subroutine + + subroutine copy6 (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = lower, upper, step + array1 (i) = array2(i) + end do + end subroutine + + subroutine prepare (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + + array1 = 2 + array2 = 0 + end subroutine + + subroutine check_equal (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + do i=1,100 + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do + end subroutine + + subroutine check_equal_at_steps (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i + + do i=lower, upper, step + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do + end subroutine + + subroutine check_unchanged_at_non_steps (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i, j + + do i=lower, upper,step + do j=i,i+step-1 + if (array2(j) /= 0) then + write (*,*) i + call abort + end if + end do + end do + end subroutine +end module test_functions + +program test + use test_functions + implicit none + + integer :: array1(100), array2(100) + + call prepare (array1, array2) + call copy (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy2 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy3 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy4 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy5 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy6 (array1, array2, 1, 100, 5) + call check_equal_at_steps (array1, array2, 1, 100, 5) + call check_unchanged_at_non_steps (array1, array2, 1, 100, 5) + + call prepare (array1, array2) + call copy6 (array1, array2, 1, 50, 5) + call check_equal_at_steps (array1, array2, 1, 50, 5) + call check_unchanged_at_non_steps (array1, array2, 1, 50, 5) + + call prepare (array1, array2) + call copy6 (array1, array2, 3, 18, 7) + call check_equal_at_steps (array1, array2, 3 , 18, 7) + call check_unchanged_at_non_steps (array1, array2, 3, 18, 7) +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 new file mode 100644 index 00000000000..02328464c0d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that divides the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 5 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 new file mode 100644 index 00000000000..60866ef33fd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that does not divide the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 3 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 new file mode 100644 index 00000000000..6d8a2ef7bc0 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that is larger than the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 113 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 new file mode 100644 index 00000000000..40506025aa3 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 @@ -0,0 +1,38 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + subroutine copy (array1, array2, step, n) + implicit none + + integer :: array1(n) + integer :: array2(n) + integer :: i, step, n + + call omp_set_num_threads (4) + !$omp parallel do shared(array1) shared(array2) schedule(static, 4) + !$omp unroll partial(2) + do i = 1,n + array1(i) = array2(i) + end do + end subroutine +end module test_functions + +program test + use test_functions + implicit none + + integer :: array1(100), array2(100) + integer :: i + + array1 = 2 + call copy(array1, array2, 1, 100) + do i=1,100 + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 new file mode 100644 index 00000000000..7a43458f0dd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 @@ -0,0 +1,34 @@ +! { dg-options "-fno-openmp -fopenmp-simd" } +! { dg-additional-options "-fdump-tree-original" } +! { dg-do run } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp simd reduction(+:sum) + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function compute_sum +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 new file mode 100644 index 00000000000..2f2f014ead9 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 @@ -0,0 +1,112 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(10) + !$omp tile sizes(1, 3) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult + + function mult2 (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes(1,2) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult2 + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + ! c = mult (a, b) + + ! call print_matrix (a) + ! call print_matrix (b) + ! call print_matrix (c) + + ! do i = 1,n + ! do j = 1,m + ! if (b(i,j) .ne. c(i,j)) call abort () + ! end do + ! end do + + + c = mult2 (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 new file mode 100644 index 00000000000..1b5b623b838 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 @@ -0,0 +1,71 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function copy (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes (1,5) + do i = 1,10 + do j = 1,n + c(j,i) = c(j,i) + a(j, i) + end do + end do + end function copy + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = 1 + end do + end do + + c = copy (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (c(i,j) .ne. a(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-1.f90 b/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-1.f90 new file mode 100644 index 00000000000..34b6e075e05 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-1.f90 @@ -0,0 +1,73 @@ +! { dg-do run } + +! Like imperfect-transform.f90, but enables offloading. + +program foo + integer, save :: f1count(3), f2count(3) + !$omp declare target enter (f1count, f2count) + + f1count(1) = 0 + f1count(2) = 0 + f1count(3) = 0 + f2count(1) = 0 + f2count(2) = 0 + f2count(3) = 0 + + call s1 (3, 4, 5) + + ! All intervening code at the same depth must be executed the same + ! number of times. + if (f1count(1) /= f2count(1)) error stop 101 + if (f1count(2) /= f2count(2)) error stop 102 + if (f1count(3) /= f2count(3)) error stop 103 + + ! Intervening code must be executed at least as many times as the loop + ! that encloses it. + if (f1count(1) < 3) error stop 111 + if (f1count(2) < 3 * 4) error stop 112 + + ! Intervening code must not be executed more times than the number + ! of logical iterations. + if (f1count(1) > 3 * 4 * 5) error stop 121 + if (f1count(2) > 3 * 4 * 5) error stop 122 + + ! Check that the innermost loop body is executed exactly the number + ! of logical iterations expected. + if (f1count(3) /= 3 * 4 * 5) error stop 131 + +contains + +subroutine f1 (depth, iter) + integer :: depth, iter + !$omp atomic + f1count(depth) = f1count(depth) + 1 +end subroutine + +subroutine f2 (depth, iter) + integer :: depth, iter + !$omp atomic + f2count(depth) = f2count(depth) + 1 +end subroutine + +subroutine s1 (a1, a2, a3) + integer :: a1, a2, a3 + integer :: i, j, k + + !$omp target parallel do collapse(3) map(always, tofrom:f1count, f2count) + do i = 1, a1 + call f1 (1, i) + do j = 1, a2 + call f1 (2, j) + !$omp unroll partial + do k = 1, a3 + call f1 (3, k) + call f2 (3, k) + end do + call f2 (2, j) + end do + call f2 (1, i) + end do + +end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-2.f90 b/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-2.f90 new file mode 100644 index 00000000000..188cca1e5b4 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-2.f90 @@ -0,0 +1,73 @@ +! { dg-do run } + +! Like imperfect-transform.f90, but enables offloading. + +program foo + integer, save :: f1count(3), f2count(3) + !$omp declare target enter (f1count, f2count) + + f1count(1) = 0 + f1count(2) = 0 + f1count(3) = 0 + f2count(1) = 0 + f2count(2) = 0 + f2count(3) = 0 + + call s1 (3, 4, 5) + + ! All intervening code at the same depth must be executed the same + ! number of times. + if (f1count(1) /= f2count(1)) error stop 101 + if (f1count(2) /= f2count(2)) error stop 102 + if (f1count(3) /= f2count(3)) error stop 103 + + ! Intervening code must be executed at least as many times as the loop + ! that encloses it. + if (f1count(1) < 3) error stop 111 + if (f1count(2) < 3 * 4) error stop 112 + + ! Intervening code must not be executed more times than the number + ! of logical iterations. + if (f1count(1) > 3 * 4 * 5) error stop 121 + if (f1count(2) > 3 * 4 * 5) error stop 122 + + ! Check that the innermost loop body is executed exactly the number + ! of logical iterations expected. + if (f1count(3) /= 3 * 4 * 5) error stop 131 + +contains + +subroutine f1 (depth, iter) + integer :: depth, iter + !$omp atomic + f1count(depth) = f1count(depth) + 1 +end subroutine + +subroutine f2 (depth, iter) + integer :: depth, iter + !$omp atomic + f2count(depth) = f2count(depth) + 1 +end subroutine + +subroutine s1 (a1, a2, a3) + integer :: a1, a2, a3 + integer :: i, j, k + + !$omp target parallel do collapse(3) map(always, tofrom:f1count, f2count) + do i = 1, a1 + call f1 (1, i) + do j = 1, a2 + call f1 (2, j) + !$omp tile sizes(5) + do k = 1, a3 + call f1 (3, k) + call f2 (3, k) + end do + call f2 (2, j) + end do + call f2 (1, i) + end do + +end subroutine + +end program