From patchwork Sun Oct 1 20:10:18 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sandra Loosemore
X-Patchwork-Id: 147152
From: Sandra Loosemore
Subject: [WIP 1/4] openacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILE
Date: Sun, 1 Oct 2023 14:10:18 -0600
Message-ID: <20231001201021.785572-2-sandra@codesourcery.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231001201021.785572-1-sandra@codesourcery.com>
References: <20231001201021.785572-1-sandra@codesourcery.com>
List-Id: Gcc-patches mailing list

From: Frederik Harwath

OMP_CLAUSE_TILE will be used for the OpenMP 5.1 loop transformation
construct "omp tile".

gcc/ChangeLog:

	* tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TILE to
	OMP_CLAUSE_OACC_TILE.
	* tree.h (OMP_CLAUSE_TILE_LIST): Rename to ...
	(OMP_CLAUSE_OACC_TILE_LIST): ... this.
	(OMP_CLAUSE_TILE_ITERVAR): Rename to ...
	(OMP_CLAUSE_OACC_TILE_ITERVAR): ... this.
	(OMP_CLAUSE_TILE_COUNT): Rename to ...
	(OMP_CLAUSE_OACC_TILE_COUNT): ... this.
	* gimplify.cc (gimplify_scan_omp_clauses): Adjust to renamings.
	(gimplify_adjust_omp_clauses): Likewise.
	(gimplify_omp_for): Likewise.
	* omp-general.cc (omp_extract_for_data): Likewise.
	* omp-low.cc (scan_sharing_clauses): Likewise.
	(lower_oacc_head_mark): Likewise.
	* tree-nested.cc (convert_nonlocal_omp_clauses): Likewise.
(convert_local_omp_clauses): Likewise. * tree-pretty-print.cc (dump_omp_clause): Likewise. * tree.cc: Likewise. gcc/c-family/ChangeLog: * c-omp.cc (c_oacc_split_loop_clauses): Adjust to renamings. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_collapse): Adjust to renamings. (c_parser_oacc_clause_tile): Likewise. (c_parser_omp_for_loop): Likewise. * c-typeck.cc (c_finish_omp_clauses): Likewise. gcc/cp/ChangeLog: * parser.cc (cp_parser_oacc_clause_tile): Adjust to renamings. (cp_parser_omp_clause_collapse): Likewise. (cp_parser_omp_for_loop): Likewise. * pt.cc (tsubst_omp_clauses): Likewise. * semantics.cc (finish_omp_clauses): Likewise. (finish_omp_for): Likewise. gcc/fortran/ChangeLog: * openmp.cc (enum omp_mask2): Adjust to renamings. (gfc_match_omp_clauses): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Likewise. --- gcc/c-family/c-omp.cc | 2 +- gcc/c/c-parser.cc | 12 ++++++------ gcc/c/c-typeck.cc | 2 +- gcc/cp/parser.cc | 12 ++++++------ gcc/cp/pt.cc | 2 +- gcc/cp/semantics.cc | 8 ++++---- gcc/fortran/openmp.cc | 6 +++--- gcc/fortran/trans-openmp.cc | 4 ++-- gcc/gimplify.cc | 8 ++++---- gcc/omp-general.cc | 8 ++++---- gcc/omp-low.cc | 6 +++--- gcc/tree-core.h | 2 +- gcc/tree-nested.cc | 4 ++-- gcc/tree-pretty-print.cc | 4 ++-- gcc/tree.cc | 2 +- gcc/tree.h | 12 ++++++------ 16 files changed, 47 insertions(+), 47 deletions(-) diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index 95b6c1e623f..5de3b77c450 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ -1899,7 +1899,7 @@ c_oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses, { /* Loop clauses. 
*/ case OMP_CLAUSE_COLLAPSE: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 0d468b86bd8..e6342d2188d 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -14658,7 +14658,7 @@ c_parser_omp_clause_collapse (c_parser *parser, tree list) location_t loc; check_no_duplicate_clause (list, OMP_CLAUSE_COLLAPSE, "collapse"); - check_no_duplicate_clause (list, OMP_CLAUSE_TILE, "tile"); + check_no_duplicate_clause (list, OMP_CLAUSE_OACC_TILE, "tile"); loc = c_parser_peek_token (parser)->location; matching_parens parens; @@ -15842,7 +15842,7 @@ c_parser_oacc_clause_tile (c_parser *parser, tree list) location_t loc; tree tile = NULL_TREE; - check_no_duplicate_clause (list, OMP_CLAUSE_TILE, "tile"); + check_no_duplicate_clause (list, OMP_CLAUSE_OACC_TILE, "tile"); check_no_duplicate_clause (list, OMP_CLAUSE_COLLAPSE, "collapse"); loc = c_parser_peek_token (parser)->location; @@ -15894,9 +15894,9 @@ c_parser_oacc_clause_tile (c_parser *parser, tree list) /* Consume the trailing ')'. 
*/ c_parser_consume_token (parser); - c = build_omp_clause (loc, OMP_CLAUSE_TILE); + c = build_omp_clause (loc, OMP_CLAUSE_OACC_TILE); tile = nreverse (tile); - OMP_CLAUSE_TILE_LIST (c) = tile; + OMP_CLAUSE_OACC_TILE_LIST (c) = tile; OMP_CLAUSE_CHAIN (c) = list; return c; } @@ -21137,10 +21137,10 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl)) if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE) collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (cl)); - else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_TILE) + else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_OACC_TILE) { tiling = true; - collapse = list_length (OMP_CLAUSE_TILE_LIST (cl)); + collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (cl)); } else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_ORDERED && OMP_CLAUSE_ORDERED_EXPR (cl)) diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index e55e887da14..54a5f208cb5 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -15876,7 +15876,7 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_IF_PRESENT: case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_NOHOST: diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index f3abae716fe..defb81ca8c1 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -38279,7 +38279,7 @@ cp_parser_oacc_clause_tile (cp_parser *parser, location_t clause_loc, tree list) so, but the spec authors never considered such a case and have differing opinions on what it might mean, including 'not allowed'.) 
*/ - check_no_duplicate_clause (list, OMP_CLAUSE_TILE, "tile", clause_loc); + check_no_duplicate_clause (list, OMP_CLAUSE_OACC_TILE, "tile", clause_loc); check_no_duplicate_clause (list, OMP_CLAUSE_COLLAPSE, "collapse", clause_loc); @@ -38308,9 +38308,9 @@ cp_parser_oacc_clause_tile (cp_parser *parser, location_t clause_loc, tree list) /* Consume the trailing ')'. */ cp_lexer_consume_token (parser->lexer); - c = build_omp_clause (clause_loc, OMP_CLAUSE_TILE); + c = build_omp_clause (clause_loc, OMP_CLAUSE_OACC_TILE); tile = nreverse (tile); - OMP_CLAUSE_TILE_LIST (c) = tile; + OMP_CLAUSE_OACC_TILE_LIST (c) = tile; OMP_CLAUSE_CHAIN (c) = list; return c; } @@ -38423,7 +38423,7 @@ cp_parser_omp_clause_collapse (cp_parser *parser, tree list, location_t location } check_no_duplicate_clause (list, OMP_CLAUSE_COLLAPSE, "collapse", location); - check_no_duplicate_clause (list, OMP_CLAUSE_TILE, "tile", location); + check_no_duplicate_clause (list, OMP_CLAUSE_OACC_TILE, "tile", location); c = build_omp_clause (loc, OMP_CLAUSE_COLLAPSE); OMP_CLAUSE_CHAIN (c) = list; OMP_CLAUSE_COLLAPSE_EXPR (c) = num; @@ -44602,10 +44602,10 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl)) if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE) collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (cl)); - else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_TILE) + else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_OACC_TILE) { tiling = true; - collapse = list_length (OMP_CLAUSE_TILE_LIST (cl)); + collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (cl)); } else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_ORDERED && OMP_CLAUSE_ORDERED_EXPR (cl)) diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 73ac1cb597c..d44d20767ca 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -18114,7 +18114,7 @@ tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort, = tsubst_expr (OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR (oc), args, complain, in_decl); /* FALLTHRU */ - case 
OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_IF: case OMP_CLAUSE_NUM_THREADS: case OMP_CLAUSE_SCHEDULE: diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 80ef1364e33..c934659c9f3 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -8836,8 +8836,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) mergeable_seen = true; break; - case OMP_CLAUSE_TILE: - for (tree list = OMP_CLAUSE_TILE_LIST (c); !remove && list; + case OMP_CLAUSE_OACC_TILE: + for (tree list = OMP_CLAUSE_OACC_TILE_LIST (c); !remove && list; list = TREE_CHAIN (list)) { t = TREE_VALUE (list); @@ -10558,9 +10558,9 @@ finish_omp_for (location_t locus, enum tree_code code, tree declv, { tree c; - c = omp_find_clause (clauses, OMP_CLAUSE_TILE); + c = omp_find_clause (clauses, OMP_CLAUSE_OACC_TILE); if (c) - collapse = list_length (OMP_CLAUSE_TILE_LIST (c)); + collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (c)); else { c = omp_find_clause (clauses, OMP_CLAUSE_COLLAPSE); diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index dc0c8013c3d..6b9c5e81a37 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -1084,7 +1084,7 @@ enum omp_mask2 OMP_CLAUSE_WAIT, OMP_CLAUSE_DELETE, OMP_CLAUSE_AUTO, - OMP_CLAUSE_TILE, + OMP_CLAUSE_OACC_TILE, OMP_CLAUSE_IF_PRESENT, OMP_CLAUSE_FINALIZE, OMP_CLAUSE_ATTACH, @@ -3610,7 +3610,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, c->threads = needs_space = true; continue; } - if ((mask & OMP_CLAUSE_TILE) + if ((mask & OMP_CLAUSE_OACC_TILE) && !c->tile_list && match_oacc_expr_list ("tile (", &c->tile_list, true) == MATCH_YES) @@ -3815,7 +3815,7 @@ error: (omp_mask (OMP_CLAUSE_COLLAPSE) | OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER \ | OMP_CLAUSE_VECTOR | OMP_CLAUSE_SEQ | OMP_CLAUSE_INDEPENDENT \ | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_REDUCTION | OMP_CLAUSE_AUTO \ - | OMP_CLAUSE_TILE) + | OMP_CLAUSE_OACC_TILE) #define OACC_PARALLEL_LOOP_CLAUSES \ (OACC_LOOP_CLAUSES | OACC_PARALLEL_CLAUSES) #define 
OACC_KERNELS_LOOP_CLAUSES \ diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index 2f116fd6738..06c5a123973 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -4576,8 +4576,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, for (el = clauses->tile_list; el; el = el->next) vec_safe_push (tvec, gfc_convert_expr_to_tree (block, el->expr)); - c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_TILE); - OMP_CLAUSE_TILE_LIST (c) = build_tree_list_vec (tvec); + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_OACC_TILE); + OMP_CLAUSE_OACC_TILE_LIST (c) = build_tree_list_vec (tvec); omp_clauses = gfc_trans_add_clause (c, omp_clauses); tvec->truncate (0); } diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 9f4722f7458..d24fd114f71 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -12117,7 +12117,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, case OMP_CLAUSE_ORDERED: case OMP_CLAUSE_UNTIED: case OMP_CLAUSE_COLLAPSE: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: case OMP_CLAUSE_INDEPENDENT: @@ -13279,7 +13279,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, case OMP_CLAUSE_VECTOR: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_IF_PRESENT: case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_INCLUSIVE: @@ -14172,9 +14172,9 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) c = omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_COLLAPSE); if (c) collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (c)); - c = omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_TILE); + c = omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_OACC_TILE); if (c) - tile = list_length (OMP_CLAUSE_TILE_LIST (c)); + tile = list_length (OMP_CLAUSE_OACC_TILE_LIST (c)); c = omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_ALLOCATE); 
hash_set *allocate_uids = NULL; if (c) diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc index 1e31014c454..906d36bdec1 100644 --- a/gcc/omp-general.cc +++ b/gcc/omp-general.cc @@ -271,12 +271,12 @@ omp_extract_for_data (gomp_for *for_stmt, struct omp_for_data *fd, collapse_count = &OMP_CLAUSE_COLLAPSE_COUNT (t); } break; - case OMP_CLAUSE_TILE: - fd->tiling = OMP_CLAUSE_TILE_LIST (t); + case OMP_CLAUSE_OACC_TILE: + fd->tiling = OMP_CLAUSE_OACC_TILE_LIST (t); fd->collapse = list_length (fd->tiling); gcc_assert (fd->collapse); - collapse_iter = &OMP_CLAUSE_TILE_ITERVAR (t); - collapse_count = &OMP_CLAUSE_TILE_COUNT (t); + collapse_iter = &OMP_CLAUSE_OACC_TILE_ITERVAR (t); + collapse_count = &OMP_CLAUSE_OACC_TILE_COUNT (t); break; case OMP_CLAUSE__REDUCTEMP_: fd->have_reductemp = true; diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index 91ef74f1f6a..39a18a2e02e 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -1747,7 +1747,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) case OMP_CLAUSE_INDEPENDENT: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE__SIMT_: case OMP_CLAUSE_DEFAULT: case OMP_CLAUSE_NONTEMPORAL: @@ -1966,7 +1966,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) case OMP_CLAUSE_INDEPENDENT: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE__SIMT_: case OMP_CLAUSE_IF_PRESENT: case OMP_CLAUSE_FINALIZE: @@ -8276,7 +8276,7 @@ lower_oacc_head_mark (location_t loc, tree ddvar, tree clauses, tag |= OLF_INDEPENDENT; break; - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: tag |= OLF_TILE; break; diff --git a/gcc/tree-core.h b/gcc/tree-core.h index 91551fde900..75bdc1eda4b 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -514,7 +514,7 @@ enum omp_clause_code { OMP_CLAUSE_VECTOR_LENGTH, /* OpenACC clause: tile ( size-expr-list ). */ - OMP_CLAUSE_TILE, + OMP_CLAUSE_OACC_TILE, /* OpenACC clause: if_present. 
*/ OMP_CLAUSE_IF_PRESENT, diff --git a/gcc/tree-nested.cc b/gcc/tree-nested.cc index 31c7b6001bd..987839577a2 100644 --- a/gcc/tree-nested.cc +++ b/gcc/tree-nested.cc @@ -1474,7 +1474,7 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_DEFAULT: case OMP_CLAUSE_COPYIN: case OMP_CLAUSE_COLLAPSE: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_UNTIED: case OMP_CLAUSE_MERGEABLE: case OMP_CLAUSE_PROC_BIND: @@ -2271,7 +2271,7 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_DEFAULT: case OMP_CLAUSE_COPYIN: case OMP_CLAUSE_COLLAPSE: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_UNTIED: case OMP_CLAUSE_MERGEABLE: case OMP_CLAUSE_PROC_BIND: diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 12c57c14dd4..dcd0c585c09 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -1431,9 +1431,9 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) case OMP_CLAUSE_INDEPENDENT: pp_string (pp, "independent"); break; - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: pp_string (pp, "tile("); - dump_generic_node (pp, OMP_CLAUSE_TILE_LIST (clause), + dump_generic_node (pp, OMP_CLAUSE_OACC_TILE_LIST (clause), spc, flags, false); pp_right_paren (pp); break; diff --git a/gcc/tree.cc b/gcc/tree.cc index 54ca5e750df..067b8edf2e7 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -322,7 +322,7 @@ unsigned const char omp_clause_num_ops[] = 1, /* OMP_CLAUSE_NUM_GANGS */ 1, /* OMP_CLAUSE_NUM_WORKERS */ 1, /* OMP_CLAUSE_VECTOR_LENGTH */ - 3, /* OMP_CLAUSE_TILE */ + 3, /* OMP_CLAUSE_OACC_TILE */ 0, /* OMP_CLAUSE_IF_PRESENT */ 0, /* OMP_CLAUSE_FINALIZE */ 0, /* OMP_CLAUSE_NOHOST */ diff --git a/gcc/tree.h b/gcc/tree.h index 005c157e9b0..fbf2a7e33e7 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -2009,12 +2009,12 @@ class auto_suppress_location_wrappers #define OMP_CLAUSE_ENTER_TO(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, 
OMP_CLAUSE_ENTER)->base.public_flag) -#define OMP_CLAUSE_TILE_LIST(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 0) -#define OMP_CLAUSE_TILE_ITERVAR(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 1) -#define OMP_CLAUSE_TILE_COUNT(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 2) +#define OMP_CLAUSE_OACC_TILE_LIST(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_OACC_TILE), 0) +#define OMP_CLAUSE_OACC_TILE_ITERVAR(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_OACC_TILE), 1) +#define OMP_CLAUSE_OACC_TILE_COUNT(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_OACC_TILE), 2) /* _CONDTEMP_ holding temporary with iteration count. */ #define OMP_CLAUSE__CONDTEMP__ITER(NODE) \

From patchwork Sun Oct 1 20:10:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sandra Loosemore
X-Patchwork-Id: 147153
From: Sandra Loosemore
Subject: [WIP 2/4] OpenMP: Language-independent parts of loop transform support.
Date: Sun, 1 Oct 2023 14:10:19 -0600
Message-ID: <20231001201021.785572-3-sandra@codesourcery.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231001201021.785572-1-sandra@codesourcery.com>
References: <20231001201021.785572-1-sandra@codesourcery.com>
List-Id: Gcc-patches mailing list

From: Frederik Harwath

This patch adds support for the OMP_LOOP_TRANS tree node, internal
OpenMP clauses representing loop transformations, and the
omp_transform_loops pass to lower them.

gcc/ChangeLog:

	* Makefile.in (OBJS): Add omp-transform-loops.o.
	* gimple-pretty-print.cc (dump_gimple_omp_for): Handle
	GF_OMP_FOR_KIND_TRANSFORM_LOOP.
	* gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_TRANSFORM_LOOP.
	* gimplify.cc (is_gimple_stmt): Add OMP_LOOP_TRANS.
	(gimplify_scan_omp_clauses): Handle loop transform clauses.
	(gimplify_adjust_omp_clauses): Likewise.
	(omp_for_drop_tile_clauses): New function.
	(gimplify_omp_for): Call omp_for_drop_tile_clauses. Handle
	OMP_LOOP_TRANS and loop transform clauses.
	(gimplify_omp_loop): Handle loop transform clauses.
	(gimplify_expr): Handle OMP_LOOP_TRANS.
	* omp-general.cc (omp_loop_transform_clause_p): New function.
	* omp-general.h (omp_loop_transform_clause_p): Declare.
	* omp-transform-loops.cc: New file.
* params.opt (omp-unroll-full-max-iterations): New. (omp-unroll-default-factor): New. * passes.def: Add pass_omp_transform_loops before pass_lower_omp. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE. * tree-nested.cc (convert_nonlocal_omp_clauses): Add support for loop transform clauses. (convert_local_omp_clauses): Likewise. * tree-pass.h (make_pass_omp_transform_loops): Declare. * tree-pretty-print.cc (dump_omp_clause): Add support for loop transform clauses. (dump_generic_node): Handle OMP_LOOP_TRANS. * tree.cc (omp_clause_num_ops): Add entries for loop transforms. (omp_clause_code_name): Likewise. * tree.def (OMP_LOOP_TRANS): New. * tree.h (OMP_CLAUSE_TRANSFORM_LEVEL): New. (OMP_CLAUSE_UNROLL_PARTIAL_EXPR): New. (OMP_CLAUSE_TILE_SIZES): New. * doc/invoke.texi (Optimize Options): Document the new parameters. Co-Authored-By: Sandra Loosemore --- gcc/Makefile.in | 1 + gcc/doc/invoke.texi | 9 + gcc/gimple-pretty-print.cc | 6 + gcc/gimple.h | 1 + gcc/gimplify.cc | 65 +- gcc/omp-general.cc | 14 + gcc/omp-general.h | 1 + gcc/omp-transform-loops.cc | 1815 ++++++++++++++++++++++++++++++++++++ gcc/params.opt | 8 + gcc/passes.def | 1 + gcc/tree-core.h | 12 + gcc/tree-nested.cc | 14 + gcc/tree-pass.h | 1 + gcc/tree-pretty-print.cc | 52 ++ gcc/tree.cc | 8 + gcc/tree.def | 6 + gcc/tree.h | 11 + 17 files changed, 2024 insertions(+), 1 deletion(-) create mode 100644 gcc/omp-transform-loops.cc diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 9cc16268abf..767823e223b 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1582,6 +1582,7 @@ OBJS = \ omp-expand.o \ omp-general.o \ omp-low.o \ + omp-transform-loops.o \ omp-oacc-kernels-decompose.o \ omp-oacc-neuter-broadcast.o \ omp-simd-clone.o \ diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 4085fc90907..e904dc4b3c4 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -16470,6 +16470,15 @@ With 
@option{--param=openacc-privatization=quiet}, don't diagnose. This is the current default. With @option{--param=openacc-privatization=noisy}, do diagnose. +@item omp-unroll-full-max-iterations +The maximum number of iterations of a loop for which an OpenMP @samp{omp unroll} +directive on the loop without a clause is turned into an +@samp{omp unroll full}. + +@item omp-unroll-default-factor +The unroll factor used for loops that have an OpenMP @samp{omp unroll partial} +directive without an explicit unroll factor. + @end table The following choices of @var{name} are available on AArch64 targets: diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc index 320df9197b4..1548feea092 100644 --- a/gcc/gimple-pretty-print.cc +++ b/gcc/gimple-pretty-print.cc @@ -1474,6 +1474,9 @@ dump_gimple_omp_for (pretty_printer *buffer, const gomp_for *gs, int spc, case GF_OMP_FOR_KIND_SIMD: kind = " simd"; break; + case GF_OMP_FOR_KIND_TRANSFORM_LOOP: + kind = " unroll"; + break; default: gcc_unreachable (); } @@ -1511,6 +1514,9 @@ dump_gimple_omp_for (pretty_printer *buffer, const gomp_for *gs, int spc, case GF_OMP_FOR_KIND_SIMD: pp_string (buffer, "#pragma omp simd"); break; + case GF_OMP_FOR_KIND_TRANSFORM_LOOP: + pp_string (buffer, "#pragma omp loop_transform"); + break; default: gcc_unreachable (); } diff --git a/gcc/gimple.h b/gcc/gimple.h index 2d0ac103636..b4d37a0f809 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -159,6 +159,7 @@ enum gf_mask { GF_OMP_FOR_KIND_TASKLOOP = 2, GF_OMP_FOR_KIND_OACC_LOOP = 4, GF_OMP_FOR_KIND_SIMD = 5, + GF_OMP_FOR_KIND_TRANSFORM_LOOP = 6, GF_OMP_FOR_COMBINED = 1 << 3, GF_OMP_FOR_COMBINED_INTO = 1 << 4, GF_OMP_TARGET_KIND_MASK = (1 << 5) - 1, diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index d24fd114f71..890063eca49 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -6069,6 +6069,7 @@ is_gimple_stmt (tree t) case OACC_CACHE: case OMP_PARALLEL: case OMP_FOR: + case OMP_LOOP_TRANS: case OMP_SIMD: case OMP_DISTRIBUTE: case 
OMP_LOOP: @@ -12300,6 +12301,11 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, } break; + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_TILE: + break; case OMP_CLAUSE_NOHOST: default: gcc_unreachable (); @@ -13284,6 +13290,10 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_INCLUSIVE: case OMP_CLAUSE_EXCLUSIVE: + case OMP_CLAUSE_TILE: + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + case OMP_CLAUSE_UNROLL_PARTIAL: break; case OMP_CLAUSE_NOHOST: @@ -13775,6 +13785,29 @@ find_standalone_omp_ordered (tree *tp, int *walk_subtrees, void *) return NULL_TREE; } +static void omp_for_drop_tile_clauses (tree for_stmt) +{ + /* Drop erroneous loop transformation clauses to avoid follow-up errors + in pass_omp_transform_loops. */ + tree last_c = NULL_TREE; + for (tree c = OMP_FOR_CLAUSES (for_stmt); c; + c = OMP_CLAUSE_CHAIN (c)) + { + + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_TILE) + continue; + + if (last_c) + TREE_CHAIN (last_c) = TREE_CHAIN (c); + else + OMP_FOR_CLAUSES (for_stmt) = TREE_CHAIN (c); + + error_at (OMP_CLAUSE_LOCATION (c), + "'tile' loop transformation may not appear on " + "non-rectangular for"); + } +} + /* Gimplify the gross structure of an OMP_FOR statement. */ static enum gimplify_status @@ -13966,6 +13999,8 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_FOR: if (OMP_FOR_NON_RECTANGULAR (inner_for_stmt ? inner_for_stmt : for_stmt)) { + omp_for_drop_tile_clauses (for_stmt); + if (omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_SCHEDULE)) error_at (EXPR_LOCATION (for_stmt), @@ -14010,6 +14045,10 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_SIMD: ort = ORT_SIMD; break; + case OMP_LOOP_TRANS: + if (OMP_FOR_NON_RECTANGULAR (inner_for_stmt ? 
inner_for_stmt : for_stmt)) + omp_for_drop_tile_clauses (for_stmt); + break; default: gcc_unreachable (); } @@ -14370,6 +14409,15 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) n->value &= ~GOVD_LASTPRIVATE_CONDITIONAL; } } + else if (TREE_CODE (orig_for_stmt) == OMP_LOOP_TRANS) + { + /* This loop is not going to be associated with any + directive after its transformation in + pass_omp_transform_loops. It will be lowered there + and the loop iteration variable will be used in the + context. */ + omp_notice_variable (gimplify_omp_ctxp, decl, true); + } else omp_add_variable (gimplify_omp_ctxp, decl, GOVD_PRIVATE | GOVD_SEEN); @@ -14412,7 +14460,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) c2 = NULL_TREE; } } - else + else if (TREE_CODE (orig_for_stmt) != OMP_LOOP_TRANS) omp_add_variable (gimplify_omp_ctxp, var, GOVD_PRIVATE | GOVD_SEEN); } @@ -14693,6 +14741,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break; case OMP_TASKLOOP: kind = GF_OMP_FOR_KIND_TASKLOOP; break; case OACC_LOOP: kind = GF_OMP_FOR_KIND_OACC_LOOP; break; + case OMP_LOOP_TRANS: kind = GF_OMP_FOR_KIND_TRANSFORM_LOOP; break; default: gcc_unreachable (); } @@ -14877,6 +14926,14 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) gtask_clauses_ptr = &OMP_CLAUSE_CHAIN (c); } break; + /* Move loop transformations to inner loop */ + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_TILE: + *gfor_clauses_ptr = c; + gfor_clauses_ptr = &OMP_CLAUSE_CHAIN (c); + break; default: gcc_unreachable (); } @@ -15317,6 +15374,11 @@ gimplify_omp_loop (tree *expr_p, gimple_seq *pre_p) } pc = &OMP_CLAUSE_CHAIN (*pc); break; + case OMP_CLAUSE_TILE: + case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + break; default: gcc_unreachable (); } @@ -17098,6 +17160,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, 
case OMP_FOR: case OMP_DISTRIBUTE: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: ret = gimplify_omp_for (expr_p, pre_p); break; diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc index 906d36bdec1..7e03c552b8f 100644 --- a/gcc/omp-general.cc +++ b/gcc/omp-general.cc @@ -2253,6 +2253,20 @@ omp_declare_variant_remove_hook (struct cgraph_node *node, void *) } } +/* Return true if C is a clause that represents an OpenMP loop transformation + directive, false otherwise. */ + +bool +omp_loop_transform_clause_p (tree c) +{ + if (c == NULL) + return false; + + enum omp_clause_code code = OMP_CLAUSE_CODE (c); + return (code == OMP_CLAUSE_UNROLL_FULL || code == OMP_CLAUSE_UNROLL_PARTIAL + || code == OMP_CLAUSE_UNROLL_NONE || code == OMP_CLAUSE_TILE); +} + /* Try to resolve declare variant, return the variant decl if it should be used instead of base, or base otherwise. */ diff --git a/gcc/omp-general.h b/gcc/omp-general.h index 1a52bfdb56b..71337a1e212 100644 --- a/gcc/omp-general.h +++ b/gcc/omp-general.h @@ -114,6 +114,7 @@ extern int omp_context_selector_matches (tree); extern int omp_context_selector_set_compare (const char *, tree, tree); extern tree omp_get_context_selector (tree, const char *, const char *); extern tree omp_resolve_declare_variant (tree); +extern bool omp_loop_transform_clause_p (tree); extern tree oacc_launch_pack (unsigned code, tree device, unsigned op); extern tree oacc_replace_fn_attrib_attr (tree attribs, tree dims); extern void oacc_replace_fn_attrib (tree fn, tree dims); diff --git a/gcc/omp-transform-loops.cc b/gcc/omp-transform-loops.cc new file mode 100644 index 00000000000..cdabbb60b3c --- /dev/null +++ b/gcc/omp-transform-loops.cc @@ -0,0 +1,1815 @@ +/* OMP loop transformation pass. Transforms loops according to + loop transformations directives such as "omp unroll". + + Copyright (C) 2023 Free Software Foundation, Inc. + +This file is part of GCC. 
+ +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "pretty-print.h" +#include "diagnostic-core.h" +#include "backend.h" +#include "target.h" +#include "tree.h" +#include "tree-inline.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "tree-pass.h" +#include "gimple-walk.h" +#include "gimple-pretty-print.h" +#include "gimplify.h" +#include "ssa.h" +#include "tree-into-ssa.h" +#include "fold-const.h" +#include "print-tree.h" +#include "omp-general.h" + +/* Context information for walk_omp_for_loops. */ +struct walk_ctx +{ + /* The most recently visited gomp_for that has been transformed and + for which gimple_omp_for_set_combined_into_p returned true. */ + gomp_for *inner_combined_loop; + + /* The innermost bind enclosing the currently visited node. */ + gbind *bind; +}; + +static unsigned int walk_omp_for_loops (gimple_seq *, walk_ctx *); +static enum tree_code omp_adjust_neq_condition (tree v, tree step); + +static bool +non_rectangular_p (const gomp_for *omp_for) +{ + size_t collapse = gimple_omp_for_collapse (omp_for); + for (size_t i = 0; i < collapse; i++) + { + if (TREE_CODE (gimple_omp_for_final (omp_for, i)) == TREE_VEC + || TREE_CODE (gimple_omp_for_initial (omp_for, i)) == TREE_VEC) + return true; + } + + return false; +} + +/* Callback for subst_var.
*/ + +static tree +subst_var_in_op (tree *t, int *subtrees ATTRIBUTE_UNUSED, void *data) +{ + + auto *wi = (struct walk_stmt_info *)data; + auto from_to = (std::pair<tree, tree> *)wi->info; + + if (*t == from_to->first) + { + *t = from_to->second; + wi->changed = true; + } + + return NULL_TREE; +} + +/* Substitute all occurrences of FROM in the operands of the GIMPLE statements + in SEQ by TO. */ + +static void +subst_var (gimple_seq *seq, tree from, tree to) +{ + gcc_assert (VAR_P (from)); + gcc_assert (VAR_P (to)); + + std::pair<tree, tree> from_to (from, to); + struct walk_stmt_info wi; + memset (&wi, 0, sizeof (wi)); + wi.info = (void *)&from_to; + + walk_gimple_seq_mod (seq, NULL, subst_var_in_op, &wi); +} + +/* Return the type that should be used for computing the iteration count of a + loop with the given index VAR and upper/lower bound FINAL according to + OpenMP 5.1. */ + +tree +gomp_for_iter_count_type (tree var, tree final) +{ + tree var_type = TREE_TYPE (var); + + if (POINTER_TYPE_P (var_type)) + return ptrdiff_type_node; + + tree operand_type = TREE_TYPE (final); + if (TYPE_UNSIGNED (var_type) && !TYPE_UNSIGNED (operand_type)) + return signed_type_for (operand_type); + + return var_type; +} + +extern tree +gimple_assign_rhs_to_tree (gimple *stmt); + +/* Substitute all definitions from SEQ bottom-up into EXPR. This is used to + reconstruct a tree from a gimplified expression for determining whether or not + the number of iterations of a loop is constant. */ + +tree +subst_defs (tree expr, gimple_seq seq) +{ + gimple_seq_node last = gimple_seq_last (seq); + gimple_seq_node first = gimple_seq_first (seq); + for (auto n = last; n != NULL; n = n != first ?
n->prev : NULL) + { + if (!is_gimple_assign (n)) + continue; + + tree lhs = gimple_assign_lhs (n); + tree rhs = gimple_assign_rhs_to_tree (n); + std::pair<tree, tree> from_to (lhs, rhs); + struct walk_stmt_info wi; + memset (&wi, 0, sizeof (wi)); + wi.info = (void *)&from_to; + walk_tree (&expr, subst_var_in_op, &wi, NULL); + expr = fold (expr); + } + + return expr; +} + +/* Return an expression for the number of iterations of the loop at + the given LEVEL of OMP_FOR. + + If the expression is a negative constant, this means that the loop + is infinite. This can only be recognized for loops with constant + initial, final, and step values. In general, according to the + OpenMP specification, the behaviour is unspecified if the number of + iterations does not fit the types used for their computation, and + hence in particular if the loop is infinite. */ + +tree +gomp_for_number_of_iterations (const gomp_for *omp_for, size_t level) +{ + gcc_assert (!non_rectangular_p (omp_for)); + tree init = gimple_omp_for_initial (omp_for, level); + tree final = gimple_omp_for_final (omp_for, level); + tree_code cond = gimple_omp_for_cond (omp_for, level); + tree index = gimple_omp_for_index (omp_for, level); + tree type = gomp_for_iter_count_type (index, final); + tree incr = gimple_omp_for_incr (omp_for, level); + tree step = omp_get_for_step_from_incr (gimple_location (omp_for), incr); + + init = subst_defs (init, gimple_omp_for_pre_body (omp_for)); + init = fold (init); + final = subst_defs (final, gimple_omp_for_pre_body (omp_for)); + final = fold (final); + + tree_code minus_code = MINUS_EXPR; + tree diff_type = type; + if (POINTER_TYPE_P (TREE_TYPE (final))) + { + minus_code = POINTER_DIFF_EXPR; + diff_type = ptrdiff_type_node; + } + + + /* Identify a simple case in which the loop does not iterate. The + computation below could not tell this apart from an infinite + loop, hence we handle this separately for better diagnostic + messages.
*/ + gcc_assert (cond == GT_EXPR || cond == LT_EXPR); + if (TREE_CONSTANT (init) && TREE_CONSTANT (final) + && ((cond == GT_EXPR && tree_int_cst_le (init, final)) + || (cond == LT_EXPR && tree_int_cst_le (final, init)))) + return build_int_cst (diff_type, 0); + + tree diff = fold_build2 (minus_code, diff_type, final, init); + + /* Divide diff by the step. + + We could always use CEIL_DIV_EXPR since only non-negative results + correspond to valid number of iterations and the behaviour is + unspecified by the spec otherwise. But we try to get the rounding + right for constant negative values to identify infinite loops + more precisely for better warnings. */ + tree_code div_expr = CEIL_DIV_EXPR; + if (TREE_CONSTANT (diff) && TREE_CONSTANT (step)) + { + bool diff_is_neg = tree_int_cst_lt (diff, size_zero_node); + bool step_is_neg = tree_int_cst_lt (step, size_zero_node); + if ((diff_is_neg && !step_is_neg) + || (!diff_is_neg && step_is_neg)) + div_expr = FLOOR_DIV_EXPR; + } + + diff = fold_build2 (div_expr, type, diff, step); + return diff; +} + +/* Return true if the expression representing the number of iterations + for OMP_FOR is a non-negative constant and set ITERATIONS to the + value of that expression. Otherwise, return false. Set INFINITE to + true if the number of iterations was recognized to be infinite. */ + +bool +gomp_for_constant_iterations_p (gomp_for *omp_for, + unsigned HOST_WIDE_INT *iterations, + bool *infinite = NULL) +{ + tree t = gomp_for_number_of_iterations (omp_for, 0); + if (!TREE_CONSTANT (t)) + return false; + + if (infinite && + tree_int_cst_lt (t, size_zero_node)) + *infinite = true; + else if (tree_fits_uhwi_p (t)) + { + *iterations = tree_to_uhwi (t); + return true; + } + + return false; +} + +static gimple_seq +expand_transformed_loop (gomp_for *omp_for); + +/* Split a gomp_for that represents a collapsed loop-nest into single + loops. The result is a gomp_for of the same kind which is not collapsed + (i.e. 
gimple_omp_for_collapse (OMP_FOR) == 1) and which contains nested, + non-collapsed gomp_for loops whose kind is GF_OMP_FOR_KIND_TRANSFORM_LOOP + (i.e. they will be lowered into plain, non-omp loops by this pass) for each + of the loops of OMP_FOR. All loops whose depth is strictly less than + FROM_DEPTH are left collapsed. */ + +static gomp_for* +gomp_for_uncollapse (gomp_for *omp_for, int from_depth = 0, bool expand = false) +{ + int collapse = gimple_omp_for_collapse (omp_for); + gcc_assert (from_depth < collapse); + gcc_assert (from_depth >= 0); + + if (collapse <= 1) + return omp_for; + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, omp_for, + "Uncollapsing loop:\n %G\n", + static_cast <gimple *> (omp_for)); + + gimple_seq body = gimple_omp_body (omp_for); + gomp_for *level_omp_for = omp_for; + for (int level = collapse - 1; level >= from_depth; level--) + { + level_omp_for = gimple_build_omp_for (body, + GF_OMP_FOR_KIND_TRANSFORM_LOOP, + NULL, 1, NULL); + gimple_omp_for_set_cond (level_omp_for, 0, + gimple_omp_for_cond (omp_for, level)); + gimple_omp_for_set_initial (level_omp_for, 0, + gimple_omp_for_initial (omp_for, level)); + gimple_omp_for_set_final (level_omp_for, 0, + gimple_omp_for_final (omp_for, level)); + gimple_omp_for_set_incr (level_omp_for, 0, + gimple_omp_for_incr (omp_for, level)); + gimple_omp_for_set_index (level_omp_for, 0, + gimple_omp_for_index (omp_for, level)); + + + if (expand) + body = expand_transformed_loop (level_omp_for); + else + body = level_omp_for; + } + + omp_for->collapse = from_depth; + + if (from_depth > 0) + { + gimple_omp_set_body (omp_for, body); + omp_for->collapse = from_depth; + return omp_for; + } + + gimple_omp_for_set_clauses (level_omp_for, gimple_omp_for_clauses (omp_for)); + gimple_omp_for_set_pre_body (level_omp_for, gimple_omp_for_pre_body (omp_for)); + gimple_omp_for_set_combined_into_p (level_omp_for, + gimple_omp_for_combined_into_p (omp_for)); + gimple_omp_for_set_combined_p
(level_omp_for, + gimple_omp_for_combined_p (omp_for)); + + return level_omp_for; +} + +static tree +build_loop_exit_cond (tree index, tree_code cond, tree final, gimple_seq *seq) +{ + tree exit_cond + = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, + fold_build2 (cond, boolean_type_node, index, final)); + tree res = create_tmp_var (boolean_type_node); + gimplify_assign (res, exit_cond, seq); + + return res; +} + +/* Returns a register that contains the final value of a loop as described by + FINAL. This is necessary for non-rectangular loops. */ + +static tree +build_loop_final (tree final, gimple_seq *seq) +{ + if (TREE_CODE (final) != TREE_VEC) /* rectangular loop-nest */ + return final; + + tree coeff = TREE_VEC_ELT (final, 0); + tree outer_var = TREE_VEC_ELT (final, 1); + tree constt = TREE_VEC_ELT (final, 2); + + tree type = TREE_TYPE (outer_var); + tree val = fold_build2 (MULT_EXPR, type, coeff, outer_var); + val = fold_build2 (PLUS_EXPR, type, val, constt); + + tree res = create_tmp_var (type); + gimplify_assign (res, val, seq); + + return res; +} + +/* Unroll the loop BODY UNROLL_FACTOR times, replacing the INDEX + variable by a local copy in each copy of the body that will be + incremented as specified by INCR. If BUILD_EXIT_CONDS is true, + insert a test of the loop exit condition given COND and FINAL + before each copy of the body that will exit the loop if the value + of the local index variable satisfies the loop exit condition. + + For example, the unrolling with BUILD_EXIT_CONDS == true turns + + for (i = 0; i < 3; i = i + 1) + { + BODY + } + + into + + for (i = 0; i < n; i = i + 1) + { + i.0 = i + if (!(i_0 < n)) + goto exit + BODY_COPY_1[i/i.0] i.e. 
index var i replaced by i.0 + if (!(i_1 < n)) + goto exit + i.1 = i.0 + 1 + BODY_COPY_2[i/i.1] + if (!(i_3 < n)) + goto exit + i.2 = i.2 + 1 + BODY_COPY_3[i/i.2] + exit: + } + */ +static gimple_seq +build_unroll_body (gimple_seq body, tree unroll_factor, tree index, tree incr, + bool build_exit_conds = false, tree final = NULL_TREE, + tree_code *cond = NULL) +{ + gcc_assert ((!build_exit_conds && !final && !cond) + || (build_exit_conds && final && cond)); + + gimple_seq new_body = NULL; + + push_gimplify_context (); + + if (build_exit_conds) + final = build_loop_final (final, &new_body); + + tree local_index = create_tmp_var (TREE_TYPE (index)); + subst_var (&body, index, local_index); + tree local_incr = unshare_expr (incr); + TREE_OPERAND (local_incr, 0) = local_index; + + tree exit_label = create_artificial_label (gimple_location (body)); + + unsigned HOST_WIDE_INT n = tree_to_uhwi (unroll_factor); + for (unsigned HOST_WIDE_INT i = 0; i < n; i++) + { + if (i == 0) + gimplify_assign (local_index, index, &new_body); + else + gimplify_assign (local_index, local_incr, &new_body); + + tree body_copy_label = create_artificial_label (gimple_location (body)); + + if (build_exit_conds) + { + tree exit_cond + = build_loop_exit_cond (local_index, *cond, final, &new_body); + gimple_seq_add_stmt ( + &new_body, + gimple_build_cond (EQ_EXPR, exit_cond, boolean_true_node, + exit_label, body_copy_label)); + } + + gimple_seq body_copy = copy_gimple_seq_and_replace_locals (body); + gimple_seq_add_stmt (&new_body, gimple_build_label (body_copy_label)); + gimple_seq_add_seq (&new_body, body_copy); + } + + + gbind *bind = gimple_build_bind (NULL, new_body, NULL); + pop_gimplify_context (bind); + + gimple_seq result = NULL; + gimple_seq_add_stmt (&result, bind); + gimple_seq_add_stmt (&result, gimple_build_label (exit_label)); + return result; +} + +static gimple_seq transform_gomp_for (gomp_for *, tree, walk_ctx *ctx); + +/* Execute the partial unrolling transformation for OMP_FOR 
with the given + UNROLL_FACTOR and return the resulting gimple bind. LOC is the location for + diagnostic messages. + + Example + -------- + -------- + + Original loop + ------------- + + #pragma omp for unroll_partial(3) + for (i = 0; i < 100; i = i + 1) + { + BODY + } + + gets, roughly, translated to + + { + #pragma omp for + for (i = 0; i < 100; i = i + 3) + { + i.0 = i + if i.0 >= 100: + goto exit_label + BODY_COPY_1[i/i.0] i.e. index var replaced + i.1 = i + 1 + if i.1 >= 100: + goto exit_label + BODY_COPY_2[i/i.1] + i.2 = i + 2 + if i.2 >= 100: + goto exit_label + BODY_COPY_3[i/i.2] + + exit_label: + } +*/ + +static gimple_seq +partial_unroll (gomp_for *omp_for, size_t level, tree unroll_factor, + location_t loc, tree transformation_clauses, walk_ctx *ctx) +{ + gcc_assert (unroll_factor); + gcc_assert ( + OMP_CLAUSE_CODE (transformation_clauses) == OMP_CLAUSE_UNROLL_PARTIAL + || OMP_CLAUSE_CODE (transformation_clauses) == OMP_CLAUSE_UNROLL_NONE); + + /* Partial unrolling reduces the loop nest depth of a canonical loop nest to 1 + hence outer directives cannot require a greater collapse.
*/ + gcc_assert (gimple_omp_for_collapse (omp_for) <= level + 1); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, + dump_user_location_t::from_location_t (loc), + "Partially unrolling loop:\n %G\n", + static_cast <gimple *> (omp_for)); + + gomp_for *unrolled_for = as_a <gomp_for *> (copy_gimple_seq_and_replace_locals (omp_for)); + + tree final = gimple_omp_for_final (unrolled_for, level); + tree incr = gimple_omp_for_incr (unrolled_for, level); + tree index = gimple_omp_for_index (unrolled_for, level); + gimple_seq body = gimple_omp_body (unrolled_for); + + tree_code cond = gimple_omp_for_cond (unrolled_for, level); + tree step = TREE_OPERAND (incr, 1); + gimple_omp_set_body (unrolled_for, + build_unroll_body (body, unroll_factor, index, incr, + true, final, &cond)); + + gbind *result_bind = gimple_build_bind (NULL, NULL, NULL); + + push_gimplify_context (); + + tree scaled_step + = fold_build2 (MULT_EXPR, TREE_TYPE (step), + fold_convert (TREE_TYPE (step), unroll_factor), step); + + /* For combined constructs, step will be gimplified on the outer + gomp_for.
*/ + if (!gimple_omp_for_combined_into_p (omp_for) + && !TREE_CONSTANT (scaled_step)) + { + tree var = create_tmp_var (TREE_TYPE (step), ".omp_unroll_step"); + gimplify_assign (var, scaled_step, + gimple_omp_for_pre_body_ptr (unrolled_for)); + scaled_step = var; + } + TREE_OPERAND (incr, 1) = scaled_step; + gimple_omp_for_set_incr (unrolled_for, level, incr); + + pop_gimplify_context (result_bind); + + if (gimple_omp_for_combined_into_p (omp_for)) + ctx->inner_combined_loop = unrolled_for; + + tree remaining_clauses = OMP_CLAUSE_CHAIN (transformation_clauses); + gimple_seq_add_stmt ( + gimple_bind_body_ptr (result_bind), + transform_gomp_for (unrolled_for, remaining_clauses, ctx)); + + return result_bind; +} + +static gimple_seq +full_unroll (gomp_for *omp_for, location_t loc, walk_ctx *ctx ATTRIBUTE_UNUSED) +{ + tree init = gimple_omp_for_initial (omp_for, 0); + unsigned HOST_WIDE_INT niter = 0; + bool infinite = false; + bool constant = gomp_for_constant_iterations_p (omp_for, &niter, &infinite); + + if (infinite) + { + warning_at (loc, 0, "Cannot apply full unrolling to infinite loop"); + return NULL; + } + if (!constant) + { + error_at (loc, "Cannot apply full unrolling to loop with " + "non-constant number of iterations"); + return omp_for; + } + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, + dump_user_location_t::from_location_t (loc), + "Fully unrolling loop with " + HOST_WIDE_INT_PRINT_UNSIGNED + " iterations :\n %G\n", niter, + static_cast <gimple *> (omp_for)); + + tree incr = gimple_omp_for_incr (omp_for, 0); + tree index = gimple_omp_for_index (omp_for, 0); + gimple_seq body = gimple_omp_body (omp_for); + + tree unroll_factor = build_int_cst (TREE_TYPE (init), niter); + + gimple_seq unrolled = NULL; + gimple_seq_add_seq (&unrolled, gimple_omp_for_pre_body (omp_for)); + gimplify_assign (index, init, &unrolled); + push_gimplify_context (); + gimple_seq_add_seq (&unrolled, + build_unroll_body (body, unroll_factor, index, incr)); + +
gbind *result_bind = gimple_build_bind (NULL, unrolled, NULL); + pop_gimplify_context (result_bind); + return result_bind; +} + +/* Decides if the OMP_FOR for which the user did not specify the type of + unrolling to apply in the 'unroll' directive represented by the TRANSFORM + clause should be fully unrolled. */ + +static bool +assign_unroll_full_clause_p (gomp_for *omp_for, tree transform) +{ + gcc_assert (OMP_CLAUSE_CODE (transform) == OMP_CLAUSE_UNROLL_NONE); + gcc_assert (OMP_CLAUSE_CHAIN (transform) == NULL); + + /* Full unrolling turns the loop into a non-loop and hence + the following transformations would fail. */ + if (TREE_CHAIN (transform) != NULL_TREE) + return false; + + unsigned HOST_WIDE_INT num_iters; + if (!gomp_for_constant_iterations_p (omp_for, &num_iters) + || num_iters + > (unsigned HOST_WIDE_INT)param_omp_unroll_full_max_iterations) + return false; + + if (dump_enabled_p ()) + { + auto loc = dump_user_location_t::from_location_t ( + OMP_CLAUSE_LOCATION (transform)); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc, + "assigned % clause to % with small " + "constant number of iterations\n"); + } + + return true; +} + +/* If the OMP_FOR for which the user did not specify the type of unrolling in + the 'unroll' directive in the TRANSFORM clause should be partially unrolled, + return the unroll factor, otherwise return null. 
*/ + +static tree +assign_unroll_partial_clause_p (gomp_for *omp_for ATTRIBUTE_UNUSED, + tree transform) +{ + gcc_assert (OMP_CLAUSE_CODE (transform) == OMP_CLAUSE_UNROLL_NONE); + + if (param_omp_unroll_default_factor == 0) + return NULL; + + tree unroll_factor + = build_int_cst (integer_type_node, param_omp_unroll_default_factor); + + if (dump_enabled_p ()) + { + auto loc = dump_user_location_t::from_location_t ( + OMP_CLAUSE_LOCATION (transform)); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc, + "added % clause to % directive\n", + param_omp_unroll_default_factor); + } + + return unroll_factor; +} + +/* Generate the code for an OMP_FOR that represents the result of a + loop transformation which is not associated with any directive and + which will hence not be lowered in the omp-expansion. */ + +static gimple_seq +expand_transformed_loop (gomp_for *omp_for) +{ + gcc_assert (gimple_omp_for_kind (omp_for) + == GF_OMP_FOR_KIND_TRANSFORM_LOOP); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, omp_for, + "Expanding loop:\n %G\n", + static_cast <gimple *> (omp_for)); + + push_gimplify_context (); + + omp_for = gomp_for_uncollapse (omp_for); + + tree incr = gimple_omp_for_incr (omp_for, 0); + tree index = gimple_omp_for_index (omp_for, 0); + tree init = gimple_omp_for_initial (omp_for, 0); + tree final = gimple_omp_for_final (omp_for, 0); + tree_code cond = gimple_omp_for_cond (omp_for, 0); + gimple_seq body = gimple_omp_body (omp_for); + gimple_seq pre_body = gimple_omp_for_pre_body (omp_for); + + gimple_seq loop = NULL; + + tree exit_label = create_artificial_label (UNKNOWN_LOCATION); + tree cycle_label = create_artificial_label (UNKNOWN_LOCATION); + tree body_label = create_artificial_label (UNKNOWN_LOCATION); + + gimple_seq_add_seq (&loop, pre_body); + gimplify_assign (index, init, &loop); + tree final_var = final; + if (TREE_CODE (final) != VAR_DECL) + { + final_var = create_tmp_var (TREE_TYPE (final)); + gimplify_assign (final_var, final,
&loop); + } + + gimple_seq_add_stmt (&loop, gimple_build_label (cycle_label)); + gimple_seq_add_stmt (&loop, gimple_build_cond (cond, index, final_var, + body_label, exit_label)); + gimple_seq_add_stmt (&loop, gimple_build_label (body_label)); + gimple_seq_add_seq (&loop, body); + gimplify_assign (index, incr, &loop); + gimple_seq_add_stmt (&loop, gimple_build_goto (cycle_label)); + gimple_seq_add_stmt (&loop, gimple_build_label (exit_label)); + + gbind *bind = gimple_build_bind (NULL, loop, NULL); + pop_gimplify_context (bind); + + return bind; +} + +static enum tree_code +omp_adjust_neq_condition (tree v, tree step) +{ + gcc_assert (TREE_CODE (step) == INTEGER_CST); + if (TREE_CODE (TREE_TYPE (v)) == INTEGER_TYPE) + { + if (integer_onep (step)) + return LT_EXPR; + else + { + gcc_assert (integer_minus_onep (step)); + return GT_EXPR; + } + } + else + { + tree unit = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (v))); + gcc_assert (TREE_CODE (unit) == INTEGER_CST); + if (tree_int_cst_equal (unit, step)) + return LT_EXPR; + else + { + gcc_assert (wi::neg (wi::to_widest (unit)) + == wi::to_widest (step)); + return GT_EXPR; + } + } +} + +/* Adjust *COND_CODE and *N2 so that the former is either LT_EXPR or GT_EXPR, + given that V is the loop index variable and STEP is the loop step. + + This function has been derived from omp_adjust_for_condition. + In contrast to the original function it does not add 1 or + -1 to the final value when converting <=,>= to <,> + for a pointer-type index variable. Instead, this function + adds or subtracts the type size in bytes. This is necessary + to determine the number of iterations correctly.
*/ + +void +omp_adjust_for_condition2 (location_t loc, enum tree_code *cond_code, tree *n2, + tree v, tree step) +{ + switch (*cond_code) + { + case LT_EXPR: + case GT_EXPR: + break; + + case NE_EXPR: + *cond_code = omp_adjust_neq_condition (v, step); + break; + + case LE_EXPR: + if (POINTER_TYPE_P (TREE_TYPE (*n2))) + { + tree unit = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (v))); + HOST_WIDE_INT type_unit = tree_to_shwi (unit); + + *n2 = fold_build_pointer_plus_hwi_loc (loc, *n2, type_unit); + } + else + *n2 = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (*n2), *n2, + build_int_cst (TREE_TYPE (*n2), 1)); + *cond_code = LT_EXPR; + break; + case GE_EXPR: + if (POINTER_TYPE_P (TREE_TYPE (*n2))) + { + tree unit = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (v))); + HOST_WIDE_INT type_unit = tree_to_shwi (unit); + *n2 = fold_build_pointer_plus_hwi_loc (loc, *n2, -1 * type_unit); + } + else + *n2 = fold_build2_loc (loc, MINUS_EXPR, TREE_TYPE (*n2), *n2, + build_int_cst (TREE_TYPE (*n2), 1)); + *cond_code = GT_EXPR; + break; + default: + gcc_unreachable (); + } +} + +/* Transform the condition of OMP_FOR to either LT_EXPR or GT_EXPR and adjust + the final value as necessary. */ + +static bool +canonicalize_conditions (gomp_for *omp_for) +{ + size_t collapse = gimple_omp_for_collapse (omp_for); + location_t loc = gimple_location (omp_for); + bool new_decls = false; + + gimple_seq *pre_body = gimple_omp_for_pre_body_ptr (omp_for); + for (size_t l = 0; l < collapse; l++) + { + enum tree_code cond = gimple_omp_for_cond (omp_for, l); + + if (cond == LT_EXPR || cond == GT_EXPR) + continue; + + tree incr = gimple_omp_for_incr (omp_for, l); + tree step = omp_get_for_step_from_incr (loc, incr); + tree index = gimple_omp_for_index (omp_for, l); + tree final = gimple_omp_for_final (omp_for, l); + tree orig_final = final; + /* If final refers to the index variable of an outer level, i.e. + the loop nest is non-rectangular, only convert NE_EXPR. This + is necessary for unrolling. 
Unrolling needs to multiply the + step by the unrolling factor, but non-constant step values + are impossible with NE_EXPR. */ + if (TREE_CODE (final) == TREE_VEC) + { + cond = omp_adjust_neq_condition (TREE_VEC_ELT (final, 1), + TREE_OPERAND (incr, 1)); + gimple_omp_for_set_cond (omp_for, l, cond); + continue; + } + + omp_adjust_for_condition2 (loc, &cond, &final, index, step); + + gimple_omp_for_set_cond (omp_for, l, cond); + if (final == orig_final) + continue; + + /* If this is a combined construct, gimplify the final on the + outer construct. */ + if (TREE_CODE (final) != INTEGER_CST + && !gimple_omp_for_combined_into_p (omp_for)) + { + tree new_final = create_tmp_var (TREE_TYPE (final)); + gimplify_assign (new_final, final, pre_body); + final = new_final; + new_decls = true; + } + + gimple_omp_for_set_final (omp_for, l, final); + } + + return new_decls; +} + +/* Execute the tiling transformation for OMP_FOR with the given TILE_SIZES and + return the resulting gimple bind. TILE_SIZES must be a non-empty tree chain + of integer constants and the collapse of OMP_FOR must be at least the length + of TILE_SIZES. TRANSFORMATION_CLAUSES are the loop transformations that + must be applied to OMP_FOR. Those are applied on the result of the tiling + transformation. LOC is the location for diagnostic messages. + + Example 1 + --------- + --------- + + Original loop + ------------- + + #pragma omp for + #pragma omp tile sizes(3) + for (i = 1; i <= n; i = i + 1) + { + body; + } + + Internally, the tile directive is represented as a clause on the + omp for, i.e. as #pragma omp for tile_sizes(3). 
+ + Transformed loop + ---------------- + + #pragma omp for + for (.omp_tile_index = 1; .omp_tile_index < ceil(n/3); .omp_tile_index = .omp_tile_index + 3) + { + D.4287 = .omp_tile_index + 3 + 1 + #pragma omp loop_transform + for (i = .omp_tile_index; i < D.4287; i = i + 1) + { + if (i.0 > n) + goto L_0 + body; + } + L_0: + } + + The outer loop is the "floor loop" and the inner loop is the "tile + loop". The tile loop is never in canonical loop nest form and + hence it cannot be associated with any loop construct. The + GCC-internal "omp loop transform" construct will be lowered after + the tiling transformation. + */ + +static gimple_seq +tile (gomp_for *omp_for, location_t loc, size_t start_level, tree tile_sizes, + tree transformation_clauses, walk_ctx *ctx) +{ + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, + dump_user_location_t::from_location_t (loc), + "Executing tile transformation %T:\n %G\n", + transformation_clauses, static_cast <gimple *> (omp_for)); + + gimple_seq tile_loops = copy_gimple_seq_and_replace_locals (omp_for); + gimple_seq floor_loops = copy_gimple_seq_and_replace_locals (omp_for); + + size_t collapse = gimple_omp_for_collapse (omp_for); + size_t tiling_depth = list_length (tile_sizes); + tree clauses = gimple_omp_for_clauses (omp_for); + size_t clause_collapse = 1; + tree collapse_clause = NULL; + + if (tree c = omp_find_clause (clauses, OMP_CLAUSE_ORDERED)) + { + error_at (OMP_CLAUSE_LOCATION (c), + "% invalid in conjunction with %"); + return omp_for; + } + + if (tree c = omp_find_clause (clauses, OMP_CLAUSE_COLLAPSE)) + { + tree expr = OMP_CLAUSE_COLLAPSE_EXPR (c); + clause_collapse = tree_to_uhwi (expr); + collapse_clause = c; + } + + /* The tiled loop nest is a canonical loop nest with nesting depth + tiling_depth. The tile loops below that level are not in + canonical loop nest form and hence cannot be associated with a + loop construct.
*/ + if (clause_collapse > tiling_depth + start_level) + { + error_at (OMP_CLAUSE_LOCATION (collapse_clause), + "collapse cannot extend below the floor loops " + "generated by the % construct"); + OMP_CLAUSE_COLLAPSE_EXPR (collapse_clause) + = build_int_cst (unsigned_type_node, start_level + tiling_depth); + return transform_gomp_for (omp_for, NULL, ctx); + } + + if (start_level + tiling_depth > collapse) + return transform_gomp_for (omp_for, NULL, ctx); + + gcc_assert (collapse >= clause_collapse); + + push_gimplify_context (); + + /* Create the index variables for iterating the tiles in the floor + loops which will be the loops at levels start_level + ... start_level + tiling_depth of the transformed loop nest. The + loops at level 0 ... start_level - 1 are left unchanged. */ + gimple_seq floor_loops_pre_body = NULL; + size_t tile_level = 0; + auto_vec<tree> sizes_vec; + for (tree el = tile_sizes; el; el = TREE_CHAIN (el), tile_level++) + { + size_t nest_level = start_level + tile_level; + tree index = gimple_omp_for_index (omp_for, nest_level); + tree init = gimple_omp_for_initial (omp_for, nest_level); + tree incr = gimple_omp_for_incr (omp_for, nest_level); + tree step = TREE_OPERAND (incr, 1); + + /* Initialize original index variables in the pre-body. The + loop lowering will not initialize them because of the changed + index variables. */ + gimplify_assign (index, init, &floor_loops_pre_body); + + tree tile_size = fold_convert (TREE_TYPE (step), TREE_VALUE (el)); + sizes_vec.safe_push (tile_size); + tree tile_index = create_tmp_var (TREE_TYPE (index), ".omp_tile_index"); + gimplify_assign (tile_index, init, &floor_loops_pre_body); + + /* Floor loops */ + step = fold_build2 (MULT_EXPR, TREE_TYPE (step), step, tile_size); + tree tile_step = step; + /* For combined constructs, step will be gimplified on the outer + gomp_for.
*/ + if (!gimple_omp_for_combined_into_p (omp_for) && !TREE_CONSTANT (step)) + { + tile_step = create_tmp_var (TREE_TYPE (step), ".omp_tile_step"); + gimplify_assign (tile_step, step, &floor_loops_pre_body); + } + incr = fold_build2 (TREE_CODE (incr), TREE_TYPE (incr), tile_index, + tile_step); + gimple_omp_for_set_incr (floor_loops, nest_level, incr); + gimple_omp_for_set_index (floor_loops, nest_level, tile_index); + } + + gbind *result_bind = gimple_build_bind (NULL, NULL, NULL); + pop_gimplify_context (result_bind); + gimple_seq_add_seq (gimple_omp_for_pre_body_ptr (floor_loops), + floor_loops_pre_body); + + /* The tiling loops will not form a perfect loop nest because the + loop for each tiling dimension needs to check if the current tile + is incomplete, and this check is intervening code. Since OpenMP + 5.1 does not allow the collapse of the loop-nest to extend beyond + the floor loops, this is not a problem. + + "Uncollapse" the tiling loop nest, i.e. split the loop nest into + nested separate gomp_for structures for each level. This allows the + incomplete tile checks to be added to the loop at each level. */ + + tile_loops = gomp_for_uncollapse (as_a (tile_loops)); + for (size_t i = 0; i < start_level; i++) + tile_loops = gimple_omp_body (tile_loops); + + gimple_omp_for_set_kind (as_a (tile_loops), + GF_OMP_FOR_KIND_TRANSFORM_LOOP); + gimple_omp_for_set_clauses (tile_loops, NULL_TREE); + gimple_omp_for_set_pre_body (tile_loops, NULL); + + /* Transform the loop bodies of the "uncollapsed" tiling loops and + add them to the body of the floor loops. At this point, the + loop nest consists of perfectly nested gimple_omp_for constructs, + each representing a single loop. 
*/ + gimple_seq floor_loops_body = NULL; + gimple *level_loop = tile_loops; + gimple_seq_add_stmt (&floor_loops_body, tile_loops); + gimple_seq *surrounding_seq = &floor_loops_body; + + push_gimplify_context (); + + tree break_label = create_artificial_label (UNKNOWN_LOCATION); + gimple_seq_add_stmt (surrounding_seq, gimple_build_label (break_label)); + for (size_t tile_level = 0; tile_level < tiling_depth; tile_level++) + { + gimple_seq level_preamble = NULL; + gimple_seq level_body = gimple_omp_body (level_loop); + auto gsi = gsi_start (level_body); + + int nest_level = start_level + tile_level; + tree original_index = gimple_omp_for_index (omp_for, nest_level); + tree original_final = gimple_omp_for_final (omp_for, nest_level); + + tree tile_index + = gimple_omp_for_index (floor_loops, nest_level); + tree tile_size = sizes_vec[tile_level]; + tree type = TREE_TYPE (tile_index); + tree plus_type = type; + + tree incr = gimple_omp_for_incr (omp_for, nest_level); + tree step = omp_get_for_step_from_incr (gimple_location (omp_for), incr); + + gimple_seq *pre_body = gimple_omp_for_pre_body_ptr (level_loop); + gcc_assert (gimple_omp_for_collapse (level_loop) == 1); + tree_code original_cond = gimple_omp_for_cond (omp_for, nest_level); + + gimple_omp_for_set_initial (level_loop, 0, tile_index); + + tree tile_final = create_tmp_var (type); + tree scaled_tile_size + = fold_build2 (MULT_EXPR, TREE_TYPE (tile_size), tile_size, step); + + tree_code plus_code = PLUS_EXPR; + if (POINTER_TYPE_P (TREE_TYPE (tile_index))) + { + plus_code = POINTER_PLUS_EXPR; + int unsignedp = TYPE_UNSIGNED (TREE_TYPE (scaled_tile_size)); + plus_type + = signed_or_unsigned_type_for (unsignedp, ptrdiff_type_node); + } + + scaled_tile_size = fold_convert (plus_type, scaled_tile_size); + gimplify_assign ( + tile_final, + fold_build2 (plus_code, type, tile_index, scaled_tile_size), + pre_body); + gimple_omp_for_set_final (level_loop, 0, tile_final); + + push_gimplify_context (); + + tree body_label = 
create_artificial_label (UNKNOWN_LOCATION); + + /* Handle partial tiles, i.e. add a check that breaks from the tile loop + if the new index value does not belong to the iteration space of the + original loop. */ + gimple_seq_add_stmt (&level_preamble, + gimple_build_cond (original_cond, original_index, + original_final, body_label, + break_label)); + gimple_seq_add_stmt (&level_preamble, gimple_build_label (body_label)); + + gsi_insert_seq_before (&gsi, level_preamble, GSI_SAME_STMT); + gbind *level_bind = gimple_build_bind (NULL, NULL, NULL); + pop_gimplify_context (level_bind); + gimple_bind_set_body (level_bind, level_body); + gimple_omp_set_body (level_loop, level_bind); + + surrounding_seq = &level_body; + level_loop = gsi_stmt (gsi); + + /* The label for jumping out of the loop at the next + nesting level. For the outermost level, the label is put + after the loop-nest, for the last one it is not necessary. */ + if (tile_level != tiling_depth - 1) + { + break_label = create_artificial_label (UNKNOWN_LOCATION); + gsi_insert_after (&gsi, gimple_build_label (break_label), + GSI_NEW_STMT); + } + } + + gbind *tile_loops_bind; + tile_loops_bind = gimple_build_bind (NULL, tile_loops, NULL); + pop_gimplify_context (tile_loops_bind); + + gimple_omp_set_body (floor_loops, tile_loops_bind); + + tree remaining_clauses = OMP_CLAUSE_CHAIN (transformation_clauses); + + /* Collapsing of the OMP_FOR is used both for the "omp tile" + implementation and for the actual "collapse" clause. If the + tiling depth was greater than the collapse depth required by the + clauses on OMP_FOR, the collapse of OMP_FOR must be adjusted to + the latter value and all loops below the new collapse depth must + be transformed to GF_OMP_FOR_KIND_TRANSFORM_LOOP to ensure their + lowering in this pass. 
*/ + size_t new_collapse = clause_collapse; + + /* Keep the omp_for collapsed if there are further transformations */ + if (remaining_clauses) + { + size_t next_transform_depth = 1; + if (OMP_CLAUSE_CODE (remaining_clauses) == OMP_CLAUSE_TILE) + next_transform_depth + = list_length (OMP_CLAUSE_TILE_SIZES (remaining_clauses)); + + size_t next_level + = tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (remaining_clauses)); + /* The current "omp tile" transformation reduces the nesting depth + of the canonical loop-nest to TILING_DEPTH. + Hence the following "omp tile" transformation is invalid if + it requires a greater nesting depth. */ + gcc_assert (next_level + next_transform_depth <= + start_level + tiling_depth); + if (next_level + next_transform_depth > new_collapse) + new_collapse = next_level + next_transform_depth; + } + + if (collapse > new_collapse) + floor_loops = gomp_for_uncollapse (as_a (floor_loops), + new_collapse, true); + + /* Lower the uncollapsed tile loops. */ + walk_omp_for_loops (gimple_bind_body_ptr (tile_loops_bind), ctx); + + gcc_assert (remaining_clauses || !collapse_clause + || gimple_omp_for_collapse (floor_loops) + == (size_t)clause_collapse); + + if (gimple_omp_for_combined_into_p (omp_for)) + ctx->inner_combined_loop = as_a (floor_loops); + + /* Apply remaining transformation clauses and assemble the transformation + result. */ + gimple_bind_set_body (result_bind, + transform_gomp_for (as_a (floor_loops), + remaining_clauses, ctx)); + + return result_bind; +} + +/* Combined distribute or taskloop constructs are represented by two + or more nested gomp_for constructs which are created during + gimplification. Loop transformations on the combined construct are + executed on the innermost gomp_for. This function adjusts the loop + header of an outer OMP_FOR loop to the changes made by the + transformations on the inner loop which is provided by the CTX. 
*/ +static gimple_seq +adjust_combined_loop (gomp_for *omp_for, walk_ctx *ctx) +{ + gcc_assert (gimple_omp_for_combined_p (omp_for)); + gcc_assert (ctx->inner_combined_loop); + + gomp_for *inner_omp_for = ctx->inner_combined_loop; + size_t collapse = gimple_omp_for_collapse (inner_omp_for); + + int kind = gimple_omp_for_kind (omp_for); + if (kind == GF_OMP_FOR_KIND_DISTRIBUTE || kind == GF_OMP_FOR_KIND_TASKLOOP) + { + for (size_t level = 0; level < collapse; ++level) + { + tree outer_incr = gimple_omp_for_incr (omp_for, level); + tree inner_incr = gimple_omp_for_incr (inner_omp_for, level); + gcc_assert (TREE_TYPE (inner_incr) == TREE_TYPE (outer_incr)); + + tree inner_final = gimple_omp_for_final (inner_omp_for, level); + enum tree_code inner_cond + = gimple_omp_for_cond (inner_omp_for, level); + gimple_omp_for_set_cond (omp_for, level, inner_cond); + + tree inner_step = TREE_OPERAND (inner_incr, 1); + /* If this omp_for is the outermost loop belonging to a + combined construct, gimplify the step into its + prebody. Otherwise, just gimplify the step on the inner + gomp_for and move the ungimplified step expression + here. 
*/ + if (!gimple_omp_for_combined_into_p (omp_for) + && !TREE_CONSTANT (inner_step)) + { + push_gimplify_context (); + tree step = create_tmp_var (TREE_TYPE (inner_incr), + ".omp_combined_step"); + gimplify_assign (step, inner_step, + gimple_omp_for_pre_body_ptr (omp_for)); + pop_gimplify_context (ctx->bind); + TREE_OPERAND (outer_incr, 1) = step; + } + else + TREE_OPERAND (outer_incr, 1) = inner_step; + + if (!gimple_omp_for_combined_into_p (omp_for) + && !TREE_CONSTANT (inner_final)) + { + push_gimplify_context (); + tree final = create_tmp_var (TREE_TYPE (inner_final), + ".omp_combined_final"); + gimplify_assign (final, inner_final, + gimple_omp_for_pre_body_ptr (omp_for)); + pop_gimplify_context (ctx->bind); + gimple_omp_for_set_final (omp_for, level, final); + } + else + gimple_omp_for_set_final (omp_for, level, inner_final); + + /* Gimplify the step on the inner loop of the combined construct. */ + if (!TREE_CONSTANT (inner_step)) + { + push_gimplify_context (); + tree step = create_tmp_var (TREE_TYPE (inner_incr), + ".omp_combined_step"); + gimplify_assign (step, inner_step, + gimple_omp_for_pre_body_ptr (inner_omp_for)); + TREE_OPERAND (inner_incr, 1) = step; + pop_gimplify_context (ctx->bind); + + tree private_clause = build_omp_clause ( + gimple_location (omp_for), OMP_CLAUSE_PRIVATE); + OMP_CLAUSE_DECL (private_clause) = step; + tree *clauses = gimple_omp_for_clauses_ptr (inner_omp_for); + *clauses = chainon (*clauses, private_clause); + } + + /* Gimplify the final on the inner loop of the combined construct. 
*/ + if (!TREE_CONSTANT (inner_final)) + { + push_gimplify_context (); + tree final = create_tmp_var (TREE_TYPE (inner_incr), + ".omp_combined_final"); + gimplify_assign (final, inner_final, + gimple_omp_for_pre_body_ptr (inner_omp_for)); + gimple_omp_for_set_final (inner_omp_for, level, final); + pop_gimplify_context (ctx->bind); + + tree private_clause = build_omp_clause ( + gimple_location (omp_for), OMP_CLAUSE_PRIVATE); + OMP_CLAUSE_DECL (private_clause) = final; + tree *clauses = gimple_omp_for_clauses_ptr (inner_omp_for); + *clauses = chainon (*clauses, private_clause); + } + } + } + + if (gimple_omp_for_combined_into_p (omp_for)) + ctx->inner_combined_loop = omp_for; + else + ctx->inner_combined_loop = NULL; + + return omp_for; +} + +/* Transform OMP_FOR recursively according to the clause chain + TRANSFORMATION. Return the resulting sequence of gimple statements. + + This function dispatches OMP_FOR to the handler function for the + TRANSFORMATION clause. The handler function is responsible for invoking this + function recursively for executing the remaining transformations. 
*/ + +static gimple_seq +transform_gomp_for (gomp_for *omp_for, tree transformation, walk_ctx *ctx) +{ + if (!transformation) + { + if (gimple_omp_for_kind (omp_for) == GF_OMP_FOR_KIND_TRANSFORM_LOOP) + return expand_transformed_loop (omp_for); + + return omp_for; + } + + push_gimplify_context (); + + bool added_decls = canonicalize_conditions (omp_for); + + gimple_seq result = NULL; + location_t loc = OMP_CLAUSE_LOCATION (transformation); + auto dump_loc = dump_user_location_t::from_location_t (loc); + size_t level = tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (transformation)); + switch (OMP_CLAUSE_CODE (transformation)) + { + case OMP_CLAUSE_UNROLL_FULL: + gcc_assert (TREE_CHAIN (transformation) == NULL); + gcc_assert (level == 0); + result = full_unroll (omp_for, loc, ctx); + break; + case OMP_CLAUSE_UNROLL_NONE: + gcc_assert (TREE_CHAIN (transformation) == NULL); + gcc_assert (level == 0); + if (assign_unroll_full_clause_p (omp_for, transformation)) + { + result = full_unroll (omp_for, loc, ctx); + } + else if (tree unroll_factor + = assign_unroll_partial_clause_p (omp_for, transformation)) + { + result = partial_unroll (omp_for, level, unroll_factor, loc, + transformation, ctx); + } + else { + if (dump_enabled_p ()) + { + /* TODO Try to inform the unrolling pass that the user + wants to unroll this loop. This could relax some + restrictions there, e.g. on the code size? */ + dump_printf_loc ( + MSG_MISSED_OPTIMIZATION, dump_loc, + "not unrolling loop with % directive. Add " + "clause to specify unrolling type or invoke the " + "compiler with --param=omp-unroll-default-factor=n for some " + "constant integer n"); + } + result = transform_gomp_for (omp_for, NULL, ctx); + } + + break; + case OMP_CLAUSE_UNROLL_PARTIAL: + { + tree unroll_factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (transformation); + if (!unroll_factor) + { + // TODO Use target architecture dependent constants? + unsigned factor = param_omp_unroll_default_factor > 0 + ? 
param_omp_unroll_default_factor + : 5; + unroll_factor = build_int_cst (integer_type_node, factor); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, dump_loc, + "% clause without unrolling " + "factor turned into % clause\n", + factor); + } + + result = partial_unroll (omp_for, level, + unroll_factor, loc, transformation, ctx); + } + break; + case OMP_CLAUSE_TILE: + result = tile (omp_for, loc, level, + OMP_CLAUSE_TILE_SIZES (transformation), + transformation, ctx); + break; + default: + gcc_unreachable (); + } + + if (added_decls && gimple_code (result) != GIMPLE_BIND) + result = gimple_build_bind (NULL, result, NULL); + pop_gimplify_context (added_decls ? result : NULL); /* for decls from canonicalize_conditions */ + + return result; +} + +/* Remove all loop transformation clauses from the clauses of OMP_FOR and + return a new tree chain containing just those clauses. + + The clauses correspond to transformation *directives* associated with the + OMP_FOR's loop. The returned clauses are ordered from the innermost + directive to the outermost, i.e. in the order in which the transformations + should execute. + + Example: + -------- + -------- + + The loop + + #pragma omp for nowait + #pragma omp unroll partial(5) + #pragma omp tile sizes(2,2) + LOOP + + is represented as + + #pragma omp for nowait unroll_partial(5) tile_sizes(2,2) + LOOP + + Gimplification may add clauses after the transformation clauses added + by the front ends. This function will leave only the "nowait" clause on + OMP_FOR and return the clauses "tile_sizes(2,2) unroll_partial(5)". 
*/ + +static tree +gomp_for_remove_transformation_clauses (gomp_for *omp_for) +{ + tree *clauses = gimple_omp_for_clauses_ptr (omp_for); + tree trans_clauses = NULL; + tree last_other_clause = NULL; + + for (tree c = gimple_omp_for_clauses (omp_for); c != NULL_TREE;) + { + tree chain_tail = OMP_CLAUSE_CHAIN (c); + if (omp_loop_transform_clause_p (c)) + { + if (last_other_clause) + OMP_CLAUSE_CHAIN (last_other_clause) = chain_tail; + else + *clauses = OMP_CLAUSE_CHAIN (c); + + OMP_CLAUSE_CHAIN (c) = NULL; + trans_clauses = chainon (trans_clauses, c); + } + else + { + /* There should be no other clauses between loop transformations ... */ + gcc_assert (!trans_clauses || !last_other_clause + || TREE_CHAIN (last_other_clause) == c); + /* ... and hence stop if transformations were found before the + non-transformation clause C. */ + if (trans_clauses) + break; + last_other_clause = c; + } + + c = chain_tail; + } + + return nreverse (trans_clauses); +} + +static void +print_optimized_unroll_partial_msg (tree c) +{ + gcc_assert (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_UNROLL_PARTIAL); + location_t loc = OMP_CLAUSE_LOCATION (c); + dump_user_location_t dump_loc; + dump_loc = dump_user_location_t::from_location_t (loc); + + tree unroll_factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, dump_loc, + "replaced consecutive % directives by " + "%\n", tree_to_uhwi (unroll_factor)); +} + +/* Optimize CLAUSES by removing and merging redundant clauses. Return the + optimized clause chain. */ + +static tree +optimize_transformation_clauses (tree clauses) +{ + if (!clauses) + return NULL_TREE; + + /* The last unroll_partial clause seen in clauses, if any, + or the last merged unroll partial clause. */ + tree unroll_partial = NULL; + /* The last clause that was not an unroll_partial clause, if any. + unroll_full and unroll_none are not relevant because + they appear only at the end of a chain. 
*/ + tree last_non_unroll = NULL; + /* Indicates that at least two unroll_partial clauses have been merged + since last_non_unroll was seen. */ + bool merged_unroll_partial = false; + + size_t level = tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (clauses)); + for (tree c = clauses; c != NULL_TREE; c = OMP_CLAUSE_CHAIN (c)) + { + enum omp_clause_code code = OMP_CLAUSE_CODE (c); + + switch (code) + { + case OMP_CLAUSE_UNROLL_NONE: + /* 'unroll' without a clause cannot be followed by any + transformations because its result does not have canonical loop + nest form. */ + gcc_assert (OMP_CLAUSE_CHAIN (c) == NULL); + unroll_partial = NULL; + merged_unroll_partial = false; + break; + case OMP_CLAUSE_UNROLL_FULL: + /* 'unroll full' cannot be followed by any transformations because + its result does not have canonical loop nest form. */ + gcc_assert (OMP_CLAUSE_CHAIN (c) == NULL); + + /* Previous 'unroll partial' directives are useless. */ + if (unroll_partial) + { + if (last_non_unroll) + OMP_CLAUSE_CHAIN (last_non_unroll) = c; + else + clauses = c; + + if (dump_enabled_p ()) + { + location_t loc = OMP_CLAUSE_LOCATION (c); + dump_user_location_t dump_loc; + dump_loc = dump_user_location_t::from_location_t (loc); + + dump_printf_loc ( + MSG_OPTIMIZED_LOCATIONS, dump_loc, + "removed useless % directives " + "preceding 'omp unroll full'\n"); + } + } + unroll_partial = NULL; + merged_unroll_partial = false; + break; + case OMP_CLAUSE_UNROLL_PARTIAL: + { + /* Merge a sequence of consecutive 'unroll partial' directives. + Note that it is impossible for 'unroll full' or 'unroll' to + appear in between the 'unroll partial' clauses because they + remove the loop-nest. 
*/ + if (unroll_partial) + { + tree factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (unroll_partial); + tree c_factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c); + if (factor && c_factor) + factor = fold_build2 (MULT_EXPR, TREE_TYPE (factor), factor, + c_factor); + else if (!factor && c_factor) + factor = c_factor; + + gcc_assert (!factor || TREE_CODE (factor) == INTEGER_CST); + + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (unroll_partial) = factor; + OMP_CLAUSE_CHAIN (unroll_partial) = OMP_CLAUSE_CHAIN (c); + OMP_CLAUSE_LOCATION (unroll_partial) = OMP_CLAUSE_LOCATION (c); + merged_unroll_partial = true; + } + else + unroll_partial = c; + } + break; + case OMP_CLAUSE_TILE: + { + /* No optimization for those clauses yet, but they end any chain of + "unroll partial" clauses. */ + if (merged_unroll_partial && dump_enabled_p ()) + print_optimized_unroll_partial_msg (unroll_partial); + + if (unroll_partial) + OMP_CLAUSE_CHAIN (unroll_partial) = c; + + unroll_partial = NULL; + merged_unroll_partial = false; + last_non_unroll = c; + } + break; + default: + gcc_unreachable (); + } + + /* The transformations are ordered by the level of the loop-nest to which + they apply in decreasing order. Handle the different levels separately + as long as we do not implement optimizations across the levels. */ + tree next_c = OMP_CLAUSE_CHAIN (c); + if (!next_c) + break; + + size_t next_level = tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (next_c)); + if (next_level != level) + { + gcc_assert (next_level < level); + tree tail = optimize_transformation_clauses (next_c); + OMP_CLAUSE_CHAIN (c) = tail; + break; + } + else level = next_level; + + } + + if (merged_unroll_partial && dump_enabled_p ()) + print_optimized_unroll_partial_msg (unroll_partial); + + return clauses; +} + +/* Visit the current statement in GSI_P in the walk_omp_for_loops walk and + execute all loop transformations found on it. 
*/ + +void +process_omp_for (gomp_for *omp_for, gimple_seq *containing_seq, walk_ctx *ctx) +{ + auto gsi_p = gsi_for_stmt (omp_for, containing_seq); + tree transform_clauses = gomp_for_remove_transformation_clauses (omp_for); + + /* Do not attempt to transform broken code which might violate the + assumptions of the loop transformation implementations. + + Transformation clauses must be dropped first because following + passes do not handle them. */ + if (seen_error ()) + return; + + transform_clauses = optimize_transformation_clauses (transform_clauses); + + gimple *transformed = omp_for; + if (gimple_omp_for_combined_p (omp_for) + && ctx->inner_combined_loop) + transformed = adjust_combined_loop (omp_for, ctx); + else + transformed = transform_gomp_for (omp_for, transform_clauses, ctx); + + if (transformed == omp_for) + return; + + gsi_replace_with_seq (&gsi_p, transformed, true); + + if (!dump_enabled_p () || !(dump_flags & TDF_DETAILS)) + return; + + if (transformed) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, transformed, + "Transformed loop: %G\n\n", transformed); +} + +/* Traverse SEQ in depth-first order and apply the loop transformation + found on gomp_for statements. 
*/ + +static unsigned int +walk_omp_for_loops (gimple_seq *seq, walk_ctx *ctx) +{ + gimple_stmt_iterator gsi; + for (gsi = gsi_start (*seq); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + switch (gimple_code (stmt)) + { + case GIMPLE_OMP_CRITICAL: + case GIMPLE_OMP_MASTER: + case GIMPLE_OMP_MASKED: + case GIMPLE_OMP_TASKGROUP: + case GIMPLE_OMP_ORDERED: + case GIMPLE_OMP_SCAN: + case GIMPLE_OMP_SECTION: + case GIMPLE_OMP_PARALLEL: + case GIMPLE_OMP_TASK: + case GIMPLE_OMP_SCOPE: + case GIMPLE_OMP_SECTIONS: + case GIMPLE_OMP_SINGLE: + case GIMPLE_OMP_TARGET: + case GIMPLE_OMP_TEAMS: + case GIMPLE_OMP_STRUCTURED_BLOCK: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_omp_body_ptr (stmt), ctx); + ctx->bind = bind; + break; + } + case GIMPLE_OMP_FOR: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_omp_for_pre_body_ptr (stmt), ctx); + walk_omp_for_loops (gimple_omp_body_ptr (stmt), ctx); + ctx->bind = bind; + process_omp_for (as_a (stmt), seq, ctx); + break; + } + case GIMPLE_BIND: + { + gbind *bind = as_a (stmt); + ctx->bind = bind; + walk_omp_for_loops (gimple_bind_body_ptr (bind), ctx); + ctx->bind = bind; + break; + } + case GIMPLE_TRY: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_try_eval_ptr (as_a (stmt)), + ctx); + walk_omp_for_loops (gimple_try_cleanup_ptr (as_a (stmt)), + ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_CATCH: + { + gbind *bind = ctx->bind; + walk_omp_for_loops ( + gimple_catch_handler_ptr (as_a (stmt)), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_EH_FILTER: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_eh_filter_failure_ptr (stmt), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_EH_ELSE: + { + gbind *bind = ctx->bind; + geh_else *eh_else_stmt = as_a (stmt); + walk_omp_for_loops (gimple_eh_else_n_body_ptr (eh_else_stmt), ctx); + walk_omp_for_loops (gimple_eh_else_e_body_ptr (eh_else_stmt), ctx); + ctx->bind = bind; + break; + } + break; + + 
case GIMPLE_WITH_CLEANUP_EXPR: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_wce_cleanup_ptr (stmt), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_TRANSACTION: + { + gbind *bind = ctx->bind; + auto trans = as_a (stmt); + walk_omp_for_loops (gimple_transaction_body_ptr (trans), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_ASSUME: + break; + + default: + gcc_assert (!gimple_has_substatements (stmt)); + continue; + } + } + + return true; +} + +static unsigned int +execute_omp_transform_loops () +{ + gimple_seq body = gimple_body (current_function_decl); + walk_ctx ctx; + ctx.inner_combined_loop = NULL; + ctx.bind = NULL; + walk_omp_for_loops (&body, &ctx); + + return 0; +} + +namespace +{ + +const pass_data pass_data_omp_transform_loops = { + GIMPLE_PASS, /* type */ + "omp_transform_loops", /* name */ + OPTGROUP_OMP, /* optinfo_flags */ + TV_NONE, /* tv_id */ + PROP_gimple_any, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0, /* todo_flags_finish */ +}; + +class pass_omp_transform_loops : public gimple_opt_pass +{ +public: + pass_omp_transform_loops (gcc::context *ctxt) + : gimple_opt_pass (pass_data_omp_transform_loops, ctxt) + { + } + + /* opt_pass methods: */ + virtual unsigned int + execute (function *) + { + return execute_omp_transform_loops (); + } + virtual bool + gate (function *) + { + return flag_openmp || flag_openmp_simd; + } + +}; // class pass_omp_transform_loops + +} // anon namespace + +gimple_opt_pass * +make_pass_omp_transform_loops (gcc::context *ctxt) +{ + return new pass_omp_transform_loops (ctxt); +} diff --git a/gcc/params.opt b/gcc/params.opt index fffa8b1bc64..97365010c38 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -812,6 +812,14 @@ Enum(openacc_privatization) String(quiet) Value(OPENACC_PRIVATIZATION_QUIET) EnumValue Enum(openacc_privatization) String(noisy) Value(OPENACC_PRIVATIZATION_NOISY) 
+-param=omp-unroll-full-max-iterations= +Common Joined UInteger Var(param_omp_unroll_full_max_iterations) Init(5) Param Optimization +The maximum number of iterations of a loop for which an 'omp unroll' directive on the loop without a clause is turned into an 'omp unroll full'. + +-param=omp-unroll-default-factor= +Common Joined UInteger Var(param_omp_unroll_default_factor) Init(0) Param Optimization +The unroll factor used for loops that have an 'omp unroll partial' directive without an explicit unroll factor. + -param=parloops-chunk-size= Common Joined UInteger Var(param_parloops_chunk_size) Param Optimization Chunk size of omp schedule for loops parallelized by parloops. diff --git a/gcc/passes.def b/gcc/passes.def index 4110a472914..4c31ca7a7c2 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_diagnose_omp_blocks); NEXT_PASS (pass_diagnose_tm_blocks); NEXT_PASS (pass_omp_oacc_kernels_decompose); + NEXT_PASS (pass_omp_transform_loops); NEXT_PASS (pass_lower_omp); NEXT_PASS (pass_lower_cf); NEXT_PASS (pass_lower_tm); diff --git a/gcc/tree-core.h b/gcc/tree-core.h index 75bdc1eda4b..b46f9fc8bdb 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -524,6 +524,18 @@ enum omp_clause_code { /* OpenACC clause: nohost. */ OMP_CLAUSE_NOHOST, + + /* Internal representation for an "omp unroll full" directive. */ + OMP_CLAUSE_UNROLL_FULL, + + /* Internal representation for an "omp unroll" directive without a clause. */ + OMP_CLAUSE_UNROLL_NONE, + + /* Internal representation for an "omp unroll partial" directive. */ + OMP_CLAUSE_UNROLL_PARTIAL, + + /* Represents a "tile" directive internally. 
*/ + OMP_CLAUSE_TILE }; #undef DEFTREESTRUCT diff --git a/gcc/tree-nested.cc b/gcc/tree-nested.cc index 987839577a2..04d61e509ae 100644 --- a/gcc/tree-nested.cc +++ b/gcc/tree-nested.cc @@ -1493,6 +1493,13 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE__SCANTEMP_: break; + /* Clauses related to loop transforms. */ + case OMP_CLAUSE_TILE: + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_UNROLL_NONE: + break; + /* The following clause belongs to the OpenACC cache directive, which is discarded during gimplification. */ case OMP_CLAUSE__CACHE_: @@ -2290,6 +2297,13 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE__SCANTEMP_: break; + /* Clauses related to loop transforms. */ + case OMP_CLAUSE_TILE: + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_UNROLL_NONE: + break; + /* The following clause belongs to the OpenACC cache directive, which is discarded during gimplification. 
*/ case OMP_CLAUSE__CACHE_: diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index eba2d54ac76..46f00c7a4da 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -428,6 +428,7 @@ extern gimple_opt_pass *make_pass_lower_switch_O0 (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector_ssa (gcc::context *ctxt); extern gimple_opt_pass *make_pass_omp_oacc_kernels_decompose (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_omp_transform_loops (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_omp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_diagnose_omp_blocks (gcc::context *ctxt); extern gimple_opt_pass *make_pass_expand_omp (gcc::context *ctxt); diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index dcd0c585c09..41b6c480772 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -505,6 +505,54 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) case OMP_CLAUSE_EXCLUSIVE: name = "exclusive"; goto print_remap; + case OMP_CLAUSE_UNROLL_FULL: + pp_string (pp, "unroll_full"); + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } + break; + case OMP_CLAUSE_UNROLL_NONE: + pp_string (pp, "unroll_none"); + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } + break; + case OMP_CLAUSE_UNROLL_PARTIAL: + pp_string (pp, "unroll_partial"); + if (OMP_CLAUSE_UNROLL_PARTIAL_EXPR (clause)) + { + pp_left_paren (pp); + dump_generic_node (pp, OMP_CLAUSE_UNROLL_PARTIAL_EXPR (clause), spc, flags, + false); + pp_right_paren (pp); + } + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } + break; + case 
OMP_CLAUSE_TILE: + pp_string (pp, "tile sizes"); + pp_left_paren (pp); + gcc_assert (OMP_CLAUSE_TILE_SIZES (clause)); + dump_generic_node (pp, OMP_CLAUSE_TILE_SIZES (clause), spc, flags, + false); + pp_right_paren (pp); + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } + break; case OMP_CLAUSE__LOOPTEMP_: name = "_looptemp_"; goto print_remap; @@ -3633,6 +3681,10 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, pp_string (pp, "#pragma omp distribute"); goto dump_omp_loop; + case OMP_LOOP_TRANS: + pp_string (pp, "#pragma omp loop_transform"); + goto dump_omp_loop; + case OMP_TASKLOOP: pp_string (pp, "#pragma omp taskloop"); goto dump_omp_loop; diff --git a/gcc/tree.cc b/gcc/tree.cc index 067b8edf2e7..0771e56bfd8 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -326,6 +326,10 @@ unsigned const char omp_clause_num_ops[] = 0, /* OMP_CLAUSE_IF_PRESENT */ 0, /* OMP_CLAUSE_FINALIZE */ 0, /* OMP_CLAUSE_NOHOST */ + 1, /* OMP_CLAUSE_UNROLL_FULL */ + 1, /* OMP_CLAUSE_UNROLL_NONE */ + 2, /* OMP_CLAUSE_UNROLL_PARTIAL */ + 2, /* OMP_CLAUSE_TILE */ }; const char * const omp_clause_code_name[] = @@ -417,6 +421,10 @@ const char * const omp_clause_code_name[] = "if_present", "finalize", "nohost", + "unroll_full", + "unroll_none", + "unroll_partial", + "tile", }; /* Unless specific to OpenACC, we tend to internally maintain OpenMP-centric diff --git a/gcc/tree.def b/gcc/tree.def index 70699ade9da..86ce3bb5a80 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1214,6 +1214,12 @@ DEFTREECODE (OMP_TASK, "omp_task", tcc_statement, 2) DEFTREECODE (OMP_FOR, "omp_for", tcc_statement, 7) +/* OpenMP - A loop nest to which a loop transformation such as #pragma omp + unroll should be applied, but which is not associated with another directive + such as #pragma omp for. The kind of loop transformations to be applied are + internally represented by clauses. 
Operands like for OMP_FOR. */ +DEFTREECODE (OMP_LOOP_TRANS, "omp_loop_trans", tcc_statement, 7) + /* OpenMP - #pragma omp simd [clause1 ... clauseN] Operands like for OMP_FOR. */ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 7) diff --git a/gcc/tree.h b/gcc/tree.h index fbf2a7e33e7..39085a6cdb0 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1836,6 +1836,17 @@ class auto_suppress_location_wrappers #define OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_USE_DEVICE_PTR)->base.public_flag) +/* The level of a collapsed loop nest at which the transformation represented + by this clause should be applied. */ +#define OMP_CLAUSE_TRANSFORM_LEVEL(NODE) \ + OMP_CLAUSE_OPERAND (NODE, 0) + +#define OMP_CLAUSE_UNROLL_PARTIAL_EXPR(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_UNROLL_PARTIAL), 1) + +#define OMP_CLAUSE_TILE_SIZES(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 1) + #define OMP_CLAUSE_PROC_BIND_KIND(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PROC_BIND)->omp_clause.subcode.proc_bind_kind)
From patchwork Sun Oct 1 20:10:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sandra Loosemore X-Patchwork-Id: 147154
From: Sandra Loosemore Subject: [WIP 3/4] OpenMP: Fortran front-end support for loop transforms. Date: Sun, 1 Oct 2023 14:10:20 -0600 Message-ID: <20231001201021.785572-4-sandra@codesourcery.com> In-Reply-To: <20231001201021.785572-1-sandra@codesourcery.com> References: <20231001201021.785572-1-sandra@codesourcery.com>
From: Frederik Harwath gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_clauses): Print unroll clauses. (show_omp_node): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. (show_code_node): Likewise. * gfortran.h (enum gfc_statement): Add ST_OMP_UNROLL, ST_OMP_END_UNROLL, ST_OMP_TILE, and ST_OMP_END_TILE. (struct gfc_omp_clauses): Add fields for tile and unroll. (enum gfc_exec_op): Add EXEC_OMP_UNROLL and EXEC_OMP_TILE. (loop_transform_p): Declare. (gfc_expr_list_len): Declare. * match.h (gfc_match_omp_tile): Declare. (gfc_match_omp_unroll): Declare. * openmp.cc (gfc_free_omp_clauses): Free tile_sizes field. (match_tile_sizes): New. (enum omp_mask2): Add OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE. (gfc_match_omp_clauses): Handle OMP_CLAUSE_UNROLL_FULL and OMP_CLAUSE_UNROLL_PARTIAL syntax. (OMP_UNROLL_CLAUSES): Define. (OMP_TILE_CLAUSES): Define. (gfc_match_omp_tile): New.
(gfc_match_omp_unroll): New. (find_nested_loop_in_chain): Handle loop transforms. (find_next_loop_or_transform_in_chain): New. (find_next_loop_or_transform_in_block): New. (diagnose_intervening_code_errors_1): Handle loop transforms. (restructure_intervening_code): Handle loop transforms. (is_outer_iteration_variable): Adjust to avoid fencepost error. (check_nested_loop_in_chain): Handle loop transforms. (expr_uses_intervening_var): Add assertion. (is_intervening_var): Add assertion. (expr_is_invariant): Adjust to avoid fencepost error. (omp_unroll_removes_loop_nest): New. (resolve_nested_loop_transforms): New. (resolve_omp_unroll): New. (resolve_nested_loops): New, split from... (resolve_omp_do) ...here. (resolve_omp_tile): New. (omp_code_to_statement): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. (resolve_oacc_nested_loops): Adjust assertion. (gfc_resolve_omp_directive): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. * parse.cc (decode_omp_directive): Handle tile/unroll directives. (case_exec_markers): Handle ST_OMP_TILE and ST_OMP_UNROLL. (gfc_ascii_statement): Handle tile/unroll directives. (parse_omp_do): Handle ST_OMP_TILE and ST_OMP_UNROLL. (parse_executable): Handle ST_OMP_TILE and ST_OMP_UNROLL. * resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. (gfc_resolve_code): Likewise. * st.cc (gfc_free_statement): Handle ST_OMP_TILE and ST_OMP_UNROLL. * trans-openmp.cc (gfc_trans_omp_clauses): Handle tile/unroll directives. (loop_transform_p): New. (gfc_expr_list_len): New. (computer_transformed_depth): New. (gfc_trans_omp_do): Handle loop transformations. (gfc_trans_omp_directive): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. * trans.cc (trans_code): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/collapse1.f90: Adjust error messages. * gfortran.dg/gomp/loop-transforms/inner-loops.f90: New. * gfortran.dg/gomp/loop-transforms/tile-1.f90: New. * gfortran.dg/gomp/loop-transforms/tile-1a.f90: New.
* gfortran.dg/gomp/loop-transforms/tile-2.f90: New. * gfortran.dg/gomp/loop-transforms/tile-3.f90: New. * gfortran.dg/gomp/loop-transforms/tile-4.f90: New. * gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90: New. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90: New. * gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90: New. * gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90: New. * gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-10.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-11.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-12.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-2.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-3.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-4.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-5.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-6.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-7.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-8.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-9.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90: New. 
* gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: New. * gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90: New. * gfortran.dg/gomp/pure-1.f90: Move unroll/tile tests here from... * gfortran.dg/gomp/pure-2.f90: ...here. libgomp/ChangeLog: * testsuite/libgomp.fortran/imperfect-transform-1.f90: New. * testsuite/libgomp.fortran/imperfect-transform-2.f90: New. * testsuite/libgomp.fortran/loop-transforms/inner-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/nested-fn.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-2.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90: New. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-2.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-3.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-4.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-5.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-6.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-7.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-8.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90: New. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90: New. * testsuite/libgomp.fortran/target-imperfect-transform-1.f90: New. * testsuite/libgomp.fortran/target-imperfect-transform-2.f90: New. 
Co-Authored-By: Sandra Loosemore --- gcc/fortran/dump-parse-tree.cc | 28 + gcc/fortran/gfortran.h | 12 +- gcc/fortran/match.h | 2 + gcc/fortran/openmp.cc | 730 ++++++++++++++---- gcc/fortran/parse.cc | 48 ++ gcc/fortran/resolve.cc | 6 + gcc/fortran/st.cc | 2 + gcc/fortran/trans-openmp.cc | 182 ++++- gcc/fortran/trans.cc | 2 + gcc/testsuite/gfortran.dg/gomp/collapse1.f90 | 6 +- .../gomp/loop-transforms/inner-loops.f90 | 124 +++ .../gomp/loop-transforms/tile-1.f90 | 163 ++++ .../gomp/loop-transforms/tile-1a.f90 | 10 + .../gomp/loop-transforms/tile-2.f90 | 80 ++ .../gomp/loop-transforms/tile-3.f90 | 18 + .../gomp/loop-transforms/tile-4.f90 | 95 +++ .../loop-transforms/tile-imperfect-nest.f90 | 93 +++ .../loop-transforms/tile-inner-loops-1.f90 | 16 + .../loop-transforms/tile-inner-loops-2.f90 | 23 + .../loop-transforms/tile-inner-loops-3.f90 | 22 + .../loop-transforms/tile-inner-loops-3a.f90 | 31 + .../loop-transforms/tile-inner-loops-4.f90 | 30 + .../loop-transforms/tile-inner-loops-4a.f90 | 26 + .../loop-transforms/tile-inner-loops-5.f90 | 123 +++ .../tile-non-rectangular-1.f90 | 71 ++ .../tile-non-rectangular-2.f90 | 12 + .../gomp/loop-transforms/tile-unroll-1.f90 | 57 ++ .../gomp/loop-transforms/unroll-1.f90 | 277 +++++++ .../gomp/loop-transforms/unroll-10.f90 | 7 + .../gomp/loop-transforms/unroll-11.f90 | 75 ++ .../gomp/loop-transforms/unroll-12.f90 | 29 + .../gomp/loop-transforms/unroll-2.f90 | 22 + .../gomp/loop-transforms/unroll-3.f90 | 17 + .../gomp/loop-transforms/unroll-4.f90 | 18 + .../gomp/loop-transforms/unroll-5.f90 | 18 + .../gomp/loop-transforms/unroll-6.f90 | 19 + .../gomp/loop-transforms/unroll-7.f90 | 62 ++ .../gomp/loop-transforms/unroll-8.f90 | 22 + .../gomp/loop-transforms/unroll-9.f90 | 18 + .../loop-transforms/unroll-inner-loop.f90 | 57 ++ .../loop-transforms/unroll-no-clause-1.f90 | 20 + .../loop-transforms/unroll-no-clause-2.f90 | 21 + .../loop-transforms/unroll-no-clause-3.f90 | 23 + .../loop-transforms/unroll-non-rect-1.f90 | 31 + 
.../gomp/loop-transforms/unroll-simd-1.f90 | 244 ++++++ .../gomp/loop-transforms/unroll-simd-2.f90 | 57 ++ .../gomp/loop-transforms/unroll-tile-1.f90 | 37 + .../gomp/loop-transforms/unroll-tile-2.f90 | 41 + .../loop-transforms/unroll-tile-inner-1.f90 | 25 + gcc/testsuite/gfortran.dg/gomp/pure-1.f90 | 26 + gcc/testsuite/gfortran.dg/gomp/pure-2.f90 | 25 - .../libgomp.fortran/imperfect-transform-1.f90 | 70 ++ .../libgomp.fortran/imperfect-transform-2.f90 | 70 ++ .../loop-transforms/inner-1.f90 | 77 ++ .../loop-transforms/nested-fn.f90 | 19 + .../loop-transforms/tile-1.f90 | 71 ++ .../loop-transforms/tile-2.f90 | 117 +++ .../loop-transforms/tile-unroll-1.f90 | 112 +++ .../loop-transforms/tile-unroll-2.f90 | 71 ++ .../loop-transforms/tile-unroll-3.f90 | 77 ++ .../loop-transforms/tile-unroll-4.f90 | 75 ++ .../loop-transforms/unroll-1.f90 | 54 ++ .../loop-transforms/unroll-2.f90 | 88 +++ .../loop-transforms/unroll-3.f90 | 59 ++ .../loop-transforms/unroll-4.f90 | 72 ++ .../loop-transforms/unroll-5.f90 | 55 ++ .../loop-transforms/unroll-6.f90 | 105 +++ .../loop-transforms/unroll-7.f90 | 198 +++++ .../loop-transforms/unroll-7a.f90 | 7 + .../loop-transforms/unroll-7b.f90 | 7 + .../loop-transforms/unroll-7c.f90 | 7 + .../loop-transforms/unroll-8.f90 | 38 + .../loop-transforms/unroll-simd-1.f90 | 34 + .../loop-transforms/unroll-tile-1.f90 | 112 +++ .../loop-transforms/unroll-tile-2.f90 | 71 ++ .../target-imperfect-transform-1.f90 | 73 ++ .../target-imperfect-transform-2.f90 | 73 ++ 77 files changed, 4818 insertions(+), 197 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 create mode 100644 
gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 
create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/imperfect-transform-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/imperfect-transform-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/nested-fn.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 create mode 100644 
libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/target-imperfect-transform-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/target-imperfect-transform-2.f90 diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc index 68122e3e6fd..859f3f36609 100644 --- a/gcc/fortran/dump-parse-tree.cc +++ b/gcc/fortran/dump-parse-tree.cc @@ -2108,6 +2108,26 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses) } if (omp_clauses->assume) show_omp_assumes (omp_clauses->assume); + if (omp_clauses->unroll_full) + fputs (" FULL", dumpfile); + if (omp_clauses->unroll_partial) + { + fputs (" PARTIAL", dumpfile); + if (omp_clauses->unroll_partial_factor > 0) + fprintf (dumpfile, "(%u)", omp_clauses->unroll_partial_factor); + } + if (omp_clauses->tile_sizes) + { + gfc_expr_list *sizes; + fputs (" TILE SIZES(", dumpfile); + for (sizes = omp_clauses->tile_sizes; sizes; sizes = sizes->next) + { + show_expr (sizes->expr); + if (sizes->next) + fputs (", ", dumpfile); + } + fputc (')', dumpfile); + } } /* Show a single OpenMP or OpenACC directive node and everything underneath 
it @@ -2220,6 +2240,8 @@ show_omp_node (int level, gfc_code *c) name = "TEAMS DISTRIBUTE PARALLEL DO SIMD"; break; case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: name = "TEAMS DISTRIBUTE SIMD"; break; case EXEC_OMP_TEAMS_LOOP: name = "TEAMS LOOP"; break; + case EXEC_OMP_TILE: name = "TILE"; break; + case EXEC_OMP_UNROLL: name = "UNROLL"; break; case EXEC_OMP_WORKSHARE: name = "WORKSHARE"; break; default: gcc_unreachable (); @@ -2296,6 +2318,8 @@ show_omp_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: omp_clauses = c->ext.omp_clauses; break; @@ -2357,6 +2381,8 @@ show_omp_node (int level, gfc_code *c) d = d->block; } } + else if (c->op == EXEC_OMP_UNROLL || c->op == EXEC_OMP_TILE) + show_code (level + 1, c->block != NULL ? c->block->next : c->next); else show_code (level + 1, c->block->next); if (c->op == EXEC_OMP_ATOMIC) @@ -3537,6 +3563,8 @@ show_code_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: show_omp_node (level, c); break; diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 6caf7765ac6..bf81d1ce009 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -321,7 +321,9 @@ enum gfc_statement ST_OMP_ALLOCATE, ST_OMP_ALLOCATE_EXEC, ST_OMP_ALLOCATORS, ST_OMP_END_ALLOCATORS, /* Note: gfc_match_omp_nothing returns ST_NONE. */ - ST_OMP_NOTHING, ST_NONE + ST_OMP_NOTHING, ST_NONE, + ST_OMP_UNROLL, ST_OMP_END_UNROLL, + ST_OMP_TILE, ST_OMP_END_TILE }; /* Types of interfaces that we can have. 
Assignment interfaces are @@ -1564,6 +1566,7 @@ typedef struct gfc_omp_clauses struct gfc_expr *dist_chunk_size; struct gfc_expr *message; struct gfc_omp_assumptions *assume; + struct gfc_expr_list *tile_sizes; const char *critical_name; enum gfc_omp_default_sharing default_sharing; enum gfc_omp_atomic_op atomic_op; @@ -1577,6 +1580,8 @@ typedef struct gfc_omp_clauses unsigned grainsize_strict:1, num_tasks_strict:1, compare:1, weak:1; unsigned non_rectangular:1, order_concurrent:1; unsigned contains_teams_construct:1, target_first_st_is_teams:1; + unsigned unroll_full:1, unroll_none:1, unroll_partial:1; + unsigned unroll_partial_factor; ENUM_BITFIELD (gfc_omp_sched_kind) sched_kind:3; ENUM_BITFIELD (gfc_omp_device_type) device_type:2; ENUM_BITFIELD (gfc_omp_memorder) memorder:3; @@ -3011,6 +3016,7 @@ enum gfc_exec_op EXEC_OMP_TARGET_TEAMS_LOOP, EXEC_OMP_MASKED, EXEC_OMP_PARALLEL_MASKED, EXEC_OMP_PARALLEL_MASKED_TASKLOOP, EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD, EXEC_OMP_MASKED_TASKLOOP, EXEC_OMP_MASKED_TASKLOOP_SIMD, EXEC_OMP_SCOPE, + EXEC_OMP_UNROLL, EXEC_OMP_TILE, EXEC_OMP_ERROR, EXEC_OMP_ALLOCATE, EXEC_OMP_ALLOCATORS }; @@ -3927,6 +3933,10 @@ void gfc_generate_module_code (gfc_namespace *); /* trans-intrinsic.cc */ bool gfc_inline_intrinsic_function_p (gfc_expr *); +/* trans-openmp.cc */ +bool loop_transform_p (gfc_exec_op op); +int gfc_expr_list_len (gfc_expr_list *); + /* bbt.cc */ typedef int (*compare_fn) (void *, void *); void gfc_insert_bbt (void *, void *, compare_fn); diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h index 7d72725ed3c..d7156b9f308 100644 --- a/gcc/fortran/match.h +++ b/gcc/fortran/match.h @@ -228,6 +228,8 @@ match gfc_match_omp_teams_distribute_parallel_do_simd (void); match gfc_match_omp_teams_distribute_simd (void); match gfc_match_omp_teams_loop (void); match gfc_match_omp_threadprivate (void); +match gfc_match_omp_tile (void); +match gfc_match_omp_unroll (void); match gfc_match_omp_workshare (void); match 
gfc_match_omp_end_critical (void); match gfc_match_omp_end_nowait (void); diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index 6b9c5e81a37..6dbdd0d5685 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -193,6 +193,7 @@ gfc_free_omp_clauses (gfc_omp_clauses *c) i == OMP_LIST_USES_ALLOCATORS); gfc_free_expr_list (c->wait_list); gfc_free_expr_list (c->tile_list); + gfc_free_expr_list (c->tile_sizes); free (CONST_CAST (char *, c->critical_name)); if (c->assume) { @@ -989,6 +990,76 @@ cleanup: return MATCH_ERROR; } +static match +match_tile_sizes (gfc_expr_list **list) +{ + gfc_expr_list *head, *tail, *p; + locus old_loc; + gfc_expr *expr; + match m; + + head = tail = NULL; + + old_loc = gfc_current_locus; + + m = gfc_match_char ('('); + if (m != MATCH_YES) + goto syntax; + + for (;;) + { + m = gfc_match_expr (&expr); + if (m == MATCH_YES) + { + p = gfc_get_expr_list (); + if (head == NULL) + head = tail = p; + else + { + tail->next = p; + tail = tail->next; + } + int size = 0; + if (m == MATCH_YES) + { + if (gfc_extract_int (expr, &size, 1)) + goto cleanup; + else if (size < 1) + { + gfc_error_now ("tile size not constant " + "positive integer at %C"); + goto cleanup; + } + tail->expr = expr; + } + goto next_item; + } + if (m == MATCH_ERROR) + goto cleanup; + goto syntax; + + next_item: + if (gfc_match_char (')') == MATCH_YES) + break; + if (gfc_match_char (',') != MATCH_YES) + goto syntax; + } + + while (*list) + list = &(*list)->next; + + *list = head; + return MATCH_YES; + +syntax: + gfc_error ("Syntax error in 'tile sizes' list at %C"); + +cleanup: + gfc_free_expr_list (head); + gfc_current_locus = old_loc; + return MATCH_ERROR; +} + /* OpenMP clauses. */ enum omp_mask1 { @@ -1063,6 +1134,10 @@ enum omp_mask1 /* More OpenMP clauses and OpenACC 2.0+ specific clauses. */ enum omp_mask2 { + OMP_CLAUSE_UNROLL_FULL, /* OpenMP 5.1. */ + OMP_CLAUSE_UNROLL_NONE, /* OpenMP 5.1. */ + OMP_CLAUSE_UNROLL_PARTIAL, /* OpenMP 5.1. 
*/ + OMP_CLAUSE_TILE, /* OpenMP 5.1. */ OMP_CLAUSE_ASYNC, OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS, @@ -2667,6 +2742,15 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, && gfc_match_motion_var_list ("from (", &c->lists[OMP_LIST_FROM], &head) == MATCH_YES) continue; + if ((mask & OMP_CLAUSE_UNROLL_FULL) + && (m = gfc_match_dupl_check (!c->unroll_full, "full")) + != MATCH_NO) + { + if (m == MATCH_ERROR) + goto error; + c->unroll_full = needs_space = true; + continue; + } break; case 'g': if ((mask & OMP_CLAUSE_GANG) @@ -3326,6 +3410,32 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, } break; case 'p': + if (mask & OMP_CLAUSE_UNROLL_PARTIAL) + { + if ((m = gfc_match_dupl_check (!c->unroll_partial, "partial")) + != MATCH_NO) + { + int unroll_factor; + if (m == MATCH_ERROR) + goto error; + + c->unroll_partial = true; + + gfc_expr *cexpr = NULL; + m = gfc_match (" ( %e )", &cexpr); + if (m == MATCH_NO) + ; + else if (m == MATCH_YES + && !gfc_extract_int (cexpr, &unroll_factor, -1) + && unroll_factor > 0) + c->unroll_partial_factor = unroll_factor; + else + gfc_error_now ("PARTIAL clause argument not constant " + "positive integer at %C"); + gfc_free_expr (cexpr); + continue; + } + } if ((mask & OMP_CLAUSE_COPY) && gfc_match ("pcopy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], @@ -4446,6 +4556,10 @@ cleanup: (omp_mask (OMP_CLAUSE_AT) | OMP_CLAUSE_MESSAGE | OMP_CLAUSE_SEVERITY) #define OMP_WORKSHARE_CLAUSES \ omp_mask (OMP_CLAUSE_NOWAIT) +#define OMP_UNROLL_CLAUSES \ + (omp_mask (OMP_CLAUSE_UNROLL_FULL) | OMP_CLAUSE_UNROLL_PARTIAL) +#define OMP_TILE_CLAUSES \ + (omp_mask (OMP_CLAUSE_TILE)) #define OMP_ALLOCATORS_CLAUSES \ omp_mask (OMP_CLAUSE_ALLOCATE) @@ -6654,6 +6768,30 @@ gfc_match_omp_teams_distribute_simd (void) | OMP_SIMD_CLAUSES); } +match +gfc_match_omp_tile (void) +{ + gfc_omp_clauses *c = gfc_get_omp_clauses(); + new_st.op = EXEC_OMP_TILE; + new_st.ext.omp_clauses = c; + + return 
match_tile_sizes (&c->tile_sizes); +} + +match +gfc_match_omp_unroll (void) +{ + match m = match_omp (EXEC_OMP_UNROLL, OMP_UNROLL_CLAUSES); + + /* Add an internal clause as a marker to indicate that this "unroll" + directive had no clause. */ + if (new_st.ext.omp_clauses + && !new_st.ext.omp_clauses->unroll_full + && !new_st.ext.omp_clauses->unroll_partial) + new_st.ext.omp_clauses->unroll_none = true; + + return m; +} match gfc_match_omp_workshare (void) @@ -9602,6 +9740,11 @@ find_nested_loop_in_chain (gfc_code *chain) { if (code->op == EXEC_DO) return code; + else if (loop_transform_p (code->op) && code->block) + { + code = code->block; + continue; + } else if (code->op == EXEC_BLOCK) { gfc_code *c = find_nested_loop_in_block (code); @@ -9624,6 +9767,63 @@ find_nested_loop_in_block (gfc_code *block) return find_nested_loop_in_chain (ns->code); } +/* Forward declaration for mutually recursive functions. */ +static gfc_code * +find_next_loop_or_transform_in_block (gfc_code *block, gfc_code **imperfectp); + +/* Like find_nested_loop_in_chain, but also stop when a loop transform is + found and check for intervening code too. Return the first nested + DO loop or loop transform in CHAIN, and set *IMPERFECTP to the first + intervening code statement if one is found. */ +static gfc_code * +find_next_loop_or_transform_in_chain (gfc_code *chain, gfc_code **imperfectp) +{ + gfc_code *code; + gfc_code *result = NULL; + + if (!chain) + return NULL; + + for (code = chain; code; code = code->next) + { + /* DO WHILE and DO CONCURRENT are errors, but we need to catch them + here to ensure the right error is diagnosed elsewhere. */ + if (!result + && (code->op == EXEC_DO + || code->op == EXEC_DO_WHILE + || code->op == EXEC_DO_CONCURRENT + || loop_transform_p (code->op))) + result = code; + else if (!result && code->op == EXEC_BLOCK) + { + result = find_next_loop_or_transform_in_block (code, imperfectp); + /* If no loop in the block, the block itself is intervening code. 
*/ + if (!result && !*imperfectp) + *imperfectp = code; + } + else if (code->op == EXEC_NOP || code->op == EXEC_CONTINUE) + continue; + else if (!*imperfectp) + *imperfectp = code; + if (result && *imperfectp) + break; + } + return result; +} + +/* Like find_nested_loop_in_block, but also checks for intervening code. + Return the first nested DO loop in BLOCK, or NULL if there + isn't one. Sets *IMPERFECTP to the first piece of intervening code. */ +static gfc_code * +find_next_loop_or_transform_in_block (gfc_code *block, gfc_code **imperfectp) +{ + gfc_namespace *ns; + gcc_assert (block->op == EXEC_BLOCK); + ns = block->ext.block.ns; + gcc_assert (ns); + return find_next_loop_or_transform_in_chain (ns->code, imperfectp); +} + void gfc_resolve_omp_do_blocks (gfc_code *code, gfc_namespace *ns) { @@ -10059,6 +10259,9 @@ diagnose_intervening_code_errors_1 (gfc_code *chain, gfc_namespace* ns = code->ext.block.ns; diagnose_intervening_code_errors_1 (ns->code, state); } + else if (loop_transform_p (code->op) && code->block) + /* Recurse on loop transformations. */ + diagnose_intervening_code_errors_1 (code->block->next, state); else /* Treat the whole statement as a unit. */ { @@ -10125,19 +10328,32 @@ restructure_intervening_code (gfc_code **chainp, gfc_code *outer_loop, for (code = *chainp; code; code = code->next, chainp = &((*chainp)->next)) { - if (code->op == EXEC_DO) + if (code->op == EXEC_DO || loop_transform_p (code->op)) { - /* Cut CODE free from its chain, leaving the ends dangling. */ + gfc_code *c = code; + + /* Treat a series of loop transforms as a unit, same as a single + EXEC_DO. CODE is the first and C is the last in the chain. */ + while (loop_transform_p (c->op) && !c->block) + c = c->next; + + gcc_assert (c); + gcc_assert (c->op == EXEC_DO + || (loop_transform_p (c->op) && c->block)); + + /* Cut the transforms and the loop they apply to free from the + chain, leaving the ends dangling. 
*/ *chainp = NULL; - tail = code->next; - code->next = NULL; + tail = c->next; + c->next = NULL; - if (count == 1) - innermost_loop = code; + if (count == 1 && c->op == EXEC_DO) + innermost_loop = c; else innermost_loop - = restructure_intervening_code (&(code->block->next), - code, count - 1); + = restructure_intervening_code (&(c->block->next), c, + (loop_transform_p (c->op) + ? count : count - 1)); break; } else if (code->op == EXEC_BLOCK @@ -10190,7 +10406,7 @@ restructure_intervening_code (gfc_code **chainp, gfc_code *outer_loop, /* For loops, finally splice CODE into OUTER_LOOP. We already handled relinking EXEC_BLOCK above. */ - if (code->op == EXEC_DO && outer_loop) + if ((code->op == EXEC_DO || loop_transform_p (code->op)) && outer_loop) outer_loop->block->next = code; return innermost_loop; @@ -10204,13 +10420,13 @@ is_outer_iteration_variable (gfc_code *code, int depth, gfc_symbol *var) int i; gfc_code *do_code = code; - for (i = 1; i < depth; i++) + for (i = 0; i < depth; i++) { - do_code = find_nested_loop_in_chain (do_code->block->next); - gcc_assert (do_code); + gcc_assert (do_code && do_code->op == EXEC_DO); gfc_symbol *ivar = do_code->ext.iterator->var->symtree->n.sym; if (var == ivar) return true; + do_code = find_nested_loop_in_chain (do_code->block->next); } return false; } @@ -10232,6 +10448,11 @@ check_nested_loop_in_chain (gfc_code *chain, gfc_expr *expr, gfc_symbol *sym, { if (code->op == EXEC_DO) return code; + else if (loop_transform_p (code->op) && code->block) + { + code = code->block; + continue; + } else if (code->op == EXEC_BLOCK) { gfc_code *c = check_nested_loop_in_block (code, expr, sym, bad); @@ -10299,6 +10520,7 @@ expr_uses_intervening_var (gfc_code *code, int depth, gfc_expr *expr) for (i = 0; i < depth; i++) { bool bad = false; + gcc_assert (do_code && do_code->op == EXEC_DO); do_code = check_nested_loop_in_chain (do_code->block->next, expr, NULL, &bad); if (bad) @@ -10318,6 +10540,7 @@ is_intervening_var (gfc_code *code, int 
depth, gfc_symbol *sym) for (i = 0; i < depth; i++) { bool bad = false; + gcc_assert (do_code && do_code->op == EXEC_DO); do_code = check_nested_loop_in_chain (do_code->block->next, NULL, sym, &bad); if (bad) @@ -10334,13 +10557,13 @@ expr_is_invariant (gfc_code *code, int depth, gfc_expr *expr) int i; gfc_code *do_code = code; - for (i = 1; i < depth; i++) + for (i = 0; i < depth; i++) { - do_code = find_nested_loop_in_chain (do_code->block->next); - gcc_assert (do_code); + gcc_assert (do_code && do_code->op == EXEC_DO); gfc_symbol *ivar = do_code->ext.iterator->var->symtree->n.sym; if (gfc_find_sym_in_expr (ivar, expr)) return false; + do_code = find_nested_loop_in_chain (do_code->block->next); } return true; } @@ -10408,135 +10631,131 @@ bound_expr_is_canonical (gfc_code *code, int depth, gfc_expr *expr, return false; } -static void -resolve_omp_do (gfc_code *code) +static bool +omp_unroll_removes_loop_nest (gfc_code *code) { - gfc_code *do_code, *next; - int list, i, count; - gfc_omp_namelist *n; - gfc_symbol *dovar; - const char *name; - bool is_simd = false; - bool errorp = false; - bool perfect_nesting_errorp = false; + gcc_checking_assert (code->op == EXEC_OMP_UNROLL); + if (!code->ext.omp_clauses) + return true; - switch (code->op) + if (code->ext.omp_clauses->unroll_none) { - case EXEC_OMP_DISTRIBUTE: name = "!$OMP DISTRIBUTE"; break; - case EXEC_OMP_DISTRIBUTE_PARALLEL_DO: - name = "!$OMP DISTRIBUTE PARALLEL DO"; - break; - case EXEC_OMP_DISTRIBUTE_PARALLEL_DO_SIMD: - name = "!$OMP DISTRIBUTE PARALLEL DO SIMD"; - is_simd = true; - break; - case EXEC_OMP_DISTRIBUTE_SIMD: - name = "!$OMP DISTRIBUTE SIMD"; - is_simd = true; - break; - case EXEC_OMP_DO: name = "!$OMP DO"; break; - case EXEC_OMP_DO_SIMD: name = "!$OMP DO SIMD"; is_simd = true; break; - case EXEC_OMP_LOOP: name = "!$OMP LOOP"; break; - case EXEC_OMP_PARALLEL_DO: name = "!$OMP PARALLEL DO"; break; - case EXEC_OMP_PARALLEL_DO_SIMD: - name = "!$OMP PARALLEL DO SIMD"; - is_simd = true; - break; - 
case EXEC_OMP_PARALLEL_LOOP: name = "!$OMP PARALLEL LOOP"; break; - case EXEC_OMP_PARALLEL_MASKED_TASKLOOP: - name = "!$OMP PARALLEL MASKED TASKLOOP"; - break; - case EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD: - name = "!$OMP PARALLEL MASKED TASKLOOP SIMD"; - is_simd = true; - break; - case EXEC_OMP_PARALLEL_MASTER_TASKLOOP: - name = "!$OMP PARALLEL MASTER TASKLOOP"; - break; - case EXEC_OMP_PARALLEL_MASTER_TASKLOOP_SIMD: - name = "!$OMP PARALLEL MASTER TASKLOOP SIMD"; - is_simd = true; - break; - case EXEC_OMP_MASKED_TASKLOOP: name = "!$OMP MASKED TASKLOOP"; break; - case EXEC_OMP_MASKED_TASKLOOP_SIMD: - name = "!$OMP MASKED TASKLOOP SIMD"; - is_simd = true; - break; - case EXEC_OMP_MASTER_TASKLOOP: name = "!$OMP MASTER TASKLOOP"; break; - case EXEC_OMP_MASTER_TASKLOOP_SIMD: - name = "!$OMP MASTER TASKLOOP SIMD"; - is_simd = true; - break; - case EXEC_OMP_SIMD: name = "!$OMP SIMD"; is_simd = true; break; - case EXEC_OMP_TARGET_PARALLEL_DO: name = "!$OMP TARGET PARALLEL DO"; break; - case EXEC_OMP_TARGET_PARALLEL_DO_SIMD: - name = "!$OMP TARGET PARALLEL DO SIMD"; - is_simd = true; - break; - case EXEC_OMP_TARGET_PARALLEL_LOOP: - name = "!$OMP TARGET PARALLEL LOOP"; - break; - case EXEC_OMP_TARGET_SIMD: - name = "!$OMP TARGET SIMD"; - is_simd = true; - break; - case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE: - name = "!$OMP TARGET TEAMS DISTRIBUTE"; - break; - case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO: - name = "!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO"; - break; - case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: - name = "!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO SIMD"; - is_simd = true; - break; - case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD: - name = "!$OMP TARGET TEAMS DISTRIBUTE SIMD"; - is_simd = true; - break; - case EXEC_OMP_TARGET_TEAMS_LOOP: name = "!$OMP TARGET TEAMS LOOP"; break; - case EXEC_OMP_TASKLOOP: name = "!$OMP TASKLOOP"; break; - case EXEC_OMP_TASKLOOP_SIMD: - name = "!$OMP TASKLOOP SIMD"; - is_simd = true; - break; - case 
EXEC_OMP_TEAMS_DISTRIBUTE: name = "!$OMP TEAMS DISTRIBUTE"; break; - case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO: - name = "!$OMP TEAMS DISTRIBUTE PARALLEL DO"; - break; - case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: - name = "!$OMP TEAMS DISTRIBUTE PARALLEL DO SIMD"; - is_simd = true; - break; - case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: - name = "!$OMP TEAMS DISTRIBUTE SIMD"; - is_simd = true; - break; - case EXEC_OMP_TEAMS_LOOP: name = "!$OMP TEAMS LOOP"; break; - default: gcc_unreachable (); + gfc_warning (0, "!$OMP UNROLL without PARTIAL clause at %L turns loop " + "into a non-loop", + &code->loc); + return true; } + if (code->ext.omp_clauses->unroll_full) + { + gfc_warning (0, "!$OMP UNROLL with FULL clause at %L turns loop into a " + "non-loop", + &code->loc); + return true; + } + return false; +} - if (code->ext.omp_clauses) - resolve_omp_clauses (code, code->ext.omp_clauses, NULL); +static gfc_code * +resolve_nested_loop_transforms (gfc_code *code, const char *name, + int required_depth, locus *loc) +{ + if (!code) + return code; - do_code = code->block->next; - if (code->ext.omp_clauses->orderedc) - count = code->ext.omp_clauses->orderedc; - else + bool error = false; + while (loop_transform_p (code->op)) { - count = code->ext.omp_clauses->collapse; - if (count <= 0) - count = 1; + if (!error && code->op == EXEC_OMP_UNROLL) + { + if (omp_unroll_removes_loop_nest (code)) + { + gfc_error ("missing canonical loop nest after %s at %L", name, + loc); + error = true; + } + else if (required_depth > 1) + { + gfc_error ("loop nest depth after !$OMP UNROLL at %L is insufficient " + "for outer %s", &code->loc, name); + error = true; + } + } + else if (!error && code->op == EXEC_OMP_TILE + && required_depth > gfc_expr_list_len (code->ext.omp_clauses->tile_sizes)) + { + gfc_error ("loop nest depth after !$OMP TILE at %L is insufficient " + "for outer %s", &code->loc, name); + error = true; + } + + if (code->block) + code = code->block->next; + else + code = code->next; 
} + gcc_checking_assert (!loop_transform_p (code->op)); - /* While the spec defines the loop nest depth independently of the COLLAPSE - clause, in practice the middle end only pays attention to the COLLAPSE - depth and treats any further inner loops as the final-loop-body. So - here we also check canonical loop nest form only for the number of - outer loops specified by the COLLAPSE clause too. */ - for (i = 1; i <= count; i++) + return code; +} + +static void +resolve_omp_unroll (gfc_code *code) +{ + const char *descr = "!$OMP UNROLL"; + locus *loc = &code->loc; + + if (!code->block || code->block->op == EXEC_DO) + return; + + code = resolve_nested_loop_transforms (code->block->next, descr, 1, + &code->loc); + + if (code->op == EXEC_DO) + return; + + if (code->op == EXEC_DO_WHILE) + { + gfc_error ("%s invalid around DO WHILE or DO without loop " + "control at %L", descr, loc); + return; + } + + if (code->op == EXEC_DO_CONCURRENT) { + gfc_error ("%s invalid around DO CONCURRENT loop at %L", + descr, loc); + return; + } + + gfc_error ("missing canonical loop nest after %s at %L", + descr, loc); +} + +/* Shared helper function for resolve_omp_do and resolve_omp_tile: + check that we have NUM_LOOPS nested loops at DO_CODE. CODE and NAME + are for the outer OMP construct, used for error checking. + Note that DO_CODE should be an EXEC_DO, with all the outer loop + transformations stripped off already. */ + +static void +resolve_nested_loops (gfc_code *code, const char *name, gfc_code *do_code, + int num_loops, bool is_simd, bool is_tile) +{ + bool errorp = false; + bool perfect_nesting_errorp = false; + bool is_nested_tile = false; + gfc_omp_namelist *n; + gfc_code *next; + int list; + bool any_imperfect = false; + gfc_code *outer_do_code = do_code; + + for (int i = 0; i < num_loops; i++) + { + gfc_symbol *dovar; gfc_symbol *start_var = NULL, *end_var = NULL; + gfc_code *imperfect = NULL; + /* Parse errors are not recoverable. 
*/ if (do_code->op == EXEC_DO_WHILE) { @@ -10550,7 +10769,16 @@ resolve_omp_do (gfc_code *code) &do_code->loc); return; } + if (do_code->op != EXEC_DO) + { + gfc_error ("%s must be DO loop at %L", name, + &do_code->loc); + break; + } + gcc_assert (do_code->op == EXEC_DO); + if (!gfc_resolve_expr (do_code->ext.iterator->var)) + break; if (do_code->ext.iterator->var->ts.type != BT_INTEGER) { gfc_error ("%s iteration variable must be of type integer at %L", @@ -10584,20 +10812,20 @@ resolve_omp_do (gfc_code *code) "LINEAR at %L", name, &do_code->loc); errorp = true; } - if (is_outer_iteration_variable (code, i, dovar)) + if (is_outer_iteration_variable (outer_do_code, i, dovar)) { gfc_error ("%s iteration variable used in more than one loop at %L", name, &do_code->loc); errorp = true; } - else if (is_intervening_var (code, i, dovar)) + else if (is_intervening_var (outer_do_code, i, dovar)) { gfc_error ("%s iteration variable at %L is bound in " "intervening code", name, &do_code->loc); errorp = true; } - else if (!bound_expr_is_canonical (code, i, + else if (!bound_expr_is_canonical (outer_do_code, i, do_code->ext.iterator->start, &start_var)) { @@ -10605,7 +10833,7 @@ resolve_omp_do (gfc_code *code) name, &do_code->loc); errorp = true; } - else if (expr_uses_intervening_var (code, i, + else if (expr_uses_intervening_var (outer_do_code, i, do_code->ext.iterator->start)) { gfc_error ("%s loop start expression at %L uses variable bound in " @@ -10613,7 +10841,7 @@ resolve_omp_do (gfc_code *code) name, &do_code->loc); errorp = true; } - else if (!bound_expr_is_canonical (code, i, + else if (!bound_expr_is_canonical (outer_do_code, i, do_code->ext.iterator->end, &end_var)) { @@ -10621,7 +10849,7 @@ resolve_omp_do (gfc_code *code) name, &do_code->loc); errorp = true; } - else if (expr_uses_intervening_var (code, i, + else if (expr_uses_intervening_var (outer_do_code, i, do_code->ext.iterator->end)) { gfc_error ("%s loop end expression at %L uses variable bound in " @@ 
-10635,13 +10863,14 @@ resolve_omp_do (gfc_code *code) "iteration variables at %L", name, &do_code->loc); errorp = true; } - else if (!expr_is_invariant (code, i, do_code->ext.iterator->step)) + else if (!expr_is_invariant (outer_do_code, i, + do_code->ext.iterator->step)) { gfc_error ("%s loop increment not in canonical form at %L", name, &do_code->loc); errorp = true; } - else if (expr_uses_intervening_var (code, i, + else if (expr_uses_intervening_var (outer_do_code, i, do_code->ext.iterator->step)) { gfc_error ("%s loop increment expression at %L uses variable " @@ -10654,21 +10883,24 @@ resolve_omp_do (gfc_code *code) /* Only parse loop body into nested loop and intervening code if there are supposed to be more loops in the nest to collapse. */ - if (i == count) + if (i == num_loops - 1) break; - next = find_nested_loop_in_chain (do_code->block->next); + next = find_next_loop_or_transform_in_chain (do_code->block->next, + &imperfect); if (!next) { /* Parse error, can't recover from this. */ - gfc_error ("not enough DO loops for collapsed %s (level %d) at %L", - name, i, &code->loc); + gfc_error ("not enough DO loops for %s (level %d) at %L", + name, i + 1, &code->loc); return; } - else if (next != do_code->block->next || next->next) + else if (imperfect) /* Imperfectly nested loop found. */ { + any_imperfect = true; + /* Only diagnose violation of imperfect nesting constraints once. */ if (!perfect_nesting_errorp) { @@ -10686,7 +10918,19 @@ resolve_omp_do (gfc_code *code) name, &code->loc); perfect_nesting_errorp = true; } - /* FIXME: Also diagnose for TILE directives. 
*/ + else if (is_tile) + { + gfc_error ("%s inner loops must be perfectly nested at %L", + name, &code->loc); + perfect_nesting_errorp = true; + } + else if (is_nested_tile) + { + gfc_error ("%s inner loops must be perfectly nested with " + "nested !$OMP TILE at %L", + name, &code->loc); + perfect_nesting_errorp = true; + } if (perfect_nesting_errorp) errorp = true; } @@ -10694,6 +10938,32 @@ resolve_omp_do (gfc_code *code) name, next)) errorp = true; } + + /* Check for presence of nested TILE directive, used for next level + of the imperfect loop error checking above. Then resolve all the + transforms at this level. */ + if (!is_tile && !is_nested_tile && !perfect_nesting_errorp) + for (gfc_code *c = next; c && loop_transform_p (c->op); ) + { + if (c->op == EXEC_OMP_TILE) + { + is_nested_tile = true; + break; + } + if (c->block) + c = c->block->next; + else + c = c->next; + } + next = resolve_nested_loop_transforms (next, name, num_loops - i - 1, + &code->loc); + if (!next) + { + gfc_error ("not enough DO loops for %s at %L", + name, &code->loc); + return; + } + do_code = next; } @@ -10701,9 +10971,162 @@ resolve_omp_do (gfc_code *code) if (errorp) return; - restructure_intervening_code (&(code->block->next), code, count); + /* Only restructure intervening code if we found some. Note that + restructure_intervening_code assumes CODE is a DO loop instead of a + top-level TILE directive, which should have been rejected already if + it contains intervening code. 
*/ + if (is_tile) + gcc_assert (!any_imperfect); + else if (any_imperfect) + { + gcc_assert (code->block); + restructure_intervening_code (&(code->block->next), code, num_loops); + } +} + +static void +resolve_omp_do (gfc_code *code) +{ + gfc_code *do_code; + int count; + const char *name; + bool is_simd = false; + + switch (code->op) + { + case EXEC_OMP_DISTRIBUTE: name = "!$OMP DISTRIBUTE"; break; + case EXEC_OMP_DISTRIBUTE_PARALLEL_DO: + name = "!$OMP DISTRIBUTE PARALLEL DO"; + break; + case EXEC_OMP_DISTRIBUTE_PARALLEL_DO_SIMD: + name = "!$OMP DISTRIBUTE PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_DISTRIBUTE_SIMD: + name = "!$OMP DISTRIBUTE SIMD"; + is_simd = true; + break; + case EXEC_OMP_DO: name = "!$OMP DO"; break; + case EXEC_OMP_DO_SIMD: name = "!$OMP DO SIMD"; is_simd = true; break; + case EXEC_OMP_LOOP: name = "!$OMP LOOP"; break; + case EXEC_OMP_PARALLEL_DO: name = "!$OMP PARALLEL DO"; break; + case EXEC_OMP_PARALLEL_DO_SIMD: + name = "!$OMP PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_PARALLEL_LOOP: name = "!$OMP PARALLEL LOOP"; break; + case EXEC_OMP_PARALLEL_MASKED_TASKLOOP: + name = "!$OMP PARALLEL MASKED TASKLOOP"; + break; + case EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD: + name = "!$OMP PARALLEL MASKED TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_PARALLEL_MASTER_TASKLOOP: + name = "!$OMP PARALLEL MASTER TASKLOOP"; + break; + case EXEC_OMP_PARALLEL_MASTER_TASKLOOP_SIMD: + name = "!$OMP PARALLEL MASTER TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_MASKED_TASKLOOP: name = "!$OMP MASKED TASKLOOP"; break; + case EXEC_OMP_MASKED_TASKLOOP_SIMD: + name = "!$OMP MASKED TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_MASTER_TASKLOOP: name = "!$OMP MASTER TASKLOOP"; break; + case EXEC_OMP_MASTER_TASKLOOP_SIMD: + name = "!$OMP MASTER TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_SIMD: name = "!$OMP SIMD"; is_simd = true; break; + case EXEC_OMP_TARGET_PARALLEL_DO: name = 
"!$OMP TARGET PARALLEL DO"; break; + case EXEC_OMP_TARGET_PARALLEL_DO_SIMD: + name = "!$OMP TARGET PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_TARGET_PARALLEL_LOOP: + name = "!$OMP TARGET PARALLEL LOOP"; + break; + case EXEC_OMP_TARGET_SIMD: + name = "!$OMP TARGET SIMD"; + is_simd = true; + break; + case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE: + name = "!$OMP TARGET TEAMS DISTRIBUTE"; + break; + case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO: + name = "!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO"; + break; + case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: + name = "!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD: + name = "!$OMP TARGET TEAMS DISTRIBUTE SIMD"; + is_simd = true; + break; + case EXEC_OMP_TARGET_TEAMS_LOOP: name = "!$OMP TARGET TEAMS LOOP"; break; + case EXEC_OMP_TASKLOOP: name = "!$OMP TASKLOOP"; break; + case EXEC_OMP_TASKLOOP_SIMD: + name = "!$OMP TASKLOOP SIMD"; + is_simd = true; + break; + case EXEC_OMP_TEAMS_DISTRIBUTE: name = "!$OMP TEAMS DISTRIBUTE"; break; + case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO: + name = "!$OMP TEAMS DISTRIBUTE PARALLEL DO"; + break; + case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: + name = "!$OMP TEAMS DISTRIBUTE PARALLEL DO SIMD"; + is_simd = true; + break; + case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: + name = "!$OMP TEAMS DISTRIBUTE SIMD"; + is_simd = true; + break; + case EXEC_OMP_TEAMS_LOOP: name = "!$OMP TEAMS LOOP"; break; + case EXEC_OMP_UNROLL: name = "!$OMP UNROLL"; break; + case EXEC_OMP_TILE: name = "!$OMP TILE"; break; + default: gcc_unreachable (); + } + + if (code->ext.omp_clauses) + resolve_omp_clauses (code, code->ext.omp_clauses, NULL); + + if (code->ext.omp_clauses->orderedc) + count = code->ext.omp_clauses->orderedc; + else + { + count = code->ext.omp_clauses->collapse; + if (count <= 0) + count = 1; + } + + /* While the spec defines the loop nest depth independently of the COLLAPSE + clause, in practice the 
middle end only pays attention to the COLLAPSE + depth and treats any further inner loops as the final-loop-body. So + here we also check canonical loop nest form only for the number of + outer loops specified by the COLLAPSE clause too. */ + do_code = resolve_nested_loop_transforms (code->block->next, name, count, + &code->loc); + resolve_nested_loops (code, name, do_code, count, is_simd, false); } +static void +resolve_omp_tile (gfc_code *code) +{ + gfc_code *do_code; + const char *name = "!$OMP TILE"; + + unsigned num_loops = 0; + gcc_assert (code->ext.omp_clauses->tile_sizes); + for (gfc_expr_list *el = code->ext.omp_clauses->tile_sizes; el; + el = el->next) + num_loops++; + + do_code = resolve_nested_loop_transforms (code, name, num_loops, &code->loc); + resolve_nested_loops (code, name, do_code, num_loops, false, true); +} static gfc_statement omp_code_to_statement (gfc_code *code) @@ -10852,6 +11275,10 @@ omp_code_to_statement (gfc_code *code) return ST_OMP_PARALLEL_LOOP; case EXEC_OMP_DEPOBJ: return ST_OMP_DEPOBJ; + case EXEC_OMP_TILE: + return ST_OMP_TILE; + case EXEC_OMP_UNROLL: + return ST_OMP_UNROLL; default: gcc_unreachable (); } @@ -10950,6 +11377,7 @@ resolve_oacc_nested_loops (gfc_code *code, gfc_code* do_code, int collapse, &do_code->loc); break; } + gcc_assert (do_code->op != EXEC_OMP_UNROLL); gcc_assert (do_code->op == EXEC_DO); if (do_code->ext.iterator->var->ts.type != BT_INTEGER) gfc_error ("!$ACC LOOP iteration variable must be of type integer at %L", @@ -11316,6 +11744,12 @@ gfc_resolve_omp_directive (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_TEAMS_LOOP: resolve_omp_do (code); break; + case EXEC_OMP_TILE: + resolve_omp_tile (code); + break; + case EXEC_OMP_UNROLL: + resolve_omp_unroll (code); + break; case EXEC_OMP_TARGET: resolve_omp_target (code); gcc_fallthrough (); diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc index 58386805ffe..5ea613fc6db 100644 --- a/gcc/fortran/parse.cc +++ b/gcc/fortran/parse.cc @@ -1151,6 +1151,8 
@@ decode_omp_directive (void) ST_OMP_END_TEAMS_DISTRIBUTE); matcho ("end teams loop", gfc_match_omp_eos_error, ST_OMP_END_TEAMS_LOOP); matcho ("end teams", gfc_match_omp_eos_error, ST_OMP_END_TEAMS); + matchs ("end unroll", gfc_match_omp_eos_error, ST_OMP_END_UNROLL); + matchs ("end tile", gfc_match_omp_eos_error, ST_OMP_END_TILE); matcho ("end workshare", gfc_match_omp_end_nowait, ST_OMP_END_WORKSHARE); break; @@ -1278,6 +1280,10 @@ decode_omp_directive (void) matcho ("teams", gfc_match_omp_teams, ST_OMP_TEAMS); matchdo ("threadprivate", gfc_match_omp_threadprivate, ST_OMP_THREADPRIVATE); + matchs ("tile sizes", gfc_match_omp_tile, ST_OMP_TILE); + break; + case 'u': + matchs ("unroll", gfc_match_omp_unroll, ST_OMP_UNROLL); break; case 'w': matcho ("workshare", gfc_match_omp_workshare, ST_OMP_WORKSHARE); @@ -1910,6 +1916,7 @@ next_statement (void) case ST_OMP_LOOP: case ST_OMP_PARALLEL_LOOP: case ST_OMP_TEAMS_LOOP: \ case ST_OMP_TARGET_PARALLEL_LOOP: case ST_OMP_TARGET_TEAMS_LOOP: \ case ST_OMP_ALLOCATE_EXEC: case ST_OMP_ALLOCATORS: case ST_OMP_ASSUME: \ + case ST_OMP_TILE: case ST_OMP_UNROLL: \ case ST_CRITICAL: \ case ST_OACC_PARALLEL_LOOP: case ST_OACC_PARALLEL: case ST_OACC_KERNELS: \ case ST_OACC_DATA: case ST_OACC_HOST_DATA: case ST_OACC_LOOP: \ @@ -2282,6 +2289,9 @@ gfc_ascii_statement (gfc_statement st, bool strip_sentinel) case ST_END_UNION: p = "END UNION"; break; + case ST_OMP_END_UNROLL: + p = "!$OMP END UNROLL"; + break; case ST_END_MAP: p = "END MAP"; break; @@ -2962,6 +2972,12 @@ gfc_ascii_statement (gfc_statement st, bool strip_sentinel) case ST_OMP_THREADPRIVATE: p = "!$OMP THREADPRIVATE"; break; + case ST_OMP_TILE: + p = "!$OMP TILE"; + break; + case ST_OMP_UNROLL: + p = "!$OMP UNROLL"; + break; case ST_OMP_WORKSHARE: p = "!$OMP WORKSHARE"; break; @@ -5384,6 +5400,7 @@ parse_omp_do (gfc_statement omp_st) gfc_statement st; gfc_code *cp, *np; gfc_state_data s; + int num_unroll = 0; accept_statement (omp_st); @@ -5400,6 +5417,17 @@ parse_omp_do 
(gfc_statement omp_st) unexpected_eof (); else if (st == ST_DO) break; + else if (st == ST_OMP_UNROLL) + { + accept_statement (st); + num_unroll++; + continue; + } + else if (st == ST_OMP_TILE) + { + accept_statement (st); + continue; + } else unexpected_statement (st); } @@ -5511,8 +5539,26 @@ parse_omp_do (gfc_statement omp_st) case ST_OMP_TEAMS_LOOP: omp_end_st = ST_OMP_END_TEAMS_LOOP; break; + case ST_OMP_TILE: + omp_end_st = ST_OMP_END_TILE; + break; + case ST_OMP_UNROLL: + omp_end_st = ST_OMP_END_UNROLL; + break; default: gcc_unreachable (); } + + for (; num_unroll > 0; num_unroll--) + { + if (st == ST_OMP_END_UNROLL) + { + gfc_clear_new_st (); + gfc_commit_symbols (); + gfc_warning_check (); + st = next_statement (); + } + } + if (st == omp_end_st) { if (new_st.op == EXEC_OMP_END_NOWAIT) @@ -6296,6 +6342,8 @@ parse_executable (gfc_statement st) case ST_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case ST_OMP_TEAMS_DISTRIBUTE_SIMD: case ST_OMP_TEAMS_LOOP: + case ST_OMP_TILE: + case ST_OMP_UNROLL: st = parse_omp_do (st); if (st == ST_IMPLIED_ENDDO) return st; diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index 861f69ac20f..4f5d6decc42 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -11129,6 +11129,8 @@ gfc_resolve_blocks (gfc_code *b, gfc_namespace *ns) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_LOOP: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: break; @@ -12296,6 +12298,8 @@ gfc_resolve_code (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TARGET_SIMD: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: gfc_resolve_omp_do_blocks (code, ns); break; case EXEC_SELECT_TYPE: @@ -12794,6 +12798,8 @@ start: case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_resolve_omp_directive 
(code, ns); break; diff --git a/gcc/fortran/st.cc b/gcc/fortran/st.cc index b6d87c40207..8b083de7308 100644 --- a/gcc/fortran/st.cc +++ b/gcc/fortran/st.cc @@ -279,6 +279,8 @@ gfc_free_statement (gfc_code *p) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_free_omp_clauses (p->ext.omp_clauses); break; diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index 06c5a123973..9c1eb3e6a9c 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -4112,6 +4112,51 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, omp_clauses = gfc_trans_add_clause (c, omp_clauses); } + if (clauses->unroll_full) + { + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->unroll_none) + { + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->unroll_partial) + { + c = build_omp_clause (gfc_get_location (&where), + OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) + = (clauses->unroll_partial_factor + ? 
build_int_cst (integer_type_node, clauses->unroll_partial_factor) + : NULL_TREE); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->tile_sizes) + { + vec *tvec; + gfc_expr_list *el; + + vec_alloc (tvec, 4); + + for (el = clauses->tile_sizes; el; el = el->next) + vec_safe_push (tvec, gfc_convert_expr_to_tree (block, el->expr)); + + c = build_omp_clause (gfc_get_location (&where), + OMP_CLAUSE_TILE); + OMP_CLAUSE_TILE_SIZES (c) = build_tree_list_vec (tvec); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + + tvec->truncate (0); + } + if (clauses->ordered) { c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_ORDERED); @@ -5308,6 +5353,12 @@ gfc_trans_omp_cancel (gfc_code *code) return gfc_finish_block (&block); } +bool +loop_transform_p (gfc_exec_op op) +{ + return op == EXEC_OMP_UNROLL || op == EXEC_OMP_TILE; +} + static tree gfc_trans_omp_cancellation_point (gfc_code *code) { @@ -5479,13 +5530,46 @@ gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n, return true; } +int +gfc_expr_list_len (gfc_expr_list *list) +{ + unsigned len = 0; + for (; list; list = list->next) + len++; + + return len; +} + +/* Traverse the loops with nesting depth at most + COLLAPSE from CODE and determine the largest + loop nest depth required by the loop transformations + found on the loops. */ +int compute_transformed_depth (gfc_code *code, int collapse) +{ + int new_collapse = collapse; + for (int i = 0; i < new_collapse; i++) + { + gcc_assert (code->op == EXEC_DO || loop_transform_p (code->op)); + while (loop_transform_p (code->op)) + { + int tile_depth + = gfc_expr_list_len (code->ext.omp_clauses->tile_sizes); + new_collapse = MAX (new_collapse, i + tile_depth); + code = code->block ? 
code->block->next : code->next; + } + code = code->block->next; + } + + return new_collapse; +} + static tree gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, gfc_omp_clauses *do_clauses, tree par_clauses) { gfc_se se; tree dovar, stmt, from, to, step, type, init, cond, incr, orig_decls; - tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses; + tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses, loop_transform_clauses; stmtblock_t block; stmtblock_t body; gfc_omp_clauses *clauses = code->ext.omp_clauses; @@ -5494,45 +5578,80 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, dovar_init *di; unsigned ix; vec *saved_doacross_steps = doacross_steps; - gfc_expr_list *tile = do_clauses ? do_clauses->tile_list : clauses->tile_list; gfc_code *orig_code = code; + locus top_loc = code->loc; + gfc_expr_list *oacc_tile + = do_clauses ? do_clauses->tile_list : clauses->tile_list; + gfc_expr_list *omp_tile + = do_clauses ? do_clauses->tile_sizes : clauses->tile_sizes; + gcc_assert (!omp_tile || op == EXEC_OMP_TILE); + gcc_assert (!(oacc_tile && omp_tile)); + + if (pblock == NULL) + { + gfc_start_block (&block); + pblock = &block; + } + code = code->block->next; + gcc_assert (code->op == EXEC_DO || loop_transform_p (code->op)); + /* Loop transformation directives surrounding the associated loop of an "omp + do" (or similar directive) are represented as clauses on the "omp do". */ + loop_transform_clauses = NULL; + int omp_tile_depth = gfc_expr_list_len (omp_tile); + tree clauses_tail = NULL; + while (loop_transform_p (code->op)) + { + tree clauses = gfc_trans_omp_clauses (pblock, code->ext.omp_clauses, + code->loc); + /* There might be several "!$omp tile" transformations surrounding the + loop. Use the innermost one which must have the largest tiling depth. + If an inner directive has a smaller tiling depth than an outer + directive, an error will be emitted in pass-omp_transform_loops. 
*/ + omp_tile_depth = gfc_expr_list_len (code->ext.omp_clauses->tile_sizes); + + if (!loop_transform_clauses) + { + loop_transform_clauses = clauses; + clauses_tail = tree_last (clauses); + } + else + clauses_tail = chainon (clauses_tail, clauses); + + code = code->block ? code->block->next : code->next; + } + gcc_checking_assert (!loop_transform_p (code->op)); + gcc_assert (code->op == EXEC_DO); /* Both collapsed and tiled loops are lowered the same way. In OpenACC, those clauses are not compatible, so prioritize the tile clause, if present. */ - if (tile) - { - collapse = 0; - for (gfc_expr_list *el = tile; el; el = el->next) - collapse++; - } + if (oacc_tile) + collapse = gfc_expr_list_len (oacc_tile); doacross_steps = NULL; if (clauses->orderedc) collapse = clauses->orderedc; if (collapse <= 0) collapse = 1; + collapse = MAX (collapse, omp_tile_depth); + gfc_code *first_loop = loop_transform_p (orig_code->op) ? + orig_code : orig_code->block->next; + int transform_depth = compute_transformed_depth (first_loop, collapse); - code = code->block->next; - gcc_assert (code->op == EXEC_DO); - + collapse = transform_depth; init = make_tree_vec (collapse); cond = make_tree_vec (collapse); incr = make_tree_vec (collapse); orig_decls = clauses->ordered ? make_tree_vec (collapse) : NULL_TREE; - if (pblock == NULL) - { - gfc_start_block (&block); - pblock = &block; - } - /* simd schedule modifier is only useful for composite do simd and other constructs including that, where gfc_trans_omp_do is only called on the simd construct and DO's clauses are translated elsewhere. 
*/ do_clauses->sched_simd = false; - omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc); + omp_clauses = NULL; + omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, top_loc); + omp_clauses = chainon (omp_clauses, loop_transform_clauses); for (i = 0; i < collapse; i++) { @@ -5784,7 +5903,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, } gcc_assert (local_dovar == dovar || c != NULL); } - if (local_dovar != dovar) + if (local_dovar != dovar && op != EXEC_OMP_UNROLL) { if (op != EXEC_OMP_SIMD || dovar_found == 1) tmp = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE); @@ -5802,7 +5921,26 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, } if (i + 1 < collapse) - code = code->block->next; + { + code = code->block->next; + + loop_transform_clauses = NULL; + clauses_tail = omp_clauses; + while (loop_transform_p (code->op)) + { + loop_transform_clauses = gfc_trans_omp_clauses ( + pblock, code->ext.omp_clauses, code->loc); + for (tree c = loop_transform_clauses; c; + c = OMP_CLAUSE_CHAIN (c)) + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, i + 1); + + clauses_tail = chainon (clauses_tail, loop_transform_clauses); + clauses_tail = tree_last (loop_transform_clauses); + + code = code->block ? 
code->block->next : code->next; + } + } } if (pblock != &block) @@ -5873,6 +6011,8 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, case EXEC_OMP_LOOP: stmt = make_node (OMP_LOOP); break; case EXEC_OMP_TASKLOOP: stmt = make_node (OMP_TASKLOOP); break; case EXEC_OACC_LOOP: stmt = make_node (OACC_LOOP); break; + case EXEC_OMP_TILE: stmt = make_node (OMP_LOOP_TRANS); break; + case EXEC_OMP_UNROLL: stmt = make_node (OMP_LOOP_TRANS); break; default: gcc_unreachable (); } @@ -7979,6 +8119,8 @@ gfc_trans_omp_directive (gfc_code *code) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TASKLOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: return gfc_trans_omp_do (code, code->op, NULL, code->ext.omp_clauses, NULL); case EXEC_OMP_DISTRIBUTE_PARALLEL_DO: diff --git a/gcc/fortran/trans.cc b/gcc/fortran/trans.cc index e2e1b694012..95b724e1e0d 100644 --- a/gcc/fortran/trans.cc +++ b/gcc/fortran/trans.cc @@ -2607,6 +2607,8 @@ trans_code (gfc_code * code, tree cond) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: res = gfc_trans_omp_directive (code); break; diff --git a/gcc/testsuite/gfortran.dg/gomp/collapse1.f90 b/gcc/testsuite/gfortran.dg/gomp/collapse1.f90 index 613f06f6ea9..b155e0fcb5b 100644 --- a/gcc/testsuite/gfortran.dg/gomp/collapse1.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/collapse1.f90 @@ -9,7 +9,7 @@ subroutine collapse1 !$omp threadprivate (thr) l = .false. a(:, :, :) = 0 - !$omp parallel do collapse(4) schedule(static, 4) ! { dg-error "not enough DO loops for collapsed" } + !$omp parallel do collapse(4) schedule(static, 4) ! { dg-error "not enough DO loops for" } do i = 1, 3 do j = 4, 6 do k = 5, 7 @@ -33,9 +33,9 @@ subroutine collapse1 end do k = 4 end do - !$omp parallel do collapse(2) ! { dg-error "not enough DO loops" } + !$omp parallel do collapse(2) do i = 1, 3 - do + do ! 
{ dg-error "cannot be a DO WHILE or DO without loop control" } end do end do !$omp parallel do collapse(2) diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 new file mode 100644 index 00000000000..fa2e2f17c6b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 @@ -0,0 +1,124 @@ +subroutine test1 + !$omp parallel do collapse(2) + do i=0,100 + !$omp unroll partial(2) + do j=-300,100 + call dummy (j) + end do + end do +end subroutine test1 + +subroutine test2 + !$omp parallel do collapse(3) + do i=0,100 + !$omp unroll partial(2) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do j=-300,100 + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test2 + +subroutine test3 +!$omp parallel do collapse(3) +do i=0,100 + do j=-300,100 + !$omp unroll partial(2) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test3 + +subroutine test4 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do j=-300,100 + !$omp unroll partial(2) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test4 + +subroutine test5 + !$omp parallel do collapse(3) + !$omp tile sizes(3,2) ! 
{ dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do i=0,100 + do j=-300,100 + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test5 + +subroutine test6 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3,2) + do j=-300,100 + !$omp unroll partial(2) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test6 + +subroutine test7 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3,3) + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test7 + +subroutine test8 +!$omp parallel do collapse(1) +do i=0,100 + !$omp tile sizes(3,3) + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test8 + +subroutine test9 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3,3,3) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test9 + +subroutine test10 +!$omp parallel do +do i=0,100 + !$omp tile sizes(3,3,3) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do j=-300,100 + !$omp tile sizes(5) ! 
{ dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test10 + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 new file mode 100644 index 00000000000..8284dc8193d --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 @@ -0,0 +1,163 @@ +subroutine test + implicit none + integer :: i, j, k + + !$omp tile sizes(1) + do i = 1,100 + call dummy(i) + end do + + !$omp tile sizes(1) + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(2+3) + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(-21) ! { dg-error {tile size not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(0) ! { dg-error {tile size not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(i) ! { dg-error {Constant expression required at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes( ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(2 ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes() ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(2,) ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(,2) ! 
{ dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(,i) ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(i,) ! { dg-error {Constant expression required at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + end do + end do + !$end omp tile + + !$omp tile sizes(1,2) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(i) + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested at \(1\)} } + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested at \(1\)} } + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + call dummy(j) + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested at \(1\)} } + do i = 1,100 + call dummy(i) + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! 
{ dg-error {\!\$OMP TILE inner loops must be perfectly nested at \(1\)} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 new file mode 100644 index 00000000000..441d89b61e9 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 @@ -0,0 +1,10 @@ + +subroutine test + !$omp tile sizes(1,2,1) ! { dg-error {not enough DO loops for \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(i) + end do + end do + !$end omp tile +end subroutine test diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 new file mode 100644 index 00000000000..d14af08c27a --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 @@ -0,0 +1,80 @@ +subroutine test1 + implicit none + integer :: i, j, k + + !$omp tile sizes (1,2) + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + + !$omp tile sizes (8) + !$omp tile sizes (1,2) + !$omp tile sizes (1,2,3) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test1 + +subroutine test2 + implicit none + integer :: i, j, k + + !$omp taskloop collapse(2) + !$omp tile sizes (3,4) + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$omp end taskloop + + !$omp taskloop simd + !$omp tile sizes (8) + !$omp tile sizes (1,2) + !$omp tile sizes (1,2,3) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$omp end taskloop simd +end subroutine test2 + +subroutine test3 + implicit none + integer :: i, j, k + 
+ !$omp taskloop collapse(3) + !$omp tile sizes (1,2) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TASKLOOP} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$omp end taskloop +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 new file mode 100644 index 00000000000..308e3b3e4d0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 @@ -0,0 +1,18 @@ +subroutine test + implicit none + integer :: i, j, k + + !$omp parallel do collapse(2) ordered(2) ! { dg-error {'ordered' invalid in conjunction with 'omp tile'} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$end omp target + +end subroutine test diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 new file mode 100644 index 00000000000..b2dca0bbec6 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 @@ -0,0 +1,95 @@ + +subroutine test1 + implicit none + integer :: i, j, k + + !$omp tile sizes (1,2) + !$omp tile sizes (1) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test1 + +subroutine test2 + implicit none + integer :: i, j, k + + !$omp tile sizes (1,2) + !$omp tile sizes (1) ! 
{ dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test2 + +subroutine test3 + implicit none + integer :: i, j, k + + !$omp target teams distribute + !$omp tile sizes (1,2) + !$omp tile sizes (1) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test3 + +subroutine test4 + implicit none + integer :: i, j, k + + !$omp target teams distribute collapse(2) + !$omp tile sizes (8) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TARGET TEAMS DISTRIBUTE} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test4 + +subroutine test5 + implicit none + integer :: i, j, k + + !$omp parallel do collapse(2) ordered(2) + !$omp tile sizes (8) ! 
{ dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$end omp tile + !$end omp target + +end subroutine test5 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 new file mode 100644 index 00000000000..e9cf88f4def --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 @@ -0,0 +1,93 @@ +subroutine test0 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8, 1) + do j = 1,n + !$omp unroll partial(10) + do k = 1, n + if (k == 1) then + inner = 0 + endif + end do + end do + end do +end subroutine test0 + +subroutine test0m + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + do i = 1,m + !$omp tile sizes (8, 1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested} } + do j = 1,n + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end subroutine test0m + +subroutine test1 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8, 1) ! 
{ dg-error {\!\$OMP TILE inner loops must be perfectly nested} } + do j = 1,n + !$omp unroll partial(10) + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end subroutine test1 + + +subroutine test2 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8, 1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested} } + do j = 1,n + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end subroutine test2 + +subroutine test3 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + do i = 1,m + !$omp tile sizes (8, 1) ! { dg-error {\!\$OMP TILE inner loops must be perfectly nested} } + do j = 1,n + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 new file mode 100644 index 00000000000..6474b9da1e2 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 @@ -0,0 +1,16 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test1 + !$omp parallel do collapse(2) + do i=0,100 + !$omp tile sizes(4) + do j=-300,100 + call dummy (j) + end do + end do +end subroutine test1 + +! Collapse of the gimple_omp_for should be unaffected by the transformation +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait collapse\(2\) tile sizes\(4\).1\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)} 1 "original" } } +! 
{ dg-final { scan-tree-dump-times {\#pragma omp for nowait collapse\(2\) private\(j.0\) private\(j\)\n +for \(i = 0; i < 101; i = i \+ 1\)\n +for \(.omp_tile_index.\d = -300; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ 4\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 new file mode 100644 index 00000000000..0d462debd72 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 @@ -0,0 +1,23 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test2 + !$omp parallel do + !$omp tile sizes(3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2 + +! One gimple_omp_for should cover the outer two loops, another the inner two loops +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3, 3\)@0\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3\)@0\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } +! Collapse after the transformations should be 1 +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait\n +for \(.omp_tile_index.\d = 0; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ \d\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 new file mode 100644 index 00000000000..3ce87ad8a4b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-original" } +! 
{ dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! gimple_omp_for collapse should be extended to cover all loops affected by the transformations (i.e. 4) +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3, 3, 3\)@0 tile sizes\(3, 3\)@2\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } +! Collapse after the transformations should be 1 +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait private\(l.0\) private\(k\)\n +for \(.omp_tile_index.\d = 0; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ \d\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 new file mode 100644 index 00000000000..2c06d2094ba --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 @@ -0,0 +1,31 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test + +! gimple_omp_for collapse should be extended to cover all loops affected by the transformations (i.e. 4) +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3, 3\)@0 tile sizes\(3, 3\)@2\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } + +! 
The loops should be lowered after the tiling transformations +! { dg-final { scan-tree-dump-not {\#pragma omp} "omp_transform_loops" } } + +! Third level is tiled first by the inner construct. The resulting floor loop is tiled by the outer construct. +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.1} 2 "omp_transform_loops" } } + +! All other levels are tiled once +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.2} 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.3} 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.4} 1 "omp_transform_loops" } } + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 new file mode 100644 index 00000000000..355d977fe35 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 @@ -0,0 +1,30 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp parallel do + !$omp tile sizes(3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! The outer gimple_omp_for should not cover the loop with the tile transformation +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3\)@0\n +for \(i = 0; i <= 100; i = i \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3\)@0\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } + + +! After transformations, the outer loop should be a floor loop created +! by the tiling and the outer construct type and non-transformation +! clauses should be unaffected by the tiling +! 
{ dg-final { scan-tree-dump {\#pragma omp for nowait\n +for \(.omp_tile_index.\d = 0; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ 3\)} "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp} 2 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp parallel} 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp for} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 new file mode 100644 index 00000000000..0c83da660f5 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 @@ -0,0 +1,26 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp tile sizes(3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! There should be separate gimple_omp_for constructs for the tile constructs because the tiling depth +! of the outer construct does not reach the level of the inner construct +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3\)@0\n +for \(i = 0; i <= 100; i = i \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3\)@0\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } + + +! The loops should be lowered after the tiling transformations +! { dg-final { scan-tree-dump-not {\#pragma omp} "omp_transform_loops" } } +! 
{ dg-final { scan-tree-dump-times {if \(.omp_tile_index} 3 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 new file mode 100644 index 00000000000..670e14caa12 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 @@ -0,0 +1,123 @@ +subroutine test1a + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test1a + +subroutine test2a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test2a + +subroutine test3a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3a + +subroutine test4a + !$omp parallel do + !$omp tile sizes(3,3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) ! 
{ dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test4a + +subroutine test1b + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test1b + +subroutine test2b + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test2b + +subroutine test3b + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3b + +subroutine test4b + !$omp parallel do + !$omp tile sizes(3,3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test4b diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 new file mode 100644 index 00000000000..169c2b10e54 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 @@ -0,0 +1,71 @@ +subroutine test1 + !$omp tile sizes(1) + do i = 1,100 + do j = 1,i + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test1 + +subroutine test2 + !$omp tile sizes(1,2) ! 
{ dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,i + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test2 + +subroutine test3 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,i + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test3 + +subroutine test4 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,100 + do k = 1,i + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test4 + +subroutine test5 + !$omp tile sizes(1,2) + do i = 1,100 + do j = 1,100 + do k = 1,j + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test5 + +subroutine test6 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,100 + do k = 1,j + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test6 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 new file mode 100644 index 00000000000..f0f3e046511 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 @@ -0,0 +1,12 @@ +subroutine test + !$omp tile sizes(1,2,1) ! 
{ dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,100 + do k = 1,i + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 new file mode 100644 index 00000000000..27920701b36 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 @@ -0,0 +1,57 @@ +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do collapse(2) + !$omp tile sizes (8,8) + !$omp unroll partial(2) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} "" { target *-*-*} .-1 } + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + + !$omp tile sizes (8,8) + !$omp unroll partial(2) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + + !$omp tile sizes (8) + !$omp unroll partial(1) + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + + !$omp parallel do collapse(2) ! { dg-error {missing canonical loop nest after \!\$OMP PARALLEL DO at \(1\)} } + !$omp tile sizes (8,8) ! { dg-error {missing canonical loop nest after \!\$OMP TILE at \(1\)} } + !$omp unroll full ! 
{ dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end function mult diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 new file mode 100644 index 00000000000..4cfac4c5e26 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 @@ -0,0 +1,277 @@ +subroutine test1 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end do +end subroutine test3 + +subroutine test4 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end do +end subroutine test4 + +subroutine test5 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test5 + +subroutine test6 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test6 + +subroutine test7 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test7 + +subroutine test8 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll +end subroutine test8 + +subroutine test9 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test9 + +subroutine test10 + implicit none + integer :: i + + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test10 + +subroutine test11 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test11 + +subroutine test12 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + call dummy(i) ! { dg-error {Unexpected CALL statement at \(1\)} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test12 + +subroutine test13 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + call dummy(i) + end do +end subroutine test13 + +subroutine test14 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test14 + +subroutine test15 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test15 + +subroutine test16 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test16 + +subroutine test17 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(2) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test17 + +subroutine test18 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(0) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test18 + +subroutine test19 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(-10) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test19 + +subroutine test20 + implicit none + integer :: i + + !$omp do + !$omp unroll partial + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test20 + +subroutine test21 + implicit none + integer :: i + + !$omp unroll partial ! { dg-error {\!\$OMP UNROLL invalid around DO CONCURRENT loop at \(1\)} } + do concurrent (i = 1:100) + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do + !$omp end unroll +end subroutine test21 + +subroutine test22 + implicit none + integer :: i + + !$omp do + !$omp unroll partial + do concurrent (i = 1:100) ! { dg-error {\!\$OMP DO cannot be a DO CONCURRENT loop at \(1\)} } + call dummy(i) ! 
{ dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do + !$omp end unroll +end subroutine test22 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 new file mode 100644 index 00000000000..2c4a45d3054 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 @@ -0,0 +1,7 @@ +subroutine test(i) + ! TODO The checking that produces this message comes too late. Not important, but would be nice to have. + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} "" { xfail *-*-* } } + call dummy0 ! { dg-error {Unexpected CALL statement at \(1\)} } +end subroutine test ! { dg-error {Unexpected END statement at \(1\)} } + +! { dg-error "Unexpected end of file" "" { target "*-*-*" } 0 } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 new file mode 100644 index 00000000000..3f0d5981e9b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 @@ -0,0 +1,75 @@ +subroutine test1(i) + implicit none + integer :: i + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +subroutine test2(i) + implicit none + integer :: i + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test2 + +subroutine test3(i) + implicit none + integer :: i + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll full + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test3 + +subroutine test4(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,10 + call dummy(i) + end do +end subroutine test4 + +subroutine test5(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test5 + +subroutine test6(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test6 + +subroutine test7(i) + implicit none + integer :: i + !$omp loop ! { dg-error {missing canonical loop nest after \!\$OMP LOOP at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test7 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 new file mode 100644 index 00000000000..0d8f3f5a2c0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 @@ -0,0 +1,29 @@ +subroutine test1 + implicit none + integer :: i + !$omp unroll ! 
{ dg-error {\!\$OMP UNROLL invalid around DO WHILE or DO without loop control at \(1\)} } + do while (i < 10) + call dummy(i) + i = i + 1 + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO WHILE or DO without loop control at \(1\)} } + do + call dummy(i) + i = i + 1 + if (i >= 10) exit + end do +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO CONCURRENT loop at \(1\)} } + do concurrent (i=1:10) + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 new file mode 100644 index 00000000000..8496f9eefe0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + !$omp unroll full + do i = 1,10 + call dummy(i) + end do +end subroutine test2 + +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_none" 1 "original" } } +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_full" 1 "original" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 new file mode 100644 index 00000000000..0d233c9ab6f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 @@ -0,0 +1,17 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops" } +! 
{ dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll full + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +! Loop should be removed with 10 copies of the body remaining + +! { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } +! { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 new file mode 100644 index 00000000000..fcccdb0bcf8 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should not be unrolled, but the internal representation should be lowered + +! { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 new file mode 100644 index 00000000000..ee82b4d150c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll partial ! 
{ dg-optimized {'partial' clause without unrolling factor turned into 'partial\(5\)' clause} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 5 times and the internal representation should be lowered. + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 5 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 new file mode 100644 index 00000000000..237e6b83087 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 @@ -0,0 +1,19 @@ +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll partial ! { dg-optimized {'partial' clause without unrolling factor turned into 'partial\(10\)' clause} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 10 times and the internal representation should be lowered. + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } +! 
{ dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 new file mode 100644 index 00000000000..8feaf7dc4d3 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 @@ -0,0 +1,62 @@ +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i,j + !$omp parallel do + !$omp unroll partial(10) + do i = 1,100 + !$omp parallel do + do j = 1,100 + call dummy(i,j) + end do + end do + + !$omp taskloop + !$omp unroll partial(10) + do i = 1,100 + !$omp parallel do + do j = 1,100 + call dummy(i,j) + end do + end do + +end subroutine test1 + +! For the "parallel do", there should be 11 "omp for" loops, 10 for the inner loop, 1 for outer, +! for the "taskloop", there should be 10 "omp for" loops for the unrolled loop +! { dg-final { scan-tree-dump-times {#pragma omp for} 21 "omp_transform_loops" } } +! ... and two outer taskloops plus the one taskloops +! { dg-final { scan-tree-dump-times {#pragma omp taskloop} 3 "omp_transform_loops" } } + + +subroutine test2 + implicit none + integer :: i,j + do i = 1,100 + !$omp teams distribute + !$omp unroll partial(10) + do j = 1,100 + call dummy(i,j) + end do + end do + + do i = 1,100 + !$omp target teams distribute + !$omp unroll partial(10) + do j = 1,100 + call dummy(i,j) + end do + end do +end subroutine test2 + +! { dg-final { scan-tree-dump-times {#pragma omp distribute} 2 "omp_transform_loops" } } + +! After unrolling there should be 10 copies of each loop body for each loop-nest +! { dg-final { scan-tree-dump-times "dummy" 40 "omp_transform_loops" } } + +! 
{ dg-final { scan-tree-dump-not {#pragma omp loop_transform} "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(10\)} 1 "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp distribute private\(j\) unroll_partial\(10\)} 2 "original" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 new file mode 100644 index 00000000000..dab3f0fb5cf --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp parallel do collapse(1) + !$omp unroll partial(4) ! { dg-optimized {replaced consecutive 'omp unroll' directives by 'omp unroll partial\(24\)'} } + !$omp unroll partial(3) + !$omp unroll partial(2) + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 1 * 2 * 3 * 4 = 24 times + +! { dg-final { scan-tree-dump {#pragma omp for nowait collapse\(1\) unroll_partial\(4\).0 unroll_partial\(3\).0 unroll_partial\(2\).0 unroll_partial\(1\)} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 24 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {#pragma omp for} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 new file mode 100644 index 00000000000..91e13ff1b37 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! 
{ dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll full ! { dg-optimized {removed useless 'omp unroll partial' directives preceding 'omp unroll full'} } + !$omp unroll partial(3) + !$omp unroll partial(2) + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_full.0 unroll_partial\(3\).0 unroll_partial\(2\).0 unroll_partial\(1\).0} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp unroll" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 100 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 new file mode 100644 index 00000000000..efcc691185d --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 @@ -0,0 +1,57 @@ +subroutine test1a + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test1a + +subroutine test1b + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test1b + +subroutine test2a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2a + +subroutine test2b + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) ! 
{ dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2b diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 new file mode 100644 index 00000000000..079c0fdd75b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 @@ -0,0 +1,20 @@ +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do i = 1,5 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll + do i = 1,6 + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 new file mode 100644 index 00000000000..4893ba46e4e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 @@ -0,0 +1,21 @@ +! { dg-additional-options "--param=omp-unroll-full-max-iterations=20" } +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! 
{ dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do i = 1,20 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll + do i = 1,21 + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 new file mode 100644 index 00000000000..60f25d3abe6 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 @@ -0,0 +1,23 @@ +! { dg-additional-options "--param=omp-unroll-full-max-iterations=10" } +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {added 'partial\(10\)' clause to 'omp unroll' directive} } + do i = 1,20 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll ! { dg-optimized {added 'partial\(10\)' clause to 'omp unroll' directive} } + do i = 1,21 + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 new file mode 100644 index 00000000000..3da99158cc0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 @@ -0,0 +1,31 @@ +subroutine test + implicit none + + integer :: i, j, k + !$omp target parallel do collapse(2) ! 
{ dg-error {invalid OpenMP non-rectangular loop step; '\(2 - 1\) \* 1' is not a multiple of loop 2 step '5'} } + do i = -300, 100 + !$omp unroll partial + do j = i,i*2 + call dummy (i) + end do + end do + + !$omp target parallel do collapse(3) ! { dg-error {invalid OpenMP non-rectangular loop step; '\(2 - 1\) \* 1' is not a multiple of loop 3 step '5'} } + do i = -300, 100 + do j = 1,10 + !$omp unroll partial + do k = j,j*2 + 1 + call dummy (i) + end do + end do + end do + + !$omp unroll full + do i = -3, 5 + do j = 1,10 + do k = j,j*2 + 1 + call dummy (i) + end do + end do + end do +end subroutine diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 new file mode 100644 index 00000000000..f22debbb78f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 @@ -0,0 +1,244 @@ +! { dg-options "-fno-openmp -fopenmp-simd" } + +subroutine test1 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end do +end subroutine test3 + +subroutine test4 + implicit none + integer :: i + + !$omp simd ! 
{ dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end do +end subroutine test4 + +subroutine test5 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test5 + +subroutine test6 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test6 + +subroutine test7 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll +end subroutine test7 + +subroutine test8 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test8 + +subroutine test9 + implicit none + integer :: i + + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test9 + +subroutine test10 + implicit none + integer :: i,j + + !$omp simd ! 
{ dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test10 + +subroutine test11 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + call dummy(i) ! { dg-error {Unexpected CALL statement at \(1\)} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test11 + +subroutine test12 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + call dummy(i) + end do +end subroutine test12 + +subroutine test13 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test13 + +subroutine test14 + implicit none + integer :: i + + !$omp simd ! 
{ dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test14 + +subroutine test15 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test15 + +subroutine test16 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(2) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test16 + +subroutine test17 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(0) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test17 + +subroutine test18 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(-10) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test18 + +subroutine test19 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test19 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 new file mode 100644 index 00000000000..faaa37c5d7e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 @@ -0,0 +1,57 @@ +! { dg-do run } +! { dg-options "-O2 -fopenmp-simd" } +! { dg-additional-options "-fdump-tree-original" } +! 
{ dg-additional-options "-fdump-tree-omp_transform_loops" } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + !$omp simd + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + !$omp simd + !$omp unroll partial(2) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program + +! { dg-final { scan-tree-dump {omp loop_transform} "original" } } +! { dg-final { scan-tree-dump-not {omp loop_transform} "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 new file mode 100644 index 00000000000..20617e25105 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 @@ -0,0 +1,37 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do + !$omp unroll partial(1) + !$omp tile sizes (8,8) + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end function mult + +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(1\)@0 tile sizes\(8, 8\)@0} 1 "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } + +! 
Tiling adds two floor and two tile loops. + +! Number of conditional statements after tiling: +! 5 +! = 2 (lowering of 2 tile loops) +! + 1 (partial tile handling in 2 tile loops) +! + 1 (lowering of non-associated floor loop) + +! The unrolling with unroll factor 1 currently gets executed (TODO could/should be skipped?) + +! { dg-final { scan-tree-dump-times {if \([A-Za-z0-9_.]+ < } 5 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 new file mode 100644 index 00000000000..c1e7f356a87 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 @@ -0,0 +1,41 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + c = 0 + + !$omp target + !$omp parallel do + !$omp unroll partial(2) + !$omp tile sizes (8,8,4) + do i = 1,m + do j = 1,n + do k = 1, n + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + !$omp end target +end function mult + +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(2\)@0 tile sizes\(8, 8, 4\)@0} 1 "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } + +! Check the number of loops + +! Tiling adds three tile and three floor loops. +! The outermost floor loop is associated with the "!$omp parallel do" +! and hence it isn't lowered in the transformation pass. +! Number of conditional statements after tiling: +! 8 +! = 2 (inner floor loop lowering) +! + 3 (partial tile handling in 3 tile loops) +! + 3 (lowering of 3 tile loops) +! +! Unrolling creates 2 copies of the tiled loop nest. + +! 
{ dg-final { scan-tree-dump-times {if \([A-Za-z0-9_.]+ < } 16 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 new file mode 100644 index 00000000000..bc7a890df17 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 @@ -0,0 +1,25 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do collapse(2) + !$omp tile sizes (8,8) + do i = 1,m + do j = 1,n + inner = 0 + !$omp unroll partial(10) + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end function mult + +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_partial" 1 "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/pure-1.f90 b/gcc/testsuite/gfortran.dg/gomp/pure-1.f90 index 598e455d2e9..eadf34a022a 100644 --- a/gcc/testsuite/gfortran.dg/gomp/pure-1.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/pure-1.f90 @@ -86,3 +86,29 @@ pure integer function func_simd(n) end do func_simd = r end + +!pure integer function func_unroll(n) +integer function func_unroll(n) + implicit none + integer, value :: n + integer :: j, r + r = 0 + !$omp unroll partial(2) + do j = 1, n + r = r + j + end do + func_unroll = r +end + +!pure integer function func_tile(n) +integer function func_tile(n) + implicit none + integer, value :: n + integer :: j, r + r = 0 + !$omp tile sizes(2) + do j = 1, n + r = r + j + end do + func_tile = r +end diff --git a/gcc/testsuite/gfortran.dg/gomp/pure-2.f90 b/gcc/testsuite/gfortran.dg/gomp/pure-2.f90 index 1e3cf8c9416..35503c6a284 100644 --- 
a/gcc/testsuite/gfortran.dg/gomp/pure-2.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/pure-2.f90 @@ -46,28 +46,3 @@ logical function func_reverse(n) end do end -!pure integer function func_unroll(n) -integer function func_unroll(n) - implicit none - integer, value :: n - integer :: j, r - r = 0 - !$omp unroll partial(2) ! { dg-error "Unclassifiable OpenMP directive" } - do j = 1, n - r = r + j - end do - func_unroll = r -end - -!pure integer function func_tile(n) -integer function func_tile(n) - implicit none - integer, value :: n - integer :: j, r - r = 0 - !$omp tile sizes(2) ! { dg-error "Unclassifiable OpenMP directive" } - do j = 1, n - r = r + j - end do - func_tile = r -end diff --git a/libgomp/testsuite/libgomp.fortran/imperfect-transform-1.f90 b/libgomp/testsuite/libgomp.fortran/imperfect-transform-1.f90 new file mode 100644 index 00000000000..aa956707414 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/imperfect-transform-1.f90 @@ -0,0 +1,70 @@ +! { dg-do run } + +! Like imperfect1.f90, but also includes loop transforms. + +program foo + integer, save :: f1count(3), f2count(3) + + f1count(1) = 0 + f1count(2) = 0 + f1count(3) = 0 + f2count(1) = 0 + f2count(2) = 0 + f2count(3) = 0 + + call s1 (3, 4, 5) + + ! All intervening code at the same depth must be executed the same + ! number of times. + if (f1count(1) /= f2count(1)) error stop 101 + if (f1count(2) /= f2count(2)) error stop 102 + if (f1count(3) /= f2count(3)) error stop 103 + + ! Intervening code must be executed at least as many times as the loop + ! that encloses it. + if (f1count(1) < 3) error stop 111 + if (f1count(2) < 3 * 4) error stop 112 + + ! Intervening code must not be executed more times than the number + ! of logical iterations. + if (f1count(1) > 3 * 4 * 5) error stop 121 + if (f1count(2) > 3 * 4 * 5) error stop 122 + + ! Check that the innermost loop body is executed exactly the number + ! of logical iterations expected. 
+ if (f1count(3) /= 3 * 4 * 5) error stop 131 + +contains + +subroutine f1 (depth, iter) + integer :: depth, iter + f1count(depth) = f1count(depth) + 1 +end subroutine + +subroutine f2 (depth, iter) + integer :: depth, iter + f2count(depth) = f2count(depth) + 1 +end subroutine + +subroutine s1 (a1, a2, a3) + integer :: a1, a2, a3 + integer :: i, j, k + + !$omp do collapse(3) + do i = 1, a1 + call f1 (1, i) + do j = 1, a2 + call f1 (2, j) + !$omp unroll partial + do k = 1, a3 + call f1 (3, k) + call f2 (3, k) + end do + call f2 (2, j) + end do + call f2 (1, i) + end do + +end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.fortran/imperfect-transform-2.f90 b/libgomp/testsuite/libgomp.fortran/imperfect-transform-2.f90 new file mode 100644 index 00000000000..be199ab9218 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/imperfect-transform-2.f90 @@ -0,0 +1,70 @@ +! { dg-do run } + +! Like imperfect1.f90, but also includes loop transforms. + +program foo + integer, save :: f1count(3), f2count(3) + + f1count(1) = 0 + f1count(2) = 0 + f1count(3) = 0 + f2count(1) = 0 + f2count(2) = 0 + f2count(3) = 0 + + call s1 (3, 4, 5) + + ! All intervening code at the same depth must be executed the same + ! number of times. + if (f1count(1) /= f2count(1)) error stop 101 + if (f1count(2) /= f2count(2)) error stop 102 + if (f1count(3) /= f2count(3)) error stop 103 + + ! Intervening code must be executed at least as many times as the loop + ! that encloses it. + if (f1count(1) < 3) error stop 111 + if (f1count(2) < 3 * 4) error stop 112 + + ! Intervening code must not be executed more times than the number + ! of logical iterations. + if (f1count(1) > 3 * 4 * 5) error stop 121 + if (f1count(2) > 3 * 4 * 5) error stop 122 + + ! Check that the innermost loop body is executed exactly the number + ! of logical iterations expected. 
+ if (f1count(3) /= 3 * 4 * 5) error stop 131 + +contains + +subroutine f1 (depth, iter) + integer :: depth, iter + f1count(depth) = f1count(depth) + 1 +end subroutine + +subroutine f2 (depth, iter) + integer :: depth, iter + f2count(depth) = f2count(depth) + 1 +end subroutine + +subroutine s1 (a1, a2, a3) + integer :: a1, a2, a3 + integer :: i, j, k + + !$omp do collapse(3) + do i = 1, a1 + call f1 (1, i) + do j = 1, a2 + call f1 (2, j) + !$omp tile sizes(5) + do k = 1, a3 + call f1 (3, k) + call f2 (3, k) + end do + call f2 (2, j) + end do + call f2 (1, i) + end do + +end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 new file mode 100644 index 00000000000..1db97feb34d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 @@ -0,0 +1,77 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + !$omp target parallel do collapse(2) private(inner) map(to:a,b) map(from:c) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8) + do j = 1,n + !$omp unroll partial(10) + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + if (k == n) then + c(j, i) = inner + endif + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = 
mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/nested-fn.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/nested-fn.f90 new file mode 100644 index 00000000000..dc70c9228fd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/nested-fn.f90 @@ -0,0 +1,19 @@ +! { dg-do run } + +program foo + integer :: count +contains + +subroutine s1 () + integer :: i, count + + count = 0 + + !$omp target parallel do + !$omp unroll partial + do i = 1, 100 + end do + +end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 new file mode 100644 index 00000000000..bb48c31224e --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 @@ -0,0 +1,71 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call 
print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 new file mode 100644 index 00000000000..a7cb5e7635d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 @@ -0,0 +1,117 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-do run } + +module test_functions + contains + integer function compute_sum1() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp tile sizes(2) + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp tile sizes(16) + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum3() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp tile sizes(100) + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum4() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + !$omp tile sizes(6,10) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum5() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp parallel do collapse(2) reduction(+:sum) + !$omp tile sizes(6,10) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 
16) then + call abort + end if + + result = compute_sum3 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum4 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum5 () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 new file mode 100644 index 00000000000..2f2f014ead9 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 @@ -0,0 +1,112 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(10) + !$omp tile sizes(1, 3) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult + + function mult2 (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes(1,2) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult2 + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = 
merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + ! c = mult (a, b) + + ! call print_matrix (a) + ! call print_matrix (b) + ! call print_matrix (c) + + ! do i = 1,n + ! do j = 1,m + ! if (b(i,j) .ne. c(i,j)) call abort () + ! end do + ! end do + + + c = mult2 (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 new file mode 100644 index 00000000000..1b5b623b838 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 @@ -0,0 +1,71 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function copy (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes (1,5) + do i = 1,10 + do j = 1,n + c(j,i) = c(j,i) + a(j, i) + end do + end do + end function copy + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = 1 + end do + end do + + c = copy (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (c(i,j) .ne. 
a(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 new file mode 100644 index 00000000000..518968f1335 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 @@ -0,0 +1,77 @@ +module matrix + implicit none + integer :: n = 4 + integer :: m = 4 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + ! omp do private(inner) + do i = 1,m + !$omp unroll partial(4) + !$omp tile sizes (5) + do j = 1,n + do k = 1, n + write (*,*) "i", i, "j", j, "k", k + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + if (k == n) then + c(j, i) = inner + endif + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. 
c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 new file mode 100644 index 00000000000..807135df5e8 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 @@ -0,0 +1,75 @@ +module matrix + implicit none + integer :: n = 4 + integer :: m = 4 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,m + do j = 1,n + c(j, i) = 0 + end do + end do + + !$omp parallel do + do i = 1,m + !$omp tile sizes (5,2) + do j = 1,n + do k = 1, n + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 new file mode 100644 index 00000000000..b91ea275577 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 @@ -0,0 +1,54 @@ +! { dg-additional-options "-fdump-tree-original" } +! 
{ dg-do run } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp parallel do reduction(+:sum) + !$omp unroll partial(2) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 new file mode 100644 index 00000000000..2ce44d4d044 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 @@ -0,0 +1,88 @@ +! { dg-additional-options "-fdump-tree-original -g" } +! 
{ dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 1,10,3 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum2() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = -20,1,3 + sum = sum + 1 + end do + end function compute_sum2 + + + integer function compute_sum3() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 30,1,-3 + sum = sum + 1 + end do + end function compute_sum3 + + + integer function compute_sum4() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 50,-60,-10 + sum = sum + 1 + end do + end function compute_sum4 + +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 4) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 8) then + call abort + end if + + result = compute_sum3 () + write (*,*) result + if (result .ne. 10) then + call abort + end if + + result = compute_sum4 () + write (*,*) result + if (result .ne. 12) then + call abort + end if + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 new file mode 100644 index 00000000000..55e5cc568a5 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 @@ -0,0 +1,59 @@ +! Test lowering of the internal representation of "omp unroll" loops +! which are not unrolled. + +! { dg-additional-options "-O0" } +! { dg-additional-options "--param=omp-unroll-full-max-iterations=0" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! 
{ dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll + do i = 0,50 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum3 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + do i = 0,n,step + sum = sum + 1 + end do + end function compute_sum3 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + if (result .ne. 51) then + call abort + end if + + result = compute_sum3 (1, 100) + if (result .ne. 101) then + call abort + end if + + result = compute_sum3 (2, 100) + if (result .ne. 51) then + call abort + end if + + result = compute_sum3 (-2, -100) + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 new file mode 100644 index 00000000000..52a214f1049 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 @@ -0,0 +1,72 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll partial(2) + do i = 1,50 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum3 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum3 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum3 (1, 100) + write (*,*) result + if (result .ne. 
100) then + call abort + end if + + result = compute_sum3 (1, 9) + write (*,*) result + if (result .ne. 9) then + call abort + end if + + result = compute_sum3 (2, 96) + write (*,*) result + if (result .ne. 48) then + call abort + end if + + result = compute_sum3 (-2, -98) + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum3 (-2, -100) + write (*,*) result + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 new file mode 100644 index 00000000000..d6a4e739675 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 @@ -0,0 +1,55 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum4 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp do + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum4 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum4 (1, 100) + write (*,*) result + if (result .ne. 100) then + call abort + end if + + result = compute_sum4 (1, 9) + write (*,*) result + if (result .ne. 9) then + call abort + end if + + result = compute_sum4 (2, 96) + write (*,*) result + if (result .ne. 48) then + call abort + end if + + result = compute_sum4 (-2, -98) + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum4 (-2, -100) + write (*,*) result + if (result .ne. 
51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 new file mode 100644 index 00000000000..b953ce31b5b --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 @@ -0,0 +1,105 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum4 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum4 + + integer function compute_sum5 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + !$omp unroll partial(5) ! { dg-optimized {replaced consecutive 'omp unroll' directives by 'omp unroll partial\(50\)'} } + !$omp unroll partial(10) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum5 + + integer function compute_sum6 (step,n) result(sum) + implicit none + integer :: i, j, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + do i = 1,n,step + !$omp unroll full ! { dg-optimized {removed useless 'omp unroll partial' directives preceding 'omp unroll full'} } + !$omp unroll partial(10) + do j = 1, 1000 + sum = sum + 1 + end do + end do + end function compute_sum6 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum4 (1, 100) + if (result .ne. 100) then + call abort + end if + + result = compute_sum4 (1, 9) + if (result .ne. 9) then + call abort + end if + + result = compute_sum4 (2, 96) + if (result .ne. 48) then + call abort + end if + + result = compute_sum4 (-2, -98) + if (result .ne. 
50) then + call abort + end if + + result = compute_sum4 (-2, -100) + if (result .ne. 51) then + call abort + end if + + result = compute_sum5 (1, 100) + if (result .ne. 100) then + call abort + end if + + result = compute_sum5 (1, 9) + if (result .ne. 9) then + call abort + end if + + result = compute_sum5 (2, 96) + if (result .ne. 48) then + call abort + end if + + result = compute_sum5 (-2, -98) + if (result .ne. 50) then + call abort + end if + + result = compute_sum5 (-2, -100) + if (result .ne. 51) then + call abort + end if + + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 new file mode 100644 index 00000000000..d25f18002ae --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 @@ -0,0 +1,198 @@ +! { dg-additional-options "-O0 -cpp" } +! { dg-do run } + +#ifndef UNROLL_FACTOR +#define UNROLL_FACTOR 1 +#endif +module test_functions +contains + subroutine copy (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp parallel do + !$omp unroll partial(UNROLL_FACTOR) + do i = 1, 100 + array1(i) = array2(i) + end do + end subroutine + + subroutine copy2 (array1, array2) + implicit none + + integer :: array1(100) + integer :: array2(100) + integer :: i + + !$omp parallel do + !$omp unroll partial(UNROLL_FACTOR) + do i = 0,99 + array1(i+1) = array2(i+1) + end do + end subroutine copy2 + + subroutine copy3 (array1, array2) + implicit none + + integer :: array1(100) + integer :: array2(100) + integer :: i + + !$omp parallel do lastprivate(i) + !$omp unroll partial(UNROLL_FACTOR) + do i = -49,50 + if (i < 0) then + array1((-1)*i) = array2((-1)*i) + else + array1(50+i) = array2(50+i) + endif + end do + end subroutine copy3 + + subroutine copy4 (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) 
+ do i = 2, 200, 2 + array1(i/2) = array2(i/2) + end do + end subroutine copy4 + + subroutine copy5 (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = 200, 2, -2 + array1(i/2) = array2(i/2) + end do + end subroutine + + subroutine copy6 (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = lower, upper, step + array1 (i) = array2(i) + end do + end subroutine + + subroutine prepare (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + + array1 = 2 + array2 = 0 + end subroutine + + subroutine check_equal (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + do i=1,100 + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do + end subroutine + + subroutine check_equal_at_steps (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i + + do i=lower, upper, step + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do + end subroutine + + subroutine check_unchanged_at_non_steps (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i, j + + do i=lower, upper,step + do j=i,i+step-1 + if (array2(j) /= 0) then + write (*,*) i + call abort + end if + end do + end do + end subroutine +end module test_functions + +program test + use test_functions + implicit none + + integer :: array1(100), array2(100) + + call prepare (array1, array2) + call copy (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy2 (array1, array2) + call check_equal (array1, array2) + + call prepare 
(array1, array2) + call copy3 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy4 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy5 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy6 (array1, array2, 1, 100, 5) + call check_equal_at_steps (array1, array2, 1, 100, 5) + call check_unchanged_at_non_steps (array1, array2, 1, 100, 5) + + call prepare (array1, array2) + call copy6 (array1, array2, 1, 50, 5) + call check_equal_at_steps (array1, array2, 1, 50, 5) + call check_unchanged_at_non_steps (array1, array2, 1, 50, 5) + + call prepare (array1, array2) + call copy6 (array1, array2, 3, 18, 7) + call check_equal_at_steps (array1, array2, 3 , 18, 7) + call check_unchanged_at_non_steps (array1, array2, 3, 18, 7) +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 new file mode 100644 index 00000000000..02328464c0d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that divides the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 5 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 new file mode 100644 index 00000000000..60866ef33fd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that does not divide the number of iterations +! of the loops in the test implementation. 
+#define UNROLL_FACTOR 3 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 new file mode 100644 index 00000000000..6d8a2ef7bc0 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that is larger than the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 113 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 new file mode 100644 index 00000000000..40506025aa3 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 @@ -0,0 +1,38 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + subroutine copy (array1, array2, step, n) + implicit none + + integer :: array1(n) + integer :: array2(n) + integer :: i, step, n + + call omp_set_num_threads (4) + !$omp parallel do shared(array1) shared(array2) schedule(static, 4) + !$omp unroll partial(2) + do i = 1,n + array1(i) = array2(i) + end do + end subroutine +end module test_functions + +program test + use test_functions + implicit none + + integer :: array1(100), array2(100) + integer :: i + + array1 = 2 + call copy(array1, array2, 1, 100) + do i=1,100 + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 new file mode 100644 index 00000000000..7a43458f0dd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 @@ -0,0 +1,34 @@ +! 
{ dg-options "-fno-openmp -fopenmp-simd" } +! { dg-additional-options "-fdump-tree-original" } +! { dg-do run } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp simd reduction(+:sum) + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function compute_sum +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 new file mode 100644 index 00000000000..2f2f014ead9 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 @@ -0,0 +1,112 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(10) + !$omp tile sizes(1, 3) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult + + function mult2 (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes(1,2) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult2 + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + 
write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + ! c = mult (a, b) + + ! call print_matrix (a) + ! call print_matrix (b) + ! call print_matrix (c) + + ! do i = 1,n + ! do j = 1,m + ! if (b(i,j) .ne. c(i,j)) call abort () + ! end do + ! end do + + + c = mult2 (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 new file mode 100644 index 00000000000..1b5b623b838 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 @@ -0,0 +1,71 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function copy (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes (1,5) + do i = 1,10 + do j = 1,n + c(j,i) = c(j,i) + a(j, i) + end do + end do + end function copy + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = 1 + end do + end do + + c = copy (a, b) + + call print_matrix (a) + call print_matrix (b) + call 
print_matrix (c) + + do i = 1,n + do j = 1,m + if (c(i,j) .ne. a(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-1.f90 b/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-1.f90 new file mode 100644 index 00000000000..34b6e075e05 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-1.f90 @@ -0,0 +1,73 @@ +! { dg-do run } + +! Like imperfect-transform.f90, but enables offloading. + +program foo + integer, save :: f1count(3), f2count(3) + !$omp declare target enter (f1count, f2count) + + f1count(1) = 0 + f1count(2) = 0 + f1count(3) = 0 + f2count(1) = 0 + f2count(2) = 0 + f2count(3) = 0 + + call s1 (3, 4, 5) + + ! All intervening code at the same depth must be executed the same + ! number of times. + if (f1count(1) /= f2count(1)) error stop 101 + if (f1count(2) /= f2count(2)) error stop 102 + if (f1count(3) /= f2count(3)) error stop 103 + + ! Intervening code must be executed at least as many times as the loop + ! that encloses it. + if (f1count(1) < 3) error stop 111 + if (f1count(2) < 3 * 4) error stop 112 + + ! Intervening code must not be executed more times than the number + ! of logical iterations. + if (f1count(1) > 3 * 4 * 5) error stop 121 + if (f1count(2) > 3 * 4 * 5) error stop 122 + + ! Check that the innermost loop body is executed exactly the number + ! of logical iterations expected. 
+ if (f1count(3) /= 3 * 4 * 5) error stop 131 + +contains + +subroutine f1 (depth, iter) + integer :: depth, iter + !$omp atomic + f1count(depth) = f1count(depth) + 1 +end subroutine + +subroutine f2 (depth, iter) + integer :: depth, iter + !$omp atomic + f2count(depth) = f2count(depth) + 1 +end subroutine + +subroutine s1 (a1, a2, a3) + integer :: a1, a2, a3 + integer :: i, j, k + + !$omp target parallel do collapse(3) map(always, tofrom:f1count, f2count) + do i = 1, a1 + call f1 (1, i) + do j = 1, a2 + call f1 (2, j) + !$omp unroll partial + do k = 1, a3 + call f1 (3, k) + call f2 (3, k) + end do + call f2 (2, j) + end do + call f2 (1, i) + end do + +end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-2.f90 b/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-2.f90 new file mode 100644 index 00000000000..188cca1e5b4 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/target-imperfect-transform-2.f90 @@ -0,0 +1,73 @@ +! { dg-do run } + +! Like imperfect-transform.f90, but enables offloading. + +program foo + integer, save :: f1count(3), f2count(3) + !$omp declare target enter (f1count, f2count) + + f1count(1) = 0 + f1count(2) = 0 + f1count(3) = 0 + f2count(1) = 0 + f2count(2) = 0 + f2count(3) = 0 + + call s1 (3, 4, 5) + + ! All intervening code at the same depth must be executed the same + ! number of times. + if (f1count(1) /= f2count(1)) error stop 101 + if (f1count(2) /= f2count(2)) error stop 102 + if (f1count(3) /= f2count(3)) error stop 103 + + ! Intervening code must be executed at least as many times as the loop + ! that encloses it. + if (f1count(1) < 3) error stop 111 + if (f1count(2) < 3 * 4) error stop 112 + + ! Intervening code must not be executed more times than the number + ! of logical iterations. + if (f1count(1) > 3 * 4 * 5) error stop 121 + if (f1count(2) > 3 * 4 * 5) error stop 122 + + ! Check that the innermost loop body is executed exactly the number + ! 
of logical iterations expected. + if (f1count(3) /= 3 * 4 * 5) error stop 131 + +contains + +subroutine f1 (depth, iter) + integer :: depth, iter + !$omp atomic + f1count(depth) = f1count(depth) + 1 +end subroutine + +subroutine f2 (depth, iter) + integer :: depth, iter + !$omp atomic + f2count(depth) = f2count(depth) + 1 +end subroutine + +subroutine s1 (a1, a2, a3) + integer :: a1, a2, a3 + integer :: i, j, k + + !$omp target parallel do collapse(3) map(always, tofrom:f1count, f2count) + do i = 1, a1 + call f1 (1, i) + do j = 1, a2 + call f1 (2, j) + !$omp tile sizes(5) + do k = 1, a3 + call f1 (3, k) + call f2 (3, k) + end do + call f2 (2, j) + end do + call f2 (1, i) + end do + +end subroutine + +end program
From patchwork Sun Oct 1 20:10:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sandra Loosemore
X-Patchwork-Id: 147155
From: Sandra Loosemore
Subject: [WIP 4/4] OpenMP: C and C++ front-end support for loop transforms.
Date: Sun, 1 Oct 2023 14:10:21 -0600
Message-ID: <20231001201021.785572-5-sandra@codesourcery.com>
In-Reply-To: <20231001201021.785572-1-sandra@codesourcery.com>
References: <20231001201021.785572-1-sandra@codesourcery.com>

From: Frederik Harwath

gcc/c-family/ChangeLog: * c-gimplify.cc (c_genericize_control_stmt): Handle
OMP_LOOP_TRANS. * c-omp.cc (c_omp_directives): Uncomment entries for "tile" and "unroll". * c-pragma.cc (omp_pragmas_simd): Add tile and unroll. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TILE and PRAGMA_OMP_UNROLL. Adjust PRAGMA_OMP__LAST__. (enum pragma_clause): Add PRAGMA_OMP_CLAUSE_FULL, PRAGMA_OMP_CLAUSE_PARTIAL, and PRAGMA_OMP_CLAUSE_TILE. gcc/c/ChangeLog: * c-parser.cc (struct omp_for_parse_data): Add clauses field. (c_parser_skip_std_attribute_spec_seq): New. (check_omp_intervening_code): Reject imperfectly-nested loops with TILE directive. (c_parser_compound_statement_nostart): Handle loop transforms. (c_parser_omp_clause_name): Handle "full" and "partial". (check_no_duplicate_clause): Change to return a boolean error value. (c_parser_omp_clause_unroll_full): New. (c_parser_omp_clause_unroll_partial): New. (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FULL and PRAGMA_OMP_CLAUSE_PARTIAL. (c_parser_see_omp_loop_nest): New. (c_parser_omp_loop_nest): Error on standard attributes for consistency with C++. Handle loop transformations. (c_parser_omp_for_loop): Handle loop transformations. (OMP_UNROLL_CLAUSE_MASK): Define. (c_parser_omp_tile_sizes): New. (c_parser_omp_loop_transform_clause): New. (c_parser_omp_nested_loop_transform_clauses): New. (c_parser_omp_tile): New. (c_parser_omp_unroll): New. (c_parser_omp_construct): Handle PRAGMA_OMP_TILE and PRAGMA_OMP_UNROLL. * c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_UNROLL_FULL and OMP_CLAUSE_UNROLL_PARTIAL. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_gimplify_expr): Handle OMP_LOOP_TRANS. (cp_fold_r): Handle OMP_LOOP_TRANS. (cp_genericize_r): Handle OMP_LOOP_TRANS. * parser.cc (check_omp_intervening_code): Reject imperfectly-nested loops with TILE directive. (cp_parser_statement_seq_opt): Handle loop transforms. (cp_parser_omp_clause_name): Handle "full" and "partial". (check_no_duplicate_clause): Change to return a boolean error value. (cp_parser_omp_clause_unroll_full): New.
(cp_parser_omp_clause_unroll_partial): New. (cp_parser_omp_all_clauses): Consume comma even if first. Handle PRAGMA_OMP_CLAUSE_PARTIAL and PRAGMA_OMP_CLAUSE_FULL. (cp_parser_see_omp_loop_nest): New. (cp_parser_omp_loop_nest): Handle standard attribute syntax and loop transforms. (cp_parser_omp_for_loop): Handle loop transforms. (cp_parser_omp_tile_sizes): New. (cp_parser_omp_tile): New. (OMP_UNROLL_CLAUSE_MASK): New. (cp_parser_omp_loop_transform_clause): New. (cp_parser_nested_loop_transform_clauses): New. (cp_parser_omp_unroll): New. (cp_parser_omp_construct): Handle PRAGMA_OMP_TILE and PRAGMA_OMP_UNROLL. (cp_parser_pragma): Handle PRAGMA_OMP_TILE and PRAGMA_OMP_UNROLL. * pt.cc (tsubst_omp_clauses): Handle new loop transform clauses. (tsubst_expr): Handle OMP_LOOP_TRANS. * semantics.cc (finish_omp_clauses): Handle OMP_CLAUSE_TILE, OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_UNROLL_NONE. gcc/testsuite/ChangeLog: * c-c++-common/gomp/imperfect-attributes.c: Adjust for new attribute behavior. * c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c: New. * c-c++-common/gomp/loop-transforms/tile-1.c: New. * c-c++-common/gomp/loop-transforms/tile-2.c: New. * c-c++-common/gomp/loop-transforms/tile-3.c: New. * c-c++-common/gomp/loop-transforms/tile-4.c: New. * c-c++-common/gomp/loop-transforms/tile-5.c: New. * c-c++-common/gomp/loop-transforms/tile-6.c: New. * c-c++-common/gomp/loop-transforms/tile-7.c: New. * c-c++-common/gomp/loop-transforms/tile-8.c: New. * c-c++-common/gomp/loop-transforms/unroll-1.c: New. * c-c++-common/gomp/loop-transforms/unroll-2.c: New. * c-c++-common/gomp/loop-transforms/unroll-3.c: New. * c-c++-common/gomp/loop-transforms/unroll-4.c: New. * c-c++-common/gomp/loop-transforms/unroll-5.c: New. * c-c++-common/gomp/loop-transforms/unroll-6.c: New. * c-c++-common/gomp/loop-transforms/unroll-7.c: New. * c-c++-common/gomp/loop-transforms/unroll-8.c: New. * c-c++-common/gomp/loop-transforms/unroll-inner-1.c: New.
* c-c++-common/gomp/loop-transforms/unroll-inner-2.c: New. * c-c++-common/gomp/loop-transforms/unroll-non-rect-1.c: New. * c-c++-common/gomp/loop-transforms/unroll-non-rect-2.c: New. * c-c++-common/gomp/loop-transforms/unroll-simd-1.c: New. * g++.dg/gomp/attrs-4.C: Adjust expected error message. * g++.dg/gomp/for-1.C: Adjust expected error message. * g++.dg/gomp/loop-transforms/attrs-tile-1.C: New. * g++.dg/gomp/loop-transforms/attrs-tile-2.C: New. * g++.dg/gomp/loop-transforms/attrs-tile-3.C: New. * g++.dg/gomp/loop-transforms/attrs-unroll-1.C: New. * g++.dg/gomp/loop-transforms/attrs-unroll-2.C: New. * g++.dg/gomp/loop-transforms/attrs-unroll-3.C: New. * g++.dg/gomp/loop-transforms/attrs-unroll-inner-1.C: New. * g++.dg/gomp/loop-transforms/attrs-unroll-inner-2.C: New. * g++.dg/gomp/loop-transforms/attrs-unroll-inner-3.C: New. * g++.dg/gomp/loop-transforms/tile-1.h: New. * g++.dg/gomp/loop-transforms/tile-1a.C: New. * g++.dg/gomp/loop-transforms/tile-1b.C: New. * g++.dg/gomp/loop-transforms/unroll-1.C: New. * g++.dg/gomp/loop-transforms/unroll-2.C: New. * g++.dg/gomp/loop-transforms/unroll-3.C: New. * g++.dg/gomp/pr94512.C: Adjust expected error message. * gcc.dg/gomp/for-1.c: Adjust expected error message. * gcc.dg/gomp/for-11.c: Adjust expected error message. libgomp/ChangeLog: * testsuite/libgomp.c++/loop-transforms/matrix-no-directive-unroll-full-1.C: New. * testsuite/libgomp.c++/loop-transforms/tile-2.C: New. * testsuite/libgomp.c++/loop-transforms/tile-3.C: New. * testsuite/libgomp.c++/loop-transforms/unroll-1.C: New. * testsuite/libgomp.c++/loop-transforms/unroll-2.C: New. * testsuite/libgomp.c++/loop-transforms/unroll-full-tile.C: New. * testsuite/libgomp.c-c++-common/imperfect-transform-1.c: New. * testsuite/libgomp.c-c++-common/imperfect-transform-2.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h: New. 
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h: New. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c: New. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c: New. * testsuite/libgomp.c-c++-common/target-imperfect-transform-1.c: New. * testsuite/libgomp.c-c++-common/target-imperfect-transform-2.c: New. 
Co-Authored-By: Sandra Loosemore --- gcc/c-family/c-gimplify.cc | 1 + gcc/c-family/c-omp.cc | 10 +- gcc/c-family/c-pragma.cc | 2 + gcc/c-family/c-pragma.h | 7 +- gcc/c/c-parser.cc | 515 ++++++++++++++++- gcc/c/c-typeck.cc | 8 + gcc/cp/cp-gimplify.cc | 3 + gcc/cp/parser.cc | 529 +++++++++++++++++- gcc/cp/pt.cc | 13 + gcc/cp/semantics.cc | 95 ++++ .../c-c++-common/gomp/imperfect-attributes.c | 18 +- .../loop-transforms/imperfect-loop-nest.c | 11 + .../gomp/loop-transforms/tile-1.c | 160 ++++++ .../gomp/loop-transforms/tile-2.c | 179 ++++++ .../gomp/loop-transforms/tile-3.c | 109 ++++ .../gomp/loop-transforms/tile-4.c | 322 +++++++++++ .../gomp/loop-transforms/tile-5.c | 150 +++++ .../gomp/loop-transforms/tile-6.c | 34 ++ .../gomp/loop-transforms/tile-7.c | 31 + .../gomp/loop-transforms/tile-8.c | 40 ++ .../gomp/loop-transforms/unroll-1.c | 133 +++++ .../gomp/loop-transforms/unroll-2.c | 95 ++++ .../gomp/loop-transforms/unroll-3.c | 18 + .../gomp/loop-transforms/unroll-4.c | 19 + .../gomp/loop-transforms/unroll-5.c | 19 + .../gomp/loop-transforms/unroll-6.c | 20 + .../gomp/loop-transforms/unroll-7.c | 144 +++++ .../gomp/loop-transforms/unroll-8.c | 76 +++ .../gomp/loop-transforms/unroll-inner-1.c | 15 + .../gomp/loop-transforms/unroll-inner-2.c | 29 + .../gomp/loop-transforms/unroll-non-rect-1.c | 37 ++ .../gomp/loop-transforms/unroll-non-rect-2.c | 22 + .../gomp/loop-transforms/unroll-simd-1.c | 84 +++ gcc/testsuite/g++.dg/gomp/attrs-4.C | 2 +- gcc/testsuite/g++.dg/gomp/for-1.C | 2 +- .../gomp/loop-transforms/attrs-tile-1.C | 164 ++++++ .../gomp/loop-transforms/attrs-tile-2.C | 174 ++++++ .../gomp/loop-transforms/attrs-tile-3.C | 111 ++++ .../gomp/loop-transforms/attrs-unroll-1.C | 135 +++++ .../gomp/loop-transforms/attrs-unroll-2.C | 81 +++ .../gomp/loop-transforms/attrs-unroll-3.C | 20 + .../loop-transforms/attrs-unroll-inner-1.C | 15 + .../loop-transforms/attrs-unroll-inner-2.C | 29 + .../loop-transforms/attrs-unroll-inner-3.C | 71 +++ 
.../g++.dg/gomp/loop-transforms/tile-1.h | 27 + .../g++.dg/gomp/loop-transforms/tile-1a.C | 27 + .../g++.dg/gomp/loop-transforms/tile-1b.C | 27 + .../g++.dg/gomp/loop-transforms/unroll-1.C | 42 ++ .../g++.dg/gomp/loop-transforms/unroll-2.C | 47 ++ .../g++.dg/gomp/loop-transforms/unroll-3.C | 37 ++ gcc/testsuite/g++.dg/gomp/pr94512.C | 2 +- gcc/testsuite/gcc.dg/gomp/for-1.c | 2 +- gcc/testsuite/gcc.dg/gomp/for-11.c | 2 +- .../matrix-no-directive-unroll-full-1.C | 13 + .../libgomp.c++/loop-transforms/tile-2.C | 69 +++ .../libgomp.c++/loop-transforms/tile-3.C | 28 + .../libgomp.c++/loop-transforms/unroll-1.C | 73 +++ .../libgomp.c++/loop-transforms/unroll-2.C | 34 ++ .../loop-transforms/unroll-full-tile.C | 84 +++ .../imperfect-transform-1.c | 79 +++ .../imperfect-transform-2.c | 79 +++ .../loop-transforms/matrix-1.h | 70 +++ .../loop-transforms/matrix-constant-iter.h | 71 +++ .../loop-transforms/matrix-helper.h | 19 + .../loop-transforms/matrix-no-directive-1.c | 11 + .../matrix-no-directive-unroll-full-1.c | 13 + .../matrix-omp-distribute-parallel-for-1.c | 8 + .../loop-transforms/matrix-omp-for-1.c | 13 + .../matrix-omp-parallel-for-1.c | 13 + .../matrix-omp-parallel-masked-taskloop-1.c | 8 + ...trix-omp-parallel-masked-taskloop-simd-1.c | 8 + .../matrix-omp-target-parallel-for-1.c | 15 + ...p-target-teams-distribute-parallel-for-1.c | 10 + .../loop-transforms/matrix-omp-taskloop-1.c | 8 + ...trix-omp-teams-distribute-parallel-for-1.c | 8 + .../loop-transforms/matrix-simd-1.c | 8 + .../matrix-transform-variants-1.h | 191 +++++++ .../loop-transforms/unroll-1.c | 78 +++ .../loop-transforms/unroll-non-rect-1.c | 131 +++++ .../target-imperfect-transform-1.c | 82 +++ .../target-imperfect-transform-2.c | 82 +++ 81 files changed, 5218 insertions(+), 53 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-1.c create mode 100644 
gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-2.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-3.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-4.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-5.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-6.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-7.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-8.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-1.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-3.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-4.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-5.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-6.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-7.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-8.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-1.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-2.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-1.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-2.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-simd-1.c create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-2.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-3.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-2.C create mode 100644 
gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-3.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-2.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-3.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1.h create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1a.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1b.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-2.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-3.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/matrix-no-directive-unroll-full-1.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/tile-2.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/tile-3.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/unroll-1.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/unroll-2.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/unroll-full-tile.C create mode 100644 libgomp/testsuite/libgomp.c-c++-common/imperfect-transform-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/imperfect-transform-2.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c create mode 100644 
libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/target-imperfect-transform-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/target-imperfect-transform-2.c diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc index 17b0610a89f..35d3a6e10d6 100644 --- a/gcc/c-family/c-gimplify.cc +++ b/gcc/c-family/c-gimplify.cc @@ -508,6 +508,7 @@ c_genericize_control_stmt (tree *stmt_p, int *walk_subtrees, void *data, case OMP_DISTRIBUTE: case OMP_LOOP: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: genericize_omp_for_stmt (stmt_p, walk_subtrees, data, func, lh); break; diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index 5de3b77c450..980c6fc0867 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ 
-3359,14 +3359,14 @@ const struct c_omp_directive c_omp_directives[] = { C_OMP_DIR_STANDALONE, false }, { "taskyield", nullptr, nullptr, PRAGMA_OMP_TASKYIELD, C_OMP_DIR_STANDALONE, false }, - /* { "tile", nullptr, nullptr, PRAGMA_OMP_TILE, - C_OMP_DIR_CONSTRUCT, false }, */ + { "tile", nullptr, nullptr, PRAGMA_OMP_TILE, + C_OMP_DIR_CONSTRUCT, false }, { "teams", nullptr, nullptr, PRAGMA_OMP_TEAMS, C_OMP_DIR_CONSTRUCT, true }, { "threadprivate", nullptr, nullptr, PRAGMA_OMP_THREADPRIVATE, - C_OMP_DIR_DECLARATIVE, false } - /* { "unroll", nullptr, nullptr, PRAGMA_OMP_UNROLL, - C_OMP_DIR_CONSTRUCT, false }, */ + C_OMP_DIR_DECLARATIVE, false }, + { "unroll", nullptr, nullptr, PRAGMA_OMP_UNROLL, + C_OMP_DIR_CONSTRUCT, false }, }; /* Find (non-combined/composite) OpenMP directive (if any) which starts diff --git a/gcc/c-family/c-pragma.cc b/gcc/c-family/c-pragma.cc index 293311dd4ce..22e5448331e 100644 --- a/gcc/c-family/c-pragma.cc +++ b/gcc/c-family/c-pragma.cc @@ -1550,6 +1550,8 @@ static const struct omp_pragma_def omp_pragmas_simd[] = { { "target", PRAGMA_OMP_TARGET }, { "taskloop", PRAGMA_OMP_TASKLOOP }, { "teams", PRAGMA_OMP_TEAMS }, + { "tile", PRAGMA_OMP_TILE }, + { "unroll", PRAGMA_OMP_UNROLL }, }; void diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h index 603c5151978..703154d224e 100644 --- a/gcc/c-family/c-pragma.h +++ b/gcc/c-family/c-pragma.h @@ -81,8 +81,10 @@ enum pragma_kind { PRAGMA_OMP_TASKYIELD, PRAGMA_OMP_THREADPRIVATE, PRAGMA_OMP_TEAMS, + PRAGMA_OMP_TILE, + PRAGMA_OMP_UNROLL, /* PRAGMA_OMP__LAST_ should be equal to the last PRAGMA_OMP_* code. 
*/ - PRAGMA_OMP__LAST_ = PRAGMA_OMP_TEAMS, + PRAGMA_OMP__LAST_ = PRAGMA_OMP_UNROLL, PRAGMA_GCC_PCH_PREPROCESS, PRAGMA_IVDEP, @@ -119,6 +121,7 @@ enum pragma_omp_clause { PRAGMA_OMP_CLAUSE_FIRSTPRIVATE, PRAGMA_OMP_CLAUSE_FOR, PRAGMA_OMP_CLAUSE_FROM, + PRAGMA_OMP_CLAUSE_FULL, PRAGMA_OMP_CLAUSE_GRAINSIZE, PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR, PRAGMA_OMP_CLAUSE_HINT, @@ -141,6 +144,7 @@ enum pragma_omp_clause { PRAGMA_OMP_CLAUSE_ORDER, PRAGMA_OMP_CLAUSE_ORDERED, PRAGMA_OMP_CLAUSE_PARALLEL, + PRAGMA_OMP_CLAUSE_PARTIAL, PRAGMA_OMP_CLAUSE_PRIORITY, PRAGMA_OMP_CLAUSE_PRIVATE, PRAGMA_OMP_CLAUSE_PROC_BIND, @@ -155,6 +159,7 @@ enum pragma_omp_clause { PRAGMA_OMP_CLAUSE_TASKGROUP, PRAGMA_OMP_CLAUSE_THREAD_LIMIT, PRAGMA_OMP_CLAUSE_THREADS, + PRAGMA_OMP_CLAUSE_TILE, PRAGMA_OMP_CLAUSE_TO, PRAGMA_OMP_CLAUSE_UNIFORM, PRAGMA_OMP_CLAUSE_UNTIED, diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index e6342d2188d..45f59ded1a7 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -1554,6 +1554,7 @@ static tree objc_foreach_break_label, objc_foreach_continue_label; */ struct omp_for_parse_data { enum tree_code code; + tree clauses; tree declv, condv, incrv, initv; tree pre_body; tree bindings; @@ -1662,7 +1663,10 @@ static void c_parser_omp_threadprivate (c_parser *); static void c_parser_omp_barrier (c_parser *); static void c_parser_omp_depobj (c_parser *); static void c_parser_omp_flush (c_parser *); +static bool c_parser_see_omp_loop_nest (c_parser *, enum tree_code, bool); static tree c_parser_omp_loop_nest (c_parser *, bool *); +static int c_parser_omp_nested_loop_transform_clauses (c_parser *, tree &, int, + int, const char *); static tree c_parser_omp_for_loop (location_t, c_parser *, enum tree_code, tree, tree *, bool *); static void c_parser_omp_taskwait (c_parser *); @@ -5720,6 +5724,37 @@ c_parser_nth_token_starts_std_attributes (c_parser *parser, unsigned int n) return token->type == CPP_CLOSE_SQUARE; } +/* Skip standard attribute tokens starting at Nth token (with 1 as 
the + next token), return index of the first token after the standard + attribute tokens, or N on failure. */ + +static size_t +c_parser_skip_std_attribute_spec_seq (c_parser *parser, size_t n) +{ + size_t orig_n = n; + while (true) + { + if (c_parser_peek_nth_token_raw (parser, n)->type == CPP_OPEN_SQUARE + && (c_parser_peek_nth_token_raw (parser, n + 1)->type + == CPP_OPEN_SQUARE)) + { + unsigned int m = n + 2; + if (!c_parser_check_balanced_raw_token_sequence (parser, &m)) + return orig_n; + c_token *token = c_parser_peek_nth_token_raw (parser, m); + if (token->type != CPP_CLOSE_SQUARE) + return orig_n; + token = c_parser_peek_nth_token_raw (parser, m + 1); + if (token->type != CPP_CLOSE_SQUARE) + return orig_n; + n = m + 2; + } + else + break; + } + return n; +} + static tree c_parser_std_attribute_specifier_sequence (c_parser *parser) { @@ -6305,7 +6340,20 @@ check_omp_intervening_code (c_parser *parser) "% % clause"); omp_for_parse_state->perfect_nesting_fail = true; } - /* TODO: Also reject loops with TILE directive. */ + else + { + tree c = omp_find_clause (omp_for_parse_state->clauses, + OMP_CLAUSE_TILE); + if (c && + ((int) tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (c)) + <= omp_for_parse_state->depth)) + { + error_at (omp_for_parse_state->for_loc, + "inner loops must be perfectly nested " + "with % directive"); + omp_for_parse_state->perfect_nesting_fail = true; + } + } if (omp_for_parse_state->perfect_nesting_fail) omp_for_parse_state->fail = true; } @@ -6412,7 +6460,9 @@ c_parser_compound_statement_nostart (c_parser *parser) if (in_omp_loop_block && !last_label) { if (want_nested_loop - && c_parser_next_token_is_keyword (parser, RID_FOR)) + && c_parser_see_omp_loop_nest (parser, + omp_for_parse_state->code, + false)) { /* Found the next nested loop. 
If there were intervening code statements collected before now, wrap them in an @@ -13953,6 +14003,8 @@ c_parser_omp_clause_name (c_parser *parser) result = PRAGMA_OMP_CLAUSE_FIRSTPRIVATE; else if (!strcmp ("from", p)) result = PRAGMA_OMP_CLAUSE_FROM; + else if (!strcmp ("full", p)) + result = PRAGMA_OMP_CLAUSE_FULL; break; case 'g': if (!strcmp ("gang", p)) @@ -14027,6 +14079,8 @@ c_parser_omp_clause_name (c_parser *parser) case 'p': if (!strcmp ("parallel", p)) result = PRAGMA_OMP_CLAUSE_PARALLEL; + else if (!strcmp ("partial", p)) + result = PRAGMA_OMP_CLAUSE_PARTIAL; else if (!strcmp ("present", p)) result = PRAGMA_OACC_CLAUSE_PRESENT; /* As of OpenACC 2.5, these are now aliases of the non-present_or @@ -14121,12 +14175,15 @@ c_parser_omp_clause_name (c_parser *parser) /* Validate that a clause of the given type does not already exist. */ -static void +static bool check_no_duplicate_clause (tree clauses, enum omp_clause_code code, const char *name) { - if (tree c = omp_find_clause (clauses, code)) + tree c = omp_find_clause (clauses, code); + if (c) error_at (OMP_CLAUSE_LOCATION (c), "too many %qs clauses", name); + + return c == NULL_TREE; } /* OpenACC 2.0 @@ -17993,6 +18050,67 @@ c_parser_omp_clause_uniform (c_parser *parser, tree list) return list; } +/* OpenMP 5.1 + full */ + +static tree +c_parser_omp_clause_unroll_full (c_parser *parser, tree list) +{ + if (!check_no_duplicate_clause (list, OMP_CLAUSE_UNROLL_FULL, "full")) + return list; + + location_t loc = c_parser_peek_token (parser)->location; + tree c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_CHAIN (c) = list; + return c; +} + +/* OpenMP 5.1 + partial ( constant-expression ) */ + +static tree +c_parser_omp_clause_unroll_partial (c_parser *parser, tree list) +{ + if (!check_no_duplicate_clause (list, OMP_CLAUSE_UNROLL_PARTIAL, "partial")) + return list; + + tree c, num = error_mark_node; + HOST_WIDE_INT n; + 
location_t loc; + + loc = c_parser_peek_token (parser)->location; + c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = NULL_TREE; + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_CHAIN (c) = list; + + if (!c_parser_next_token_is (parser, CPP_OPEN_PAREN)) + return c; + + matching_parens parens; + parens.consume_open (parser); + num = c_parser_expr_no_commas (parser, NULL).value; + parens.skip_until_found_close (parser); + + if (num == error_mark_node) + return list; + + mark_exp_read (num); + num = c_fully_fold (num, false, NULL); + if (!INTEGRAL_TYPE_P (TREE_TYPE (num)) || !tree_fits_shwi_p (num) + || (n = tree_to_shwi (num)) <= 0 || (int)n != n) + { + error_at (loc, + "partial argument needs positive constant integer expression"); + return list; + } + + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = num; + + return c; +} + /* OpenMP 5.0: detach ( event-handle ) */ @@ -18589,6 +18707,14 @@ c_parser_omp_all_clauses (c_parser *parser, omp_clause_mask mask, clauses); c_name = "enter"; break; + case PRAGMA_OMP_CLAUSE_FULL: + c_name = "full"; + clauses = c_parser_omp_clause_unroll_full (parser, clauses); + break; + case PRAGMA_OMP_CLAUSE_PARTIAL: + c_name = "partial"; + clauses = c_parser_omp_clause_unroll_partial (parser, clauses); + break; default: c_parser_error (parser, "expected an OpenMP clause"); goto saw_error; @@ -20848,6 +20974,44 @@ c_parser_omp_scan_loop_body (c_parser *parser, bool open_brace_parsed) } +/* Check that the next token starts a loop nest. Return true if yes, + otherwise diagnose an error if ERROR_P is true, and return false. 
*/ +static bool +c_parser_see_omp_loop_nest (c_parser *parser, enum tree_code code, + bool error_p) +{ + if (code == OACC_LOOP) + { + if (c_parser_next_token_is_keyword (parser, RID_FOR)) + return true; + if (error_p) + c_parser_error (parser, "for statement expected"); + } + else + { + if (c_parser_next_token_is_keyword (parser, RID_FOR) + || c_parser_peek_token (parser)->pragma_kind == PRAGMA_OMP_UNROLL + || c_parser_peek_token (parser)->pragma_kind == PRAGMA_OMP_TILE) + return true; + + /* For consistency with C++, treat standard attributes followed + by RID_FOR as a loop nest, but diagnose unknown attributes as + an error in c_parser_omp_loop_nest. */ + size_t n = c_parser_skip_std_attribute_spec_seq (parser, 1); + c_token *token = c_parser_peek_nth_token_raw (parser, n); + /* TOKEN is a raw token that hasn't been converted to a keyword yet, + we have to do the lookup explicitly. */ + if (token->type == CPP_NAME + && C_IS_RESERVED_WORD (token->value) + && C_RID_CODE (token->value) == RID_FOR) + return true; + if (error_p) + c_parser_error (parser, "loop nest expected"); + } + + return false; +} + /* This function parses a single level of a loop nest, invoking itself recursively if necessary. @@ -20883,6 +21047,51 @@ c_parser_omp_loop_nest (c_parser *parser, bool *if_p) gcc_assert (omp_for_parse_state); int depth = omp_for_parse_state->depth; + /* Reject non-empty standard attributes with an error. C++ allows OpenMP + directives to be specified with attribute syntax, but C does not. */ + if (c_parser_nth_token_starts_std_attributes (parser, 1)) + { + tree std_attrs = c_parser_std_attribute_specifier_sequence (parser); + if (std_attrs) + error_at (c_parser_peek_token (parser)->location, + "attributes are not allowed on % in loop nest"); + } + + /* Handle loop transformations first. 
Note that when we get here + omp_for_parse_state->depth has already been incremented to indicate + the depth of the *next* loop, not the level of the loop body the + transformation directive appears in. */ + if (c_parser_peek_token (parser)->pragma_kind == PRAGMA_OMP_UNROLL + || c_parser_peek_token (parser)->pragma_kind == PRAGMA_OMP_TILE) + { + int count = omp_for_parse_state->count; + int more = c_parser_omp_nested_loop_transform_clauses ( + parser, omp_for_parse_state->clauses, + depth, count - depth, "loop collapse"); + if (depth + more > count) + { + count = depth + more; + omp_for_parse_state->count = count; + omp_for_parse_state->declv + = grow_tree_vec (omp_for_parse_state->declv, count); + omp_for_parse_state->initv + = grow_tree_vec (omp_for_parse_state->initv, count); + omp_for_parse_state->condv + = grow_tree_vec (omp_for_parse_state->condv, count); + omp_for_parse_state->incrv + = grow_tree_vec (omp_for_parse_state->incrv, count); + } + if (c_parser_see_omp_loop_nest (parser, omp_for_parse_state->code, + true)) + return c_parser_omp_loop_nest (parser, if_p); + else + { + /* FIXME: Better error recovery here? */ + omp_for_parse_state->fail = true; + return NULL_TREE; + } + } + /* We have already matched the FOR token but not consumed it yet. 
*/ loc = c_parser_peek_token (parser)->location; gcc_assert (c_parser_next_token_is_keyword (parser, RID_FOR)); @@ -21017,7 +21226,9 @@ c_parser_omp_loop_nest (c_parser *parser, bool *if_p) parse_next: moreloops = depth < omp_for_parse_state->count - 1; omp_for_parse_state->want_nested_loop = moreloops; - if (moreloops && c_parser_next_token_is_keyword (parser, RID_FOR)) + if (moreloops + && c_parser_see_omp_loop_nest (parser, omp_for_parse_state->code, + false)) { omp_for_parse_state->depth++; body = c_parser_omp_loop_nest (parser, if_p); @@ -21129,7 +21340,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, tree ret = NULL_TREE; tree ordered_cl = NULL_TREE; int i, collapse = 1, ordered = 0, count; - bool tiling = false; + bool oacc_tiling = false; bool inscan = false; struct omp_for_parse_data data; struct omp_for_parse_data *save_data = parser->omp_for_parse_state; @@ -21139,7 +21350,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (cl)); else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_OACC_TILE) { - tiling = true; + oacc_tiling = true; collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (cl)); } else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_ORDERED @@ -21162,15 +21373,29 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, ordered = collapse; } - gcc_assert (tiling || (collapse >= 1 && ordered >= 0)); - count = ordered ? ordered : collapse; + c_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, collapse, + "loop collapse"); - if (!c_parser_next_token_is_keyword (parser, RID_FOR)) + /* Find the depth of the loop nest affected by "omp tile" + directives. There can be several such directives, but the tiling + depth of the outer ones may not be larger than the depth of the + innermost directive. 
*/ + int omp_tile_depth = 0; + for (tree c = clauses; c; c = TREE_CHAIN (c)) { - c_parser_error (parser, "for statement expected"); - return NULL; + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_TILE) + continue; + + omp_tile_depth = list_length (OMP_CLAUSE_TILE_SIZES (c)); } + gcc_assert (oacc_tiling || (collapse >= 1 && ordered >= 0)); + count = ordered ? ordered : collapse; + count = MAX (count, omp_tile_depth); + + if (!c_parser_see_omp_loop_nest (parser, code, true)) + return NULL; + /* Initialize parse state for recursive descent. */ data.declv = make_tree_vec (count); data.initv = make_tree_vec (count); @@ -21189,9 +21414,11 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, data.inscan = inscan; data.saw_intervening_code = false; data.code = code; + data.clauses = clauses; parser->omp_for_parse_state = &data; body = c_parser_omp_loop_nest (parser, if_p); + count = data.count; /* Add saved bindings for iteration variables that were declared in the nested for loop to the scope surrounding the entire loop. 
*/ @@ -24641,6 +24868,263 @@ c_parser_omp_taskloop (location_t loc, c_parser *parser, return ret; } +#define OMP_UNROLL_CLAUSE_MASK \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PARTIAL) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_FULL) ) + +/* OpenMP 5.1: Parse sizes list for "omp tile sizes" + sizes ( size-expr-list ) */ +static tree +c_parser_omp_tile_sizes (c_parser *parser, location_t loc) +{ + tree sizes = NULL_TREE; + + c_token *tok = c_parser_peek_token (parser); + if (tok->type != CPP_NAME + || strcmp ("sizes", IDENTIFIER_POINTER (tok->value))) + { + c_parser_error (parser, "expected %"); + return error_mark_node; + } + c_parser_consume_token (parser); + + if (!c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>")) + return error_mark_node; + + do + { + if (sizes && !c_parser_require (parser, CPP_COMMA, "expected %<,%>")) + return error_mark_node; + + location_t expr_loc = c_parser_peek_token (parser)->location; + c_expr cexpr = c_parser_expr_no_commas (parser, NULL); + cexpr = convert_lvalue_to_rvalue (expr_loc, cexpr, false, true); + tree expr = cexpr.value; + + if (expr == error_mark_node) + { + c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, + "expected %<)%>"); + return error_mark_node; + } + + expr = c_fully_fold (expr, false, NULL); + + if (!INTEGRAL_TYPE_P (TREE_TYPE (expr)) || !tree_fits_shwi_p (expr) + || tree_to_shwi (expr) <= 0) + { + c_parser_error (parser, "% argument needs positive" + " integral constant"); + expr = integer_zero_node; + } + + sizes = tree_cons (NULL_TREE, expr, sizes); + } + while (c_parser_next_token_is_not (parser, CPP_CLOSE_PAREN)); + c_parser_consume_token (parser); + + gcc_assert (sizes); + tree c = build_omp_clause (loc, OMP_CLAUSE_TILE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_TILE_SIZES (c) = sizes; + + return c; +} + +/* Parse a single OpenMP loop transformation directive and return the + clause that is used internally to represent the directive. 
*/ + +static tree +c_parser_omp_loop_transform_clause (c_parser *parser) +{ + c_token *tok = c_parser_peek_token (parser); + if (tok->type != CPP_PRAGMA) + return NULL_TREE; + + tree c; + switch (tok->pragma_kind) + { + case PRAGMA_OMP_UNROLL: + c_parser_consume_pragma (parser); + c = c_parser_omp_all_clauses (parser, OMP_UNROLL_CLAUSE_MASK, + "#pragma omp unroll", false, true); + if (!c && c_parser_next_token_is (parser, CPP_PRAGMA_EOL)) + { + c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, 0); + } + else if (!c) + c = error_mark_node; + c_parser_skip_to_pragma_eol (parser); + break; + + case PRAGMA_OMP_TILE: + c_parser_consume_pragma (parser); + c = c_parser_omp_tile_sizes (parser, tok->location); + c_parser_skip_to_pragma_eol (parser); + break; + + default: + c = NULL_TREE; + break; + } + + gcc_assert (!c || !TREE_CHAIN (c)); + return c; +} + +/* Parse zero or more OpenMP loop transformation directives that + follow another directive that requires a canonical loop nest, + append all to CLAUSES and record the LEVEL at which the clauses + appear in the loop nest in each clause. + + REQUIRED_DEPTH is the nesting depth of the loop nest required by + the preceding directive. OUTER_DESCR is a description of the + language construct that requires the loop nest depth (e.g. "loop + collapse", "outer transformation") that is used for error + messages. */ + +static int +c_parser_omp_nested_loop_transform_clauses (c_parser *parser, tree &clauses, + int level, int required_depth, + const char *outer_descr) +{ + tree c = NULL_TREE; + tree last_c = tree_last (clauses); + + /* The depth of the loop nest, counting from LEVEL, after the + transformations. That is, the nesting depth left by the outermost + transformation which is the first to be parsed, but the last to be + executed. */ + int transformed_depth = 0; + + /* The minimum nesting depth required by the last parsed transformation. 
*/ + int last_depth = required_depth; + while ((c = c_parser_omp_loop_transform_clause (parser))) + { + /* The nesting depth left after the current transformation. */ + int depth = 1; + if (TREE_CODE (c) == ERROR_MARK) + goto error; + + gcc_assert (!TREE_CHAIN (c)); + switch (OMP_CLAUSE_CODE (c)) + { + case OMP_CLAUSE_UNROLL_FULL: + error_at (OMP_CLAUSE_LOCATION (c), + "% clause is invalid here; " + "turns loop into non-loop"); + goto error; + case OMP_CLAUSE_UNROLL_NONE: + error_at (OMP_CLAUSE_LOCATION (c), + "%<#pragma omp unroll%> without " + "% clause is invalid here; " + "turns loop into non-loop"); + goto error; + case OMP_CLAUSE_UNROLL_PARTIAL: + depth = 1; + break; + case OMP_CLAUSE_TILE: + depth = list_length (OMP_CLAUSE_TILE_SIZES (c)); + break; + default: + gcc_unreachable (); + } + + if (depth < last_depth) + { + bool is_outermost_clause = !transformed_depth; + error_at (OMP_CLAUSE_LOCATION (c), + "nesting depth left after this transformation too low " + "for %s", + is_outermost_clause ? 
outer_descr + : "outer transformation"); + goto error; + } + + last_depth = depth; + + if (!transformed_depth) + transformed_depth = last_depth; + + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, level); + if (!clauses) + clauses = c; + else if (last_c) + TREE_CHAIN (last_c) = c; + + last_c = c; + } + + return transformed_depth; + +error: + while (c_parser_omp_loop_transform_clause (parser)) + ; + clauses = NULL_TREE; + return -1; +} + +/* OpenMP 5.1: + tile sizes ( size-expr-list ) */ + +static tree +c_parser_omp_tile (location_t loc, c_parser *parser, bool *if_p) +{ + tree block; + tree ret = error_mark_node; + + tree clauses = c_parser_omp_tile_sizes (parser, loc); + c_parser_skip_to_pragma_eol (parser); + + if (!clauses || clauses == error_mark_node) + return error_mark_node; + + int required_depth = list_length (OMP_CLAUSE_TILE_SIZES (clauses)); + c_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, + required_depth, + "outer transformation"); + + block = c_begin_compound_stmt (true); + ret = c_parser_omp_for_loop (loc, parser, OMP_LOOP_TRANS, clauses, + NULL, if_p); + block = c_end_compound_stmt (loc, block, true); + add_stmt (block); + + return ret; + } + +static tree +c_parser_omp_unroll (location_t loc, c_parser *parser, bool *if_p) +{ + tree block, ret; + static const char *p_name = "#pragma omp unroll"; + omp_clause_mask mask = OMP_UNROLL_CLAUSE_MASK; + + tree clauses = c_parser_omp_all_clauses (parser, mask, p_name, false); + int required_depth = 1; + c_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, + required_depth, + "outer transformation"); + + if (!clauses) + { + tree c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_CHAIN (c) = clauses; + clauses = c; + } + + block = c_begin_compound_stmt (true); + ret = c_parser_omp_for_loop (loc, parser, OMP_LOOP_TRANS, clauses, + NULL, if_p); + block = c_end_compound_stmt (loc, 
block, true); + add_stmt (block); + + return ret; +} + /* OpenMP 5.1 #pragma omp nothing new-line */ @@ -25032,6 +25516,7 @@ c_parser_omp_construct (c_parser *parser, bool *if_p) p_kind = c_parser_peek_token (parser)->pragma_kind; c_parser_consume_pragma (parser); + gcc_assert (parser->in_pragma); switch (p_kind) { case PRAGMA_OACC_ATOMIC: @@ -25122,6 +25607,12 @@ c_parser_omp_construct (c_parser *parser, bool *if_p) case PRAGMA_OMP_ASSUME: c_parser_omp_assume (parser, if_p); return; + case PRAGMA_OMP_TILE: + stmt = c_parser_omp_tile (loc, parser, if_p); + break; + case PRAGMA_OMP_UNROLL: + stmt = c_parser_omp_unroll (loc, parser, if_p); + break; default: gcc_unreachable (); } diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index 54a5f208cb5..7de61ae9a2e 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -15920,6 +15920,14 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) pc = &OMP_CLAUSE_CHAIN (c); continue; + case OMP_CLAUSE_UNROLL_FULL: + pc = &OMP_CLAUSE_CHAIN (c); + continue; + + case OMP_CLAUSE_UNROLL_PARTIAL: + pc = &OMP_CLAUSE_CHAIN (c); + continue; + case OMP_CLAUSE_INBRANCH: case OMP_CLAUSE_NOTINBRANCH: if (branch_seen) diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc index bdf6e5f98ff..7d6986be86f 100644 --- a/gcc/cp/cp-gimplify.cc +++ b/gcc/cp/cp-gimplify.cc @@ -647,6 +647,7 @@ cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p) case OMP_DISTRIBUTE: case OMP_LOOP: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: ret = cp_gimplify_omp_for (expr_p, pre_p); break; @@ -1188,6 +1189,7 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_) case OMP_DISTRIBUTE: case OMP_LOOP: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: cp_walk_tree (&OMP_FOR_BODY (stmt), cp_fold_r, data, NULL); cp_walk_tree (&OMP_FOR_CLAUSES (stmt), cp_fold_r, data, NULL); @@ -1964,6 +1966,7 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void *data) case OMP_FOR: case OMP_SIMD: case OMP_LOOP: + case OMP_LOOP_TRANS: 
case OACC_LOOP: case STATEMENT_LIST: /* These cases are handled by shared code. */ diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index defb81ca8c1..91b1a5a1464 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -2972,7 +2972,11 @@ static bool cp_parser_skip_up_to_closing_square_bracket static bool cp_parser_skip_to_closing_square_bracket (cp_parser *); static size_t cp_parser_skip_balanced_tokens (cp_parser *, size_t); +static bool cp_parser_see_omp_loop_nest (cp_parser *, enum tree_code, bool); static tree cp_parser_omp_loop_nest (cp_parser *, bool *); +static int cp_parser_omp_nested_loop_transform_clauses (cp_parser *, tree &, + int, int, + const char *); // -------------------------------------------------------------------------- // // Unevaluated Operand Guard @@ -13119,7 +13123,20 @@ check_omp_intervening_code (cp_parser *parser) "% % clause"); omp_for_parse_state->perfect_nesting_fail = true; } - /* TODO: Also reject loops with TILE directive. */ + else + { + tree c = omp_find_clause (omp_for_parse_state->clauses, + OMP_CLAUSE_TILE); + if (c && + ((int) tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (c)) + <= omp_for_parse_state->depth)) + { + error_at (omp_for_parse_state->for_loc, + "inner loops must be perfectly nested " + "with % directive"); + omp_for_parse_state->perfect_nesting_fail = true; + } + } if (omp_for_parse_state->perfect_nesting_fail) omp_for_parse_state->fail = true; } @@ -13171,7 +13188,9 @@ cp_parser_statement_seq_opt (cp_parser* parser, tree in_statement_expr) { bool want_nested_loop = omp_for_parse_state->want_nested_loop; if (want_nested_loop - && token->type == CPP_KEYWORD && token->keyword == RID_FOR) + && cp_parser_see_omp_loop_nest (parser, + omp_for_parse_state->code, + false)) { /* Found the nested loop. 
*/ omp_for_parse_state->depth++; @@ -37494,6 +37513,8 @@ cp_parser_omp_clause_name (cp_parser *parser) result = PRAGMA_OMP_CLAUSE_FIRSTPRIVATE; else if (!strcmp ("from", p)) result = PRAGMA_OMP_CLAUSE_FROM; + else if (!strcmp ("full", p)) + result = PRAGMA_OMP_CLAUSE_FULL; break; case 'g': if (!strcmp ("gang", p)) @@ -37568,6 +37589,8 @@ cp_parser_omp_clause_name (cp_parser *parser) case 'p': if (!strcmp ("parallel", p)) result = PRAGMA_OMP_CLAUSE_PARALLEL; + if (!strcmp ("partial", p)) + result = PRAGMA_OMP_CLAUSE_PARTIAL; else if (!strcmp ("present", p)) result = PRAGMA_OACC_CLAUSE_PRESENT; else if (!strcmp ("present_or_copy", p) @@ -37658,12 +37681,15 @@ cp_parser_omp_clause_name (cp_parser *parser) /* Validate that a clause of the given type does not already exist. */ -static void +static bool check_no_duplicate_clause (tree clauses, enum omp_clause_code code, const char *name, location_t location) { - if (omp_find_clause (clauses, code)) + bool found = omp_find_clause (clauses, code); + if (found) error_at (location, "too many %qs clauses", name); + + return !found; } /* OpenMP 2.5: @@ -39782,6 +39808,58 @@ cp_parser_omp_clause_thread_limit (cp_parser *parser, tree list, return c; } +/* OpenMP 5.1 + full */ + +static tree +cp_parser_omp_clause_unroll_full (tree list, location_t loc) +{ + if (!check_no_duplicate_clause (list, OMP_CLAUSE_UNROLL_FULL, "full", loc)) + return list; + + tree c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_CHAIN (c) = list; + return c; +} + +/* OpenMP 5.1 + partial ( constant-expression ) */ + +static tree +cp_parser_omp_clause_unroll_partial (cp_parser *parser, tree list, + location_t loc) +{ + if (!check_no_duplicate_clause (list, OMP_CLAUSE_UNROLL_PARTIAL, "partial", + loc)) + return list; + + tree c, num = error_mark_node; + c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = NULL_TREE; + 
OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_CHAIN (c) = list; + + if (!cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)) + return c; + + matching_parens parens; + parens.consume_open (parser); + num = cp_parser_constant_expression (parser); + cp_parser_skip_to_closing_parenthesis (parser, /*recovering=*/true, + /*or_comma=*/false, + /*consume_paren=*/true); + + if (num == error_mark_node) + return list; + + mark_exp_read (num); + num = fold_non_dependent_expr (num); + + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = num; + return c; +} + /* OpenMP 4.0: aligned ( variable-list ) aligned ( variable-list : constant-expression ) */ @@ -41473,15 +41551,12 @@ cp_parser_omp_all_clauses (cp_parser *parser, omp_clause_mask mask, if (nested && cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_PAREN)) break; - if (!first || nested != 2) - { - if (cp_lexer_next_token_is (parser->lexer, CPP_COMMA)) - cp_lexer_consume_token (parser->lexer); - else if (nested == 2) - error_at (cp_lexer_peek_token (parser->lexer)->location, - "clauses in % trait should be separated " - "by %<,%>"); - } + if (cp_lexer_next_token_is (parser->lexer, CPP_COMMA)) + cp_lexer_consume_token (parser->lexer); + else if (!first && nested == 2) + error_at (cp_lexer_peek_token (parser->lexer)->location, + "clauses in % trait should be separated " + "by %<,%>"); token = cp_lexer_peek_token (parser->lexer); c_kind = cp_parser_omp_clause_name (parser); @@ -41822,6 +41897,16 @@ cp_parser_omp_all_clauses (cp_parser *parser, omp_clause_mask mask, clauses); c_name = "enter"; break; + case PRAGMA_OMP_CLAUSE_PARTIAL: + clauses = cp_parser_omp_clause_unroll_partial (parser, clauses, + token->location); + c_name = "partial"; + break; + case PRAGMA_OMP_CLAUSE_FULL: + clauses = cp_parser_omp_clause_unroll_full (clauses, + token->location); + c_name = "full"; + break; default: cp_parser_error (parser, "expected an OpenMP clause"); goto saw_error; @@ -43999,6 +44084,44 @@ 
cp_parser_omp_scan_loop_body (cp_parser *parser) } +/* Check that the next token starts a loop nest. Return true if yes, + otherwise diagnose an error if ERROR_P is true and return false. */ +static bool +cp_parser_see_omp_loop_nest (cp_parser *parser, enum tree_code code, + bool error_p) +{ + if (code == OACC_LOOP) + { + if (cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR)) + return true; + if (error_p) + cp_parser_error (parser, "for statement expected"); + } + else + { + if (cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR) + || (cp_parser_pragma_kind (cp_lexer_peek_token (parser->lexer)) + == PRAGMA_OMP_UNROLL) + || (cp_parser_pragma_kind (cp_lexer_peek_token (parser->lexer)) + == PRAGMA_OMP_TILE)) + return true; + /* The OpenMP spec isn't very clear on this. Here we consider that + any attribute specifier sequence followed by a FOR loop is a loop + nest, but cp_parser_omp_loop_nest rejects invalid attributes + with an error. If we rejected such things here, too, then the + associated FOR statement would be considered intervening code + instead, and we would get a different error about a loop in + intervening code. */ + size_t n = cp_parser_skip_std_attribute_spec_seq (parser, 1); + if (cp_lexer_nth_token_is_keyword (parser->lexer, n, RID_FOR)) + return true; + if (error_p) + cp_parser_error (parser, "loop nest expected"); + } + return false; +} + + /* This function parses a single level of a loop nest, invoking itself recursively if necessary. @@ -44051,8 +44174,69 @@ cp_parser_omp_loop_nest (cp_parser *parser, bool *if_p) gcc_assert (omp_for_parse_state); int depth = omp_for_parse_state->depth; - /* We have already matched the FOR token but not consumed it yet. */ - gcc_assert (cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR)); + /* Handle loop transformations first. 
Note that when we get here + omp_for_parse_state->depth has already been incremented to indicate + the depth of the *next* loop, not the level of the loop body the + transformation directive appears in. */ + + /* Arrange for C++ standard attribute syntax to be parsed as regular + pragmas. Give an error if there are other random attributes present. */ + cp_token *token = cp_lexer_peek_token (parser->lexer); + tree std_attrs = cp_parser_std_attribute_spec_seq (parser); + std_attrs = cp_parser_handle_statement_omp_attributes (parser, std_attrs); + if (std_attrs) + error_at (token->location, + "attributes other than OpenMP directives " + "are not allowed on % in loop nest"); + + if ((cp_parser_pragma_kind (cp_lexer_peek_token (parser->lexer)) + == PRAGMA_OMP_UNROLL) + || (cp_parser_pragma_kind (cp_lexer_peek_token (parser->lexer)) + == PRAGMA_OMP_TILE)) + { + int count = omp_for_parse_state->count; + int more = cp_parser_omp_nested_loop_transform_clauses ( + parser, omp_for_parse_state->clauses, + depth, count - depth, "loop collapse"); + if (depth + more > count) + { + count = depth + more; + omp_for_parse_state->count = count; + omp_for_parse_state->declv + = grow_tree_vec (omp_for_parse_state->declv, count); + omp_for_parse_state->initv + = grow_tree_vec (omp_for_parse_state->initv, count); + omp_for_parse_state->condv + = grow_tree_vec (omp_for_parse_state->condv, count); + omp_for_parse_state->incrv + = grow_tree_vec (omp_for_parse_state->incrv, count); + if (omp_for_parse_state->orig_declv) + omp_for_parse_state->orig_declv + = grow_tree_vec (omp_for_parse_state->orig_declv, count); + } + } + + /* Diagnose errors if we don't have a "for" loop following the + optional loop transforms. Otherwise, consume the token. 
*/ + if (!cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR)) + { + omp_for_parse_state->fail = true; + cp_token *token = cp_lexer_peek_token (parser->lexer); + /* Don't call cp_parser_error here since it overrides the + provided message with a more confusing one if there was + a bad pragma or attribute directive. */ + error_at (token->location, "loop nest expected"); + /* See if we can recover by skipping over bad pragma(s). */ + while (token->type == CPP_PRAGMA) + { + cp_parser_skip_to_pragma_eol (parser, token); + if (cp_parser_see_omp_loop_nest (parser, omp_for_parse_state->code, + false)) + return cp_parser_omp_loop_nest (parser, if_p); + token = cp_lexer_peek_token (parser->lexer); + } + return NULL_TREE; + } loc = cp_lexer_consume_token (parser->lexer)->location; /* Forbid break/continue in the loop initializer, condition, and @@ -44307,7 +44491,9 @@ cp_parser_omp_loop_nest (cp_parser *parser, bool *if_p) moreloops = depth < omp_for_parse_state->count - 1; omp_for_parse_state->want_nested_loop = moreloops; - if (moreloops && cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR)) + if (moreloops + && cp_parser_see_omp_loop_nest (parser, omp_for_parse_state->code, + false)) { omp_for_parse_state->depth++; add_stmt (cp_parser_omp_loop_nest (parser, if_p)); @@ -44582,6 +44768,10 @@ fixup_blocks_walker (tree *tp, int *walk_subtrees, void *dp) return NULL; } +static int cp_parser_omp_nested_loop_transform_clauses (cp_parser *, tree &, + int, int, + const char *); + /* Parse the restricted form of the for statement allowed by OpenMP. 
*/ static tree @@ -44592,7 +44782,7 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, tree cl, ordered_cl = NULL_TREE; int collapse = 1, ordered = 0; unsigned int count; - bool tiling = false; + bool oacc_tiling = false; bool inscan = false; struct omp_for_parse_data data; struct omp_for_parse_data *save_data = parser->omp_for_parse_state; @@ -44604,7 +44794,7 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (cl)); else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_OACC_TILE) { - tiling = true; + oacc_tiling = true; collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (cl)); } else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_ORDERED @@ -44627,14 +44817,28 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, ordered = collapse; } - gcc_assert (tiling || (collapse >= 1 && ordered >= 0)); + gcc_assert (oacc_tiling || (collapse >= 1 && ordered >= 0)); count = ordered ? ordered : collapse; - if (!cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR)) + cp_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, count, + "loop collapse"); + + /* Find the depth of the loop nest affected by "omp tile" + directives. There can be several such directives, but the tiling + depth of the outer ones may not be larger than the depth of the + innermost directive. */ + unsigned int omp_tile_depth = 0; + for (tree c = clauses; c; c = TREE_CHAIN (c)) { - cp_parser_error (parser, "for statement expected"); - return NULL; + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_TILE) + continue; + + omp_tile_depth = list_length (OMP_CLAUSE_TILE_SIZES (c)); } + count = MAX (count, omp_tile_depth); + + if (!cp_parser_see_omp_loop_nest (parser, code, true)) + return NULL; /* Initialize parse state for recursive descent. 
*/ data.declv = make_tree_vec (count); @@ -44687,6 +44891,7 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, We also need to flatten the init blocks, as some code for later processing of combined directives gets confused otherwise. */ + count = data.count; gcc_assert (vec_safe_length (data.init_blockv) == count); gcc_assert (vec_safe_length (data.body_blockv) == count); gcc_assert (vec_safe_length (data.init_placeholderv) == count); @@ -46479,6 +46684,272 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok, return true; } + +/* OpenMP 5.1: Parse sizes list for "omp tile sizes" + sizes ( size-expr-list ) */ +static tree +cp_parser_omp_tile_sizes (cp_parser *parser, location_t loc) +{ + tree sizes = NULL_TREE; + cp_lexer *lexer = parser->lexer; + + if (cp_lexer_next_token_is (parser->lexer, CPP_COMMA)) + cp_lexer_consume_token (parser->lexer); + + cp_token *tok = cp_lexer_peek_token (lexer); + if (tok->type != CPP_NAME + || strcmp ("sizes", IDENTIFIER_POINTER (tok->u.value))) + { + cp_parser_error (parser, "expected %"); + return error_mark_node; + } + cp_lexer_consume_token (lexer); + + if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN)) + return error_mark_node; + + do + { + if (sizes && !cp_parser_require (parser, CPP_COMMA, RT_COMMA)) + return error_mark_node; + + tree expr = cp_parser_constant_expression (parser); + if (expr == error_mark_node) + { + cp_parser_skip_to_closing_parenthesis (parser, + /*recovering=*/true, + /*or_comma=*/false, + /*consume_paren=*/ + true); + return error_mark_node; + } + + sizes = tree_cons (NULL_TREE, expr, sizes); + } + while (cp_lexer_next_token_is_not (lexer, CPP_CLOSE_PAREN)); + cp_lexer_consume_token (lexer); + + gcc_assert (sizes); + tree c = build_omp_clause (loc, OMP_CLAUSE_TILE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_TILE_SIZES (c) = sizes; + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, 0); + 
+ return c; +} + +/* OpenMP 5.1: + tile sizes ( size-expr-list ) */ + +static tree +cp_parser_omp_tile (cp_parser *parser, cp_token *tok, bool *if_p) +{ + tree block; + tree ret = error_mark_node; + + gcc_assert (!parser->omp_for_parse_state); + + tree clauses = cp_parser_omp_tile_sizes (parser, tok->location); + cp_parser_require_pragma_eol (parser, tok); + + if (!clauses || clauses == error_mark_node) + return error_mark_node; + + int required_depth = list_length (OMP_CLAUSE_TILE_SIZES (clauses)); + cp_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, + required_depth, + "outer transformation"); + + block = begin_omp_structured_block (); + clauses = finish_omp_clauses (clauses, C_ORT_OMP); + + ret = cp_parser_omp_for_loop (parser, OMP_LOOP_TRANS, clauses, NULL, if_p); + block = finish_omp_structured_block (block); + add_stmt (block); + + return ret; +} + +#define OMP_UNROLL_CLAUSE_MASK \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PARTIAL) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_FULL) ) + +/* Parse a single OpenMP loop transformation directive and return the + clause that is used internally to represent the directive. 
*/ + +static tree +cp_parser_omp_loop_transform_clause (cp_parser *parser) +{ + cp_lexer *lexer = parser->lexer; + cp_token *tok = cp_lexer_peek_token (lexer); + if (tok->type != CPP_PRAGMA) + return NULL_TREE; + + tree c; + switch (cp_parser_pragma_kind (tok)) + { + case PRAGMA_OMP_UNROLL: + cp_lexer_consume_token (lexer); + lexer->in_pragma = true; + c = cp_parser_omp_all_clauses (parser, OMP_UNROLL_CLAUSE_MASK, + "#pragma omp unroll", tok, + false, true); + if (!c && cp_lexer_next_token_is (lexer, CPP_PRAGMA_EOL)) + { + c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, 0); + } + else if (!c) + c = error_mark_node; + cp_parser_skip_to_pragma_eol (parser, tok); + break; + + case PRAGMA_OMP_TILE: + cp_lexer_consume_token (lexer); + lexer->in_pragma = true; + c = cp_parser_omp_tile_sizes (parser, tok->location); + cp_parser_require_pragma_eol (parser, tok); + break; + + default: + c = NULL_TREE; + break; + } + + gcc_assert (!c || !TREE_CHAIN (c)); + return c; +} + +/* Parse zero or more OpenMP loop transformation directives that + follow another directive that requires a canonical loop nest, + append all to CLAUSES, and record the level at which the clause + appears in the loop nest in each clause. Return the nesting depth + of the transformed loop nest. + + REQUIRED_DEPTH is the nesting depth of the loop nest required by + the preceding directive. OUTER_DESCR is a description of the + language construct that requires the loop nest depth (e.g. "loop + collapse", "outer transformation") that is used for error + messages. */ + +static int +cp_parser_omp_nested_loop_transform_clauses (cp_parser *parser, tree &clauses, + int level, int required_depth, + const char *outer_descr) +{ + tree c = NULL_TREE; + tree last_c = tree_last (clauses); + + /* The depth of the loop nest after the transformations. 
That is, + the nesting depth left by the outermost transformation which is + the first to be parsed, but the last to be executed. */ + int transformed_depth = 0; + + /* The minimum nesting depth required by the last parsed transformation. */ + int last_depth = required_depth; + + while ((c = cp_parser_omp_loop_transform_clause (parser))) + { + /* The nesting depth left after the current transformation. */ + int depth = 1; + if (TREE_CODE (c) == ERROR_MARK) + goto error; + + gcc_assert (!TREE_CHAIN (c)); + switch (OMP_CLAUSE_CODE (c)) + { + case OMP_CLAUSE_UNROLL_FULL: + error_at (OMP_CLAUSE_LOCATION (c), + "%<full%> clause is invalid here; " + "turns loop into non-loop"); + goto error; + case OMP_CLAUSE_UNROLL_NONE: + error_at (OMP_CLAUSE_LOCATION (c), + "%<#pragma omp unroll%> without " + "%<partial%> clause is invalid here; " + "turns loop into non-loop"); + goto error; + case OMP_CLAUSE_UNROLL_PARTIAL: + depth = 1; + break; + case OMP_CLAUSE_TILE: + depth = list_length (OMP_CLAUSE_TILE_SIZES (c)); + break; + default: + gcc_unreachable (); + } + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, level); + + if (depth < last_depth) + { + bool is_outermost_clause = !transformed_depth; + error_at (OMP_CLAUSE_LOCATION (c), + "nesting depth left after this transformation too low " + "for %s", + is_outermost_clause ? 
outer_descr + : "outer transformation"); + goto error; + } + + last_depth = depth; + + if (!transformed_depth) + transformed_depth = last_depth; + + c = finish_omp_clauses (c, C_ORT_OMP); + + if (!clauses) + clauses = c; + else if (last_c) + TREE_CHAIN (last_c) = c; + + last_c = c; + } + + return transformed_depth; + +error: + while (cp_parser_omp_loop_transform_clause (parser)) + ; + clauses = NULL_TREE; + return -1; +} + +static tree +cp_parser_omp_unroll (cp_parser *parser, cp_token *tok, bool *if_p) +{ + tree block, ret; + static const char *p_name = "#pragma omp unroll"; + omp_clause_mask mask = OMP_UNROLL_CLAUSE_MASK; + + gcc_assert (!parser->omp_for_parse_state); + + tree clauses = cp_parser_omp_all_clauses (parser, mask, p_name, tok, true); + + if (!clauses) + { + tree c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_CHAIN (c) = clauses; + clauses = c; + } + + int required_depth = 1; + cp_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, + required_depth, + "outer transformation"); + + block = begin_omp_structured_block (); + ret = cp_parser_omp_for_loop (parser, OMP_LOOP_TRANS, clauses, NULL, if_p); + block = finish_omp_structured_block (block); + add_stmt (block); + + return ret; +} + /* OpenACC 2.0: # pragma acc cache (variable-list) new-line */ @@ -49675,6 +50146,12 @@ cp_parser_omp_construct (cp_parser *parser, cp_token *pragma_tok, bool *if_p) case PRAGMA_OMP_ASSUME: cp_parser_omp_assume (parser, pragma_tok, if_p); return; + case PRAGMA_OMP_TILE: + stmt = cp_parser_omp_tile (parser, pragma_tok, if_p); + break; + case PRAGMA_OMP_UNROLL: + stmt = cp_parser_omp_unroll (parser, pragma_tok, if_p); + break; default: gcc_unreachable (); } @@ -50321,6 +50798,14 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context, bool *if_p) cp_parser_omp_construct (parser, pragma_tok, if_p); pop_omp_privatization_clauses (stmt); return true; + case 
PRAGMA_OMP_TILE: + case PRAGMA_OMP_UNROLL: + if (context != pragma_stmt && context != pragma_compound) + goto bad_stmt; + stmt = push_omp_privatization_clauses (false); + cp_parser_omp_construct (parser, pragma_tok, if_p); + pop_omp_privatization_clauses (stmt); + return true; case PRAGMA_OMP_REQUIRES: if (context != pragma_external) diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index d44d20767ca..6942a5c69a8 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -18142,6 +18142,16 @@ tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort, OMP_CLAUSE_OPERAND (nc, 0) = tsubst_expr (OMP_CLAUSE_OPERAND (oc, 0), args, complain, in_decl); break; + case OMP_CLAUSE_UNROLL_PARTIAL: + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (nc) + = tsubst_expr (OMP_CLAUSE_UNROLL_PARTIAL_EXPR (oc), args, complain, + in_decl); + break; + case OMP_CLAUSE_TILE: + OMP_CLAUSE_TILE_SIZES (nc) + = tsubst_expr (OMP_CLAUSE_TILE_SIZES (oc), args, complain, + in_decl); + break; case OMP_CLAUSE_REDUCTION: case OMP_CLAUSE_IN_REDUCTION: case OMP_CLAUSE_TASK_REDUCTION: @@ -18222,6 +18232,8 @@ tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort, case OMP_CLAUSE_IF_PRESENT: case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_NOHOST: + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: break; default: gcc_unreachable (); @@ -19494,6 +19506,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl) case OMP_SIMD: case OMP_DISTRIBUTE: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: { tree clauses, body, pre_body; diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index c934659c9f3..45d0ab8f8d8 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -6887,6 +6887,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) bool mergeable_seen = false; bool implicit_moved = false; bool target_in_reduction_seen = false; + bool unroll_full_seen = false; bitmap_obstack_initialize (NULL); bitmap_initialize (&generic_head, &bitmap_default_obstack); @@ -8876,6 +8877,46 @@ 
finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } break; + case OMP_CLAUSE_TILE: + for (tree list = OMP_CLAUSE_TILE_SIZES (c); !remove && list; + list = TREE_CHAIN (list)) + { + t = TREE_VALUE (list); + + if (t == error_mark_node) + remove = true; + else if (!type_dependent_expression_p (t) + && !INTEGRAL_TYPE_P (TREE_TYPE (t))) + { + error_at (OMP_CLAUSE_LOCATION (c), + "%<tile sizes%> argument needs integral type"); + remove = true; + } + else + { + t = mark_rvalue_use (t); + if (!processing_template_decl) + { + t = maybe_constant_value (t); + int n; + if (!tree_fits_shwi_p (t) + || !INTEGRAL_TYPE_P (TREE_TYPE (t)) + || (n = tree_to_shwi (t)) <= 0 || (int)n != n) + { + error_at (OMP_CLAUSE_LOCATION (c), + "%<tile sizes%> argument needs positive " + "integral constant"); + remove = true; + } + t = fold_build_cleanup_point_expr (TREE_TYPE (t), t); + } + } + + /* Update list item. */ + TREE_VALUE (list) = t; + } + break; + case OMP_CLAUSE_ORDERED: ordered_seen = true; break; @@ -8930,6 +8971,60 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } break; + case OMP_CLAUSE_UNROLL_FULL: + if (unroll_full_seen) + { + error_at (OMP_CLAUSE_LOCATION (c), + "% appears more than once"); + remove = true; + } + unroll_full_seen = true; + break; + + case OMP_CLAUSE_UNROLL_PARTIAL: + { + tree t = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c); + + if (!t) + break; + + if (t == error_mark_node) + remove = true; + else if (!type_dependent_expression_p (t) + && !INTEGRAL_TYPE_P (TREE_TYPE (t))) + { + error_at (OMP_CLAUSE_LOCATION (c), + "partial argument needs integral type"); + remove = true; + } + else + { + t = mark_rvalue_use (t); + if (!processing_template_decl) + { + t = maybe_constant_value (t); + + int n; + if (!INTEGRAL_TYPE_P (TREE_TYPE (t)) + || !tree_fits_shwi_p (t) + || (n = tree_to_shwi (t)) <= 0 || (int)n != n) + { + error_at (OMP_CLAUSE_LOCATION (c), + "partial argument needs positive constant " + "integer expression"); + remove = true; + } + t = fold_build_cleanup_point_expr 
(TREE_TYPE (t), t); + } + } + + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = t; + } + break; + + case OMP_CLAUSE_UNROLL_NONE: + break; + default: gcc_unreachable (); } diff --git a/gcc/testsuite/c-c++-common/gomp/imperfect-attributes.c b/gcc/testsuite/c-c++-common/gomp/imperfect-attributes.c index 776295ce22a..3c35e7c54b6 100644 --- a/gcc/testsuite/c-c++-common/gomp/imperfect-attributes.c +++ b/gcc/testsuite/c-c++-common/gomp/imperfect-attributes.c @@ -1,17 +1,20 @@ /* { dg-do compile { target { c || c++11 } } } */ /* Check that a nested FOR loop with standard c/c++ attributes on it - is treated as intervening code, since it doesn't match the grammar - for canonical loop nest form. */ + (not the C++ attribute syntax for OpenMP directives) + gives an error. */ extern void do_something (void); + +/* This one should be OK, an empty attribute list is ignored in both C + and C++. */ void imperfect1 (int x, int y) { #pragma omp for collapse (2) - for (int i = 0; i < x; i++) /* { dg-error "not enough nested loops" } */ + for (int i = 0; i < x; i++) { - [[]] for (int j = 0; j < y; j++) /* { dg-error "loop not permitted in intervening code" } */ + [[]] for (int j = 0; j < y; j++) do_something (); } } @@ -19,16 +22,15 @@ void imperfect1 (int x, int y) void perfect1 (int x, int y) { #pragma omp for ordered (2) - for (int i = 0; i < x; i++) /* { dg-error "not enough nested loops" } */ - /* { dg-error "inner loops must be perfectly nested" "" { target *-*-*} .-1 } */ + for (int i = 0; i < x; i++) { - [[]] for (int j = 0; j < y; j++) /* { dg-error "loop not permitted in intervening code" } */ + [[]] for (int j = 0; j < y; j++) do_something (); } } /* Similar, but put the attributes on a block wrapping the nested loop - instead. */ + instead. This is not allowed by the grammar. 
*/ void imperfect2 (int x, int y) { diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c new file mode 100644 index 00000000000..22a1250ed6c --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c @@ -0,0 +1,11 @@ +void test () +{ +#pragma omp tile sizes (2,4,6) + for (unsigned i = 0; i < 10; i++) /* { dg-error "inner loops must be perfectly nested" } */ + for (unsigned j = 0; j < 10; j++) + { + float intervening_decl = 0; +#pragma omp unroll partial(2) + for (unsigned k = 0; k < 10; k++); + } +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-1.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-1.c new file mode 100644 index 00000000000..f10aea4c27f --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-1.c @@ -0,0 +1,160 @@ +extern void dummy (int); + +void +test () +{ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(0) /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(-1) /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes() /* { dg-error {expected expression before} "" { target c} } */ + /* { dg-error {expected primary-expression before} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(,) /* { dg-error {expected expression before} "" { target c } } */ + /* { dg-error {expected primary-expression before} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1,2 /* { dg-error {expected '\,' before end of line} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes /* { dg-error {expected '\(' before end of line} } */ + for (int i = 
0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) sizes(1) /* { dg-error {expected end of line before 'sizes'} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1, 2) + #pragma omp tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp tile sizes(1, 2) + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp tile sizes(5, 6) + #pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp unroll partia /* { dg-error {expected an OpenMP clause before 'partia'} } */ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp unroll /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp unroll partial + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(8,8) + #pragma omp unroll partial /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(8,8) + #pragma omp unroll partial /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile 
sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = i; j < 100; ++j) + dummy (i); + + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = 2; j < i; ++j) + dummy (i); + + #pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + { + dummy (i); + for (int j = 0; j < 100; ++j) + dummy (i); + } + + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) /* { dg-error {inner loops must be perfectly nested} } */ + { + dummy (i); + for (int j = 0; j < 100; ++j) + dummy (j); + } + + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) /* { dg-error {inner loops must be perfectly nested} } */ + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + int s; + #pragma omp tile sizes(s) /* { dg-error {'tile sizes' argument needs positive integral constant} "" { target { ! c++98_only } } } */ + /* { dg-error {the value of 's' is not usable in a constant expression} "" { target { c++ && { ! 
c++98_only } } } .-1 } */ + /* { dg-error {'s' cannot appear in a constant-expression} "" { target c++98_only } .-2 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(42.0) /* { dg-error {'tile sizes' argument needs positive integral constant} "" { target c } } */ + /* { dg-error {'tile sizes' argument needs integral type} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-2.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-2.c new file mode 100644 index 00000000000..45b9bb1a3ed --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-2.c @@ -0,0 +1,179 @@ +extern void dummy (int); + +void +test () +{ + #pragma omp parallel for + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(0) /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(-1) /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes() /* { dg-error {expected expression before} "" { target c} } */ + /* { dg-error {expected primary-expression before} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(,) /* { dg-error {expected expression before} "" { target c } } */ + /* { dg-error {expected primary-expression before} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1,2 /* { dg-error {expected '\,' before end of line} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes /* { dg-error {expected '\(' before end of line} } */ + for (int i = 0; i < 100; ++i) + dummy 
(i); + + #pragma omp parallel for + #pragma omp tile sizes(1) sizes(1) /* { dg-error {expected end of line before 'sizes'} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + #pragma omp tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(5, 6) + #pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp unroll partia /* { dg-error {expected an OpenMP clause before 'partia'} } */ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp unroll /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp unroll partial + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(8,8) + #pragma omp unroll partial /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; 
++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(8,8) + #pragma omp unroll partial /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = i; j < 100; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = 2; j < i; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + { + dummy (i); + for (int j = 0; j < 100; ++j) + dummy (i); + } + + #pragma omp parallel for + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) /* { dg-error {inner loops must be perfectly nested} } */ + { + dummy (i); + for (int j = 0; j < 100; ++j) + dummy (j); + } + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) /* { dg-error {inner loops must be perfectly nested} } */ + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + #pragma omp parallel for + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-3.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-3.c new 
file mode 100644 index 00000000000..e0ba1d6c444 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-3.c @@ -0,0 +1,109 @@ +extern void dummy (int); + +void +test () +{ + #pragma omp for + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = i; j < 100; ++j) + dummy (i); + + #pragma omp for + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < i; ++j) + dummy (i); + + +#pragma omp for collapse(1) + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(2) + #pragma omp tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(2) + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(3) + #pragma omp tile sizes(1, 2) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(1) +#pragma omp tile sizes(1) +#pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + +#pragma omp for collapse(2) +#pragma omp tile sizes(1, 2) +#pragma omp tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + dummy (i); + +#pragma omp for collapse(2) +#pragma omp tile sizes(1, 2) +#pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + dummy (i); + +#pragma omp 
for collapse(2) +#pragma omp tile sizes(5, 6) +#pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + dummy (i); + +#pragma omp for collapse(1) +#pragma omp tile sizes(1) +#pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + +#pragma omp for collapse(2) +#pragma omp tile sizes(1, 2) +#pragma omp tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(2) +#pragma omp tile sizes(1, 2) +#pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(2) +#pragma omp tile sizes(5, 6) +#pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); + +#pragma omp for collapse(3) +#pragma omp tile sizes(1, 2) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ +#pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(3) +#pragma omp tile sizes(5, 6) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ +#pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-4.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-4.c new file mode 100644 index 00000000000..d46bb0cb642 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-4.c @@ -0,0 +1,322 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include <stdio.h> + +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d, expected %d\n", 
__FILE__, __LINE__, var, val); \ + __builtin_abort (); } + +int +test1 () +{ + int iter = 0; + int i; +#pragma omp tile sizes(3) + for (i = 0; i < 10; i=i+2) + { + ASSERT_EQ (i, iter) + iter = iter + 2; + } + + ASSERT_EQ (i, 10) + return iter; +} + +int +test2 () +{ + int iter = 0; + int i; +#pragma omp tile sizes(3) + for (i = 0; i < 10; i=i+2) + { + ASSERT_EQ (i, iter) + iter = iter + 2; + } + + ASSERT_EQ (i, 10) + return iter; +} + +int +test3 () +{ + int iter = 0; + int i; +#pragma omp tile sizes(8) + for (i = 0; i < 10; i=i+2) + { + ASSERT_EQ (i, iter) + iter = iter + 2; + } + + ASSERT_EQ (i, 10) + return iter; +} + +int +test4 () +{ + int iter = 10; + int i; +#pragma omp tile sizes(8) + for (i = 10; i > 0; i=i-2) + { + ASSERT_EQ (i, iter) + iter = iter - 2; + } + ASSERT_EQ (i, 0) + return iter; +} + +int +test5 () +{ + int iter = 10; + int i; +#pragma omp tile sizes(71) + for (i = 10; i > 0; i=i-2) + { + ASSERT_EQ (i, iter) + iter = iter - 2; + } + + ASSERT_EQ (i, 0) + return iter; +} + +int +test6 () +{ + int iter = 10; + int i; +#pragma omp tile sizes(1) + for (i = 10; i > 0; i=i-2) + { + ASSERT_EQ (i, iter) + iter = iter - 2; + } + ASSERT_EQ (i, 0) + return iter; +} + +int +test7 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(2) + for (i = 5; i < -5; i=i-3) + { + fprintf (stderr, "%d\n", i); + __builtin_abort (); + iter = iter - 3; + } + + ASSERT_EQ (i, 5) + + /* No iteration expected */ + return iter; +} + +int +test8 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(2) + for (i = 5; i > -5; i=i-3) + { + ASSERT_EQ (i, iter) + /* Expect only first iteration of the last tile to execute */ + if (iter != -4) + iter = iter - 3; + } + + ASSERT_EQ (i, -7) + return iter; +} + + +int +test9 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(5) + for (i = 5; i >= -5; i=i-4) + { + ASSERT_EQ (i, iter) + /* Expect only first iteration of the last tile to execute */ + if (iter != - 3) + iter = iter - 4; + } + + ASSERT_EQ (i, -7) + return iter; +} + 
+int +test10 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(5) + for (i = 5; i >= -5; i--) + { + ASSERT_EQ (i, iter) + iter--; + } + + ASSERT_EQ (i, -6) + return iter; +} + +int +test11 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(15) + for (i = 5; i != -5; i--) + { + ASSERT_EQ (i, iter) + iter--; + } + ASSERT_EQ (i, -5) + return iter; +} + +int +test12 () +{ + int iter = 0; + unsigned i; +#pragma omp tile sizes(3) + for (i = 0; i != 5; i++) + { + ASSERT_EQ (i, iter) + iter++; + } + + ASSERT_EQ (i, 5) + return iter; +} + +int +test13 () +{ + int iter = -5; + long long unsigned int i; +#pragma omp tile sizes(15) + for (int i = -5; i < 5; i=i+3) + { + ASSERT_EQ (i, iter) + iter++; + } + + ASSERT_EQ (i, 5) + return iter; +} + +int +test14 (unsigned init, int step) +{ + int iter = init; + long long unsigned int i; +#pragma omp tile sizes(8) + for (i = init; i < 2*init; i=i+step) + iter++; + + ASSERT_EQ (i, 2*init) + return iter; +} + +int +test15 (unsigned init, int step) +{ + int iter = init; + int i; +#pragma omp tile sizes(8) + for (unsigned i = init; i > 2* init; i=i+step) + iter++; + + return iter; +} + +int +main () +{ + int last_iter; + + last_iter = test1 (); + ASSERT_EQ (last_iter, 10); + + last_iter = test2 (); + ASSERT_EQ (last_iter, 10); + + last_iter = test3 (); + ASSERT_EQ (last_iter, 10); + + last_iter = test4 (); + ASSERT_EQ (last_iter, 0); + + last_iter = test5 (); + ASSERT_EQ (last_iter, 0); + + last_iter = test6 (); + ASSERT_EQ (last_iter, 0); + + last_iter = test7 (); + ASSERT_EQ (last_iter, 5); + + last_iter = test8 (); + ASSERT_EQ (last_iter, -4); + + last_iter = test9 (); + ASSERT_EQ (last_iter, -3); + + last_iter = test10 (); + ASSERT_EQ (last_iter, -6); + return 0; + + last_iter = test11 (); + ASSERT_EQ (last_iter, -4); + return 0; + + last_iter = test12 (); + ASSERT_EQ (last_iter, 5); + return 0; + + last_iter = test13 (); + ASSERT_EQ (last_iter, 4); + return 0; + + last_iter = test14 (0, 1); + ASSERT_EQ (last_iter, 0); + 
return 0; + + last_iter = test14 (0, -1); + ASSERT_EQ (last_iter, 0); + return 0; + + last_iter = test14 (8, 2); + ASSERT_EQ (last_iter, 16); + return 0; + + last_iter = test14 (5, 3); + ASSERT_EQ (last_iter, 9); + return 0; + + last_iter = test15 (8, -1); + ASSERT_EQ (last_iter, 9); + return 0; + + last_iter = test15 (8, -2); + ASSERT_EQ (last_iter, 10); + return 0; + + last_iter = test15 (5, -3); + ASSERT_EQ (last_iter, 6); + return 0; +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-5.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-5.c new file mode 100644 index 00000000000..815318ab27a --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-5.c @@ -0,0 +1,150 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include <stdio.h> + +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +#define ASSERT_EQ_PTR(var, ptr) if (var != ptr) { fprintf (stderr, "%s:%d: Unexpected value %p\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +int +test1 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(5) + for (i = data; i < data + 10 ; i++) + { + ASSERT_EQ (*i, data[iter]); + ASSERT_EQ_PTR (i, data + iter); + iter++; + } + + ASSERT_EQ_PTR (i, data + 10) + return iter; +} + +int +test2 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(5) + for (i = data; i < data + 10 ; i=i+2) + { + ASSERT_EQ_PTR (i, data + 2 * iter); + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + ASSERT_EQ_PTR (i, data + 10) + return iter; +} + +int +test3 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(5) + for (i = data; i <= data + 9 ; i=i+2) + { + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + ASSERT_EQ_PTR (i, data + 10) + return iter; +} + +int +test4 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(5) + for (i = data; i != data + 10 ; i=i+1) + { + ASSERT_EQ 
(*i, data[iter]); + iter++; + } + + ASSERT_EQ_PTR (i, data + 10) + return iter; +} + +int +test5 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(3) + for (i = data + 9; i >= data ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + ASSERT_EQ_PTR (i, data - 1) + return iter; +} + +int +test6 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(3) + for (i = data + 9; i > data - 1 ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + ASSERT_EQ_PTR (i, data - 1) + return iter; +} + +int +test7 (int data[10]) +{ + int iter = 0; + #pragma omp tile sizes(1) + for (int *i = data + 9; i != data - 1 ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + return iter; +} + +int +main () +{ + int iter_count; + int data[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }; + + iter_count = test1 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test2 (data); + ASSERT_EQ (iter_count, 5); + + iter_count = test3 (data); + ASSERT_EQ (iter_count, 5); + + iter_count = test4 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test5 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test6 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test7 (data); + ASSERT_EQ (iter_count, 10); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-6.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-6.c new file mode 100644 index 00000000000..8132128a5a8 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-6.c @@ -0,0 +1,34 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include <stdio.h> + +int +test1 () +{ + int sum = 0; +for (int k = 0; k < 10; k++) + { +#pragma omp tile sizes(5,7) + for (int i = 0; i < 10; i++) + for (int j = 0; j < 10; j=j+2) + { + sum = sum + 1; + } + } + + return sum; +} + +int +main () +{ + int result = test1 (); + + if (result != 500) + { + fprintf (stderr, "Wrong result: %d\n", result); + __builtin_abort (); + } + return 0; +} diff --git 
a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-7.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-7.c new file mode 100644 index 00000000000..cd25a62c5c0 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-7.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include <stdio.h> +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +#define ASSERT_EQ_PTR(var, ptr) if (var != ptr) { fprintf (stderr, "%s:%d: Unexpected value %p\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +int +main () +{ + int iter_count; + int data[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }; + + int iter = 0; + int *i; + #pragma omp tile sizes(1) + for (i = data; i < data + 10; i=i+2) + { + ASSERT_EQ_PTR (i, data + 2 * iter); + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + unsigned long real_iter_count = ((unsigned long)i - (unsigned long)data) / (sizeof (int) * 2); + ASSERT_EQ (real_iter_count, 5); + + return 0; +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-8.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-8.c new file mode 100644 index 00000000000..c26e03d7e74 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-8.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include <stdio.h> + +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d, expected %d\n", __FILE__, __LINE__, var, val); \ + __builtin_abort (); } + +int +main () +{ + int iter_j = 0, iter_k = 0; + unsigned i, j, k; +#pragma omp tile sizes(3,5,8) + for (i = 0; i < 2; i=i+2) + for (j = 0; j < 3; j=j+1) + for (k = 0; k < 5; k=k+3) + { + /* fprintf (stderr, "i=%d j=%d k=%d\n", i, j, k); + * fprintf (stderr, "iter_j=%d iter_k=%d\n", iter_j, iter_k); */ + ASSERT_EQ (i, 0); + if (k == 0) + { + ASSERT_EQ (j, iter_j); + iter_k = 0; + } + + ASSERT_EQ (k, iter_k); + + 
iter_k = iter_k + 3; + if (k == 3) + iter_j++; + } + + ASSERT_EQ (i, 2); + ASSERT_EQ (j, 3); + ASSERT_EQ (k, 6); + + return 0; +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-1.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-1.c new file mode 100644 index 00000000000..d496dc29053 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-1.c @@ -0,0 +1,133 @@ +extern void dummy (int); + +void +test1 () +{ +#pragma omp unroll partial + for (int i = 0; i < 100; ++i) + dummy (i); +} + +void +test2 () +{ +#pragma omp unroll partial(10) + for (int i = 0; i < 100; ++i) + dummy (i); +} + +void +test3 () +{ +#pragma omp unroll full + for (int i = 0; i < 100; ++i) + dummy (i); +} + +void +test4 () +{ +#pragma omp unroll full + for (int i = 0; i > 100; ++i) + dummy (i); +} + +void +test5 () +{ +#pragma omp unroll full + for (int i = 1; i <= 100; ++i) + dummy (i); +} + +void +test6 () +{ +#pragma omp unroll full + for (int i = 200; i >= 100; i--) + dummy (i); +} + +void +test7 () +{ +#pragma omp unroll full + for (int i = -100; i > 100; ++i) + dummy (i); +} + +void +test8 () +{ +#pragma omp unroll full + for (int i = 100; i > -200; --i) + dummy (i); +} + +void +test9 () +{ +#pragma omp unroll full + for (int i = -300; i != 100; ++i) + dummy (i); +} + +void +test10 () +{ +#pragma omp unroll full + for (int i = -300; i != 100; ++i) + dummy (i); +} + +void +test12 () +{ +#pragma omp unroll full +#pragma omp unroll partial +#pragma omp unroll partial + for (int i = -300; i != 100; ++i) + dummy (i); +} + +void +test13 () +{ + for (int i = 0; i < 100; ++i) +#pragma omp unroll full +#pragma omp unroll partial +#pragma omp unroll partial + for (int j = -300; j != 100; ++j) + dummy (i); +} + +void +test14 () +{ + #pragma omp for + for (int i = 0; i < 100; ++i) +#pragma omp unroll full +#pragma omp unroll partial +#pragma omp unroll partial + for (int j = -300; j != 100; ++j) + dummy (i); +} + +void +test15 () +{ + #pragma omp for + for 
(int i = 0; i < 100; ++i) + { + + dummy (i); + +#pragma omp unroll full +#pragma omp unroll partial +#pragma omp unroll partial + for (int j = -300; j != 100; ++j) + dummy (j); + + dummy (i); + } + } diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c new file mode 100644 index 00000000000..cb37e7410c8 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c @@ -0,0 +1,95 @@ +/* { dg-prune-output "error: invalid controlling predicate" } */ +/* { dg-additional-options "-std=c++11" { target c++} } */ + +extern void dummy (int); + +void +test () +{ +#pragma omp unroll partial +#pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ +#pragma omp unroll partial + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ +#pragma omp unroll full + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll partial partial /* { dg-error {too many 'partial' clauses} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll full full /* { dg-error {too many 'full' clauses} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll partial +#pragma omp unroll /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + + int i; +#pragma omp for +#pragma omp unroll( /* { dg-error {expected an OpenMP clause 
before '\(' token} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll foo /* { dg-error {expected an OpenMP clause before 'foo'} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll partial( /* { dg-error {expected expression before end of line} "" { target c } } */ + /* { dg-error {expected primary-expression before end of line} "" { target c++ } .-1 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll partial() /* { dg-error {expected expression before '\)' token} "" { target c } } */ + /* { dg-error {expected primary-expression before '\)' token} "" { target c++ } .-1 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll partial(i) + /* { dg-error {the value of 'i' is not usable in a constant expression} "" { target c++ } .-1 } */ + /* { dg-error {partial argument needs positive constant integer expression} "" { target *-*-* } .-2 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll parti /* { dg-error {expected an OpenMP clause before 'parti'} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll partial(1) +#pragma omp unroll parti /* { dg-error {expected an OpenMP clause before 'parti'} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll partial(1) +#pragma omp unroll parti /* { dg-error {expected an OpenMP clause before 'parti'} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +int sum = 0; +#pragma omp parallel for reduction(+ : sum) collapse(2) +#pragma omp unroll partial(1) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + for (int i = 3; i < 10; ++i) + for (int j = -2; j < 7; ++j) + sum++; +} + diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-3.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-3.c new file mode 100644 index 00000000000..7ace5657b26 --- /dev/null +++ 
b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-3.c @@ -0,0 +1,18 @@ +/* { dg-additional-options "-fdump-tree-omp_transform_loops" } + * { dg-additional-options "-fdump-tree-original" } */ + +extern void dummy (int); + +void +test1 () +{ + int i; +#pragma omp unroll full + for (int i = 0; i < 10; i++) + dummy (i); +} + + /* Loop should be removed with 10 copies of the body remaining + * { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } + * { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } + * { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } */ diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-4.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-4.c new file mode 100644 index 00000000000..5e473a099d3 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-4.c @@ -0,0 +1,19 @@ +/* { dg-additional-options "-fdump-tree-omp_transform_loops" } + * { dg-additional-options "-fdump-tree-original" } */ + +extern void dummy (int); + +void +test1 () +{ + int i; +#pragma omp unroll + for (int i = 0; i < 100; i++) + dummy (i); +} + +/* Loop should not be unrolled, but the internal representation should be lowered + * { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } + * { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times "dummy" 1 "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times {if \(i < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } */ diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-5.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-5.c new file mode 100644 index 00000000000..9d5101bdc60 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-5.c @@ -0,0 +1,19 @@ +/* { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } + * { dg-additional-options "-fdump-tree-original" } 
*/ + +extern void dummy (int); + +void +test1 () +{ + int i; +#pragma omp unroll partial /* { dg-optimized {'partial' clause without unrolling factor turned into 'partial\(5\)' clause} } */ + for (int i = 0; i < 100; i++) + dummy (i); +} + +/* Loop should be unrolled 5 times and the internal representation should be lowered + * { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } + * { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times "dummy" 5 "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times {if \(i < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } */ diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-6.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-6.c new file mode 100644 index 00000000000..ee2d000239d --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-6.c @@ -0,0 +1,20 @@ +/* { dg-additional-options "--param=omp-unroll-default-factor=100" } + * { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } + * { dg-additional-options "-fdump-tree-original" } */ + +extern void dummy (int); + +void +test1 () +{ + int i; +#pragma omp unroll /* { dg-optimized {added 'partial\(100\)' clause to 'omp unroll' directive} } */ + for (int i = 0; i < 100; i++) + dummy (i); +} + +/* Loop should be unrolled 100 times and the internal representation should be lowered + * { dg-final { scan-tree-dump "#pragma omp loop_transform unroll_none" "original" } } + * { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times "dummy" 100 "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times {if \(i < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } */ diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-7.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-7.c new file mode 100644 index 00000000000..0458cb030a9 --- /dev/null 
+++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-7.c @@ -0,0 +1,144 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include <stdio.h> + +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +#define ASSERT_EQ_PTR(var, ptr) if (var != ptr) { fprintf (stderr, "%s:%d: Unexpected value %p\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +int +test1 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(8) + for (i = data; i < data + 10 ; i++) + { + ASSERT_EQ (*i, data[iter]); + ASSERT_EQ_PTR (i, data + iter); + iter++; + } + + return iter; +} + +int +test2 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(8) + for (i = data; i < data + 10 ; i=i+2) + { + ASSERT_EQ_PTR (i, data + 2 * iter); + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + return iter; +} + +int +test3 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(8) + for (i = data; i <= data + 9 ; i=i+2) + { + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + return iter; +} + +int +test4 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(8) + for (i = data; i != data + 10 ; i=i+1) + { + ASSERT_EQ (*i, data[iter]); + iter++; + } + + return iter; +} + +int +test5 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(7) + for (i = data + 9; i >= data ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + return iter; +} + +int +test6 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(7) + for (i = data + 9; i > data - 1 ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + return iter; +} + +int +test7 (int data[10]) +{ + int iter = 0; + #pragma omp unroll partial(7) + for (int *i = data + 9; i != data - 1 ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + return iter; +} + +int +main () +{ + int iter_count; + int data[10] = { 0, 1, 2, 
3, 4, 5, 6, 7, 8, 9 }; + + iter_count = test1 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test2 (data); + ASSERT_EQ (iter_count, 5); + + iter_count = test3 (data); + ASSERT_EQ (iter_count, 5); + + iter_count = test4 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test5 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test6 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test7 (data); + ASSERT_EQ (iter_count, 10); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-8.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-8.c new file mode 100644 index 00000000000..d49d7c42c87 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-8.c @@ -0,0 +1,76 @@ +extern void dummy(int); + +void +test1 () +{ +#pragma omp unroll full /* { dg-warning "Cannot apply full unrolling to infinite loop" } */ + for (int i = 101; i > 100; i++) + dummy (i); +} + + +void +test2 () +{ +#pragma omp unroll full + for (int i = 101; i != 100; i++) + dummy (i); +} + +void +test3 () +{ +#pragma omp unroll full /* { dg-warning "Cannot apply full unrolling to infinite loop" } */ + for (int i = 0; i <= 0; i--) + dummy (i); +} + +void +test4 () +{ +#pragma omp unroll full /* { dg-warning "Cannot apply full unrolling to infinite loop" } */ + for (int i = 101; i > 100; i=i+2) + dummy (i); +} + +void +test5 () +{ +#pragma omp unroll full /* { dg-warning "Cannot apply full unrolling to infinite loop" } */ + for (int i = -101; i < 100; i=i-10) + dummy (i); +} + +void +test6 () +{ +#pragma omp unroll full /* { dg-warning "Cannot apply full unrolling to infinite loop" } */ + for (int i = -101; i < 100; i=i-300) + dummy (i); +} + +void +test7 () +{ +#pragma omp unroll full /* { dg-warning "Cannot apply full unrolling to infinite loop" } */ + for (int i = 101; i > -100; i=i+300) + dummy (i); + + /* Loop does not iterate, hence no warning. 
*/ +#pragma omp unroll full + for (int i = 101; i > 101; i=i+300) + dummy (i); +} + +void +test8 () +{ +#pragma omp unroll full /* { dg-warning "Cannot apply full unrolling to infinite loop" } */ + for (int i = -21; i < -20; i=i-40) + dummy (i); + + /* Loop does not iterate, hence no warning. */ +#pragma omp unroll full + for (int i = -21; i > 20; i=i-40) + dummy (i); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-1.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-1.c new file mode 100644 index 00000000000..c365d942591 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-1.c @@ -0,0 +1,15 @@ +/* { dg-additional-options "-std=c++11" { target c++} } */ + +extern void dummy (int); + +void +test () +{ + +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) + #pragma omp unroll partial + for (int j = 0; j != 100; ++j) + dummy (i); +} + diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-2.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-2.c new file mode 100644 index 00000000000..2082009a385 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-2.c @@ -0,0 +1,29 @@ +/* { dg-additional-options "-std=c++11" { target c++} } */ + +extern void dummy (int); + +void +test () +{ + +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) +#pragma omp tile sizes(2) + for (int j = 0; j != 100; ++j) + dummy (i); + +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) /* { dg-error {not enough nested loops} } */ +#pragma omp tile sizes(2, 3) + for (int j = 0; j != 100; ++j) + dummy (i); + +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) +#pragma omp tile sizes(2, 3) + for (int j = 0; j != 100; ++j) + for (int k = 0; k != 100; ++k) + dummy (i); +} + + diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-1.c 
b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-1.c new file mode 100644 index 00000000000..40e7f8e4bfb --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-1.c @@ -0,0 +1,37 @@ +extern void dummy (int); + +void +test1 () +{ +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) +#pragma omp unroll partial(2) + for (int j = i * 2; j <= i * 4 + 1; ++j) + dummy (i); + +#pragma omp target parallel for collapse(3) + for (int i = -300; i != 100; ++i) + for (int j = i; j != i * 2; ++j) + #pragma omp unroll partial + for (int k = 2; k != 100; ++k) + dummy (i); + +#pragma omp unroll full + for (int i = -300; i != 100; ++i) + for (int j = i; j != i * 2; ++j) + for (int k = 2; k != 100; ++k) + dummy (i); + + for (int i = -300; i != 100; ++i) +#pragma omp unroll full + for (int j = i; j != i + 10; ++j) + for (int k = 2; k != 100; ++k) + dummy (i); + + for (int i = -300; i != 100; ++i) +#pragma omp unroll full + for (int j = i; j != i + 10; ++j) + for (int k = j; k != 100; ++k) + dummy (i); +} + diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-2.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-2.c new file mode 100644 index 00000000000..7696e5d5fab --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-2.c @@ -0,0 +1,22 @@ +extern void dummy (int); + +void +test1 () +{ +#pragma omp target parallel for collapse(2) /* { dg-error {invalid OpenMP non-rectangular loop step; \'\(1 - 0\) \* 1\' is not a multiple of loop 2 step \'5\'} "" { target c } } */ + for (int i = -300; i != 100; ++i) /* { dg-error {invalid OpenMP non-rectangular loop step; \'\(1 - 0\) \* 1\' is not a multiple of loop 2 step \'5\'} "" { target c++ } } */ +#pragma omp unroll partial + for (int j = 2; j != i; ++j) + dummy (i); +} + +void +test2 () +{ + int i,j; +#pragma omp target parallel for collapse(2) + for (i = -300; i != 100; ++i) + #pragma omp unroll partial 
+ for (j = 2; j != i; ++j) + dummy (i); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-simd-1.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-simd-1.c new file mode 100644 index 00000000000..1cd4d6e7322 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-simd-1.c @@ -0,0 +1,84 @@ +/* { dg-options "-fno-openmp -fopenmp-simd" } */ +/* { dg-do run } */ +/* { dg-additional-options "-fdump-tree-original" } */ +/* { dg-additional-options "-fdump-tree-omp_transform_loops" } */ + +#include <stdio.h> + +int compute_sum1 () +{ + int sum = 0; + int i,j; + +#pragma omp simd reduction(+:sum) + for (i = 3; i < 10; ++i) + #pragma omp unroll full + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort; + + return sum; +} + +int compute_sum2() +{ + int sum = 0; + int i,j; +#pragma omp simd reduction(+:sum) +#pragma omp unroll partial(5) + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort; + + return sum; +} + +int compute_sum3() +{ + int sum = 0; + int i,j; +#pragma omp simd reduction(+:sum) +#pragma omp unroll partial(1) + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort; + + return sum; +} + +int main () +{ + int result = compute_sum1 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + result = compute_sum1 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + result = compute_sum3 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + return 0; +} + +/* { dg-final { scan-tree-dump {omp loop_transform} "original" } } */ +/* { dg-final { scan-tree-dump-not {omp loop_transform} "omp_transform_loops" } } */ diff --git a/gcc/testsuite/g++.dg/gomp/attrs-4.C b/gcc/testsuite/g++.dg/gomp/attrs-4.C index 005add826ba..a730ad7db50 
100644 --- a/gcc/testsuite/g++.dg/gomp/attrs-4.C +++ b/gcc/testsuite/g++.dg/gomp/attrs-4.C @@ -49,7 +49,7 @@ foo (int x) for (int i = 0; i < 16; i++) ; #pragma omp for - [[omp::directive (master)]] // { dg-error "for statement expected before '\\\[' token" } + [[omp::directive (master)]] // { dg-error "loop nest expected before '\\\[' token" } ; #pragma omp target teams [[omp::directive (parallel)]] // { dg-error "mixing OpenMP directives with attribute and pragma syntax on the same statement" } diff --git a/gcc/testsuite/g++.dg/gomp/for-1.C b/gcc/testsuite/g++.dg/gomp/for-1.C index f8bb9d54727..0e042fd1381 100644 --- a/gcc/testsuite/g++.dg/gomp/for-1.C +++ b/gcc/testsuite/g++.dg/gomp/for-1.C @@ -24,7 +24,7 @@ void foo (int j, int k) // Malformed parallel loops. #pragma omp for - i = 0; // { dg-error "for statement expected" } + i = 0; // { dg-error "loop nest expected" } for ( ; i < 10; ) { baz (i); diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-1.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-1.C new file mode 100644 index 00000000000..0906ff3bbe8 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-1.C @@ -0,0 +1,164 @@ +// { dg-do compile { target c++11 } } + +extern void dummy (int); + +void +test () +{ + [[omp::directive (tile sizes(1))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::directive (tile sizes(0))]] /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::directive (tile sizes(-1))]] /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::directive (tile sizes())]] /* { dg-error {expected primary-expression before} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::directive (tile sizes)]] /* { dg-error {expected '\(' before end of line} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::directive (tile sizes(1) 
sizes(1))]] /* { dg-error {expected end of line before 'sizes'} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::directive (tile, sizes(1), sizes(1))]] /* { dg-error {expected end of line before ','} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence + (directive (tile sizes(1)), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence + (directive (tile sizes(1, 2)), + directive (tile sizes(1)))]] /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence + (directive (tile sizes(1, 2)), + directive (tile sizes(1, 2)))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence + (directive (tile sizes(5, 6)), + directive (tile sizes(1, 2, 3)))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); + + [[omp::sequence + (directive (tile sizes(1)), + directive (unroll partia), /* { dg-error {expected an OpenMP clause before 'partia'} } */ + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence + (directive (tile sizes(1)), + directive (unroll))]] /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence + (directive (tile sizes(1)), + directive (unroll full))]] /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence + (directive (tile sizes(1)), + directive (unroll partial), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence + (directive (tile sizes(8,8)), + directive (unroll partial), /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + 
directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence + (directive (tile sizes(8,8)), + directive (unroll partial))]] /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::directive (tile sizes(1, 2))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::directive (tile sizes(1, 2))]] /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = i; j < 100; ++j) + dummy (i); + + [[omp::directive (tile sizes(1, 2))]] /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = 2; j < i; ++j) + dummy (i); + + [[omp::directive (tile sizes(1, 2, 3))]] + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::directive (tile sizes(1))]] + for (int i = 0; i < 100; ++i) + { + dummy (i); + for (int j = 0; j < 100; ++j) + dummy (i); + } + + [[omp::directive (tile sizes(1))]] + for (int i = 0; i < 100; ++i) + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + [[omp::directive (tile sizes(1, 2))]] + for (int i = 0; i < 100; ++i) /* { dg-error {inner loops must be perfectly nested} } */ + { + dummy (i); + for (int j = 0; j < 100; ++j) + dummy (j); + } + + [[omp::directive (tile sizes(1, 2))]] + for (int i = 0; i < 100; ++i) /* { dg-error {inner loops must be perfectly nested} } */ + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + int s; + [[omp::directive (tile sizes(s))]] /* { dg-error {'tile sizes' argument needs positive integral constant} "" { target { ! c++98_only } } } */ + /* { dg-error {the value of 's' is not usable in a constant expression} "" { target { c++ && { ! 
c++98_only } } } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::directive (tile sizes(42.0))]] /* { dg-error {'tile sizes' argument needs integral type} } */ + for (int i = 0; i < 100; ++i) + dummy (i); +} diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-2.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-2.C new file mode 100644 index 00000000000..ab02924defa --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-2.C @@ -0,0 +1,174 @@ +// { dg-do compile { target c++11 } } + +extern void dummy (int); + +void +test () +{ + [[omp::sequence (directive (parallel for), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(0)))]] /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(-1)))]] /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes()))]] /* { dg-error {expected primary-expression before} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(,)))]] /* { dg-error {expected primary-expression before} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes))]] /* { dg-error {expected '\(' before end of line} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1) sizes(1)))]] /* { dg-error {expected end of line before 'sizes'} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1)), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence 
(directive (parallel for), + directive (tile sizes(1, 2)), + directive (tile sizes(1)))]] /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1, 2)), + directive (tile sizes(1, 2)))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(5, 6)), + directive (tile sizes(1, 2, 3)))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1)), + directive (unroll partia), /* { dg-error {expected an OpenMP clause before 'partia'} } */ + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1)), + directive (unroll))]] /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1)), + directive (unroll full))]] /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1)), + directive (unroll partial), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(8,8)), + directive (unroll partial), /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(8,8)), + directive (unroll partial))]] /* { dg-error 
{nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1, 2)))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1, 2)))]] /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = i; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1, 2)))]] /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = 2; j < i; ++j) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1, 2, 3)))]] + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + { + dummy (i); + for (int j = 0; j < 100; ++j) + dummy (i); + } + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1, 2)))]] + for (int i = 0; i < 100; ++i) /* { dg-error {inner loops must be perfectly nested} } */ + { + dummy (i); + for (int j = 0; j < 100; ++j) + dummy (j); + } + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1, 2)))]] + for (int i = 0; i < 100; ++i) /* { dg-error {inner loops must be perfectly nested} } */ + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + [[omp::sequence (directive (parallel for), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); +} diff --git 
a/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-3.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-3.C new file mode 100644 index 00000000000..95a0115b014 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-tile-3.C @@ -0,0 +1,111 @@ +// { dg-do compile { target c++11 } } + +extern void dummy (int); + +void +test () +{ + [[omp::sequence (directive (for), + directive (tile sizes(1, 2)))]] /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = i; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (for), + directive (tile sizes(1, 2)))]] /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < i; ++j) + dummy (i); + + + [[omp::sequence (directive (for collapse(1)), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (for collapse(2)), + directive (tile sizes(1)))]] /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (for collapse(2)), + directive (tile sizes(1, 2)))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (for collapse(3)), + directive (tile sizes(1, 2)))]] /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (for collapse(1)), + directive (tile sizes(1)), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (for collapse(2)), + directive (tile sizes(1, 2)), + directive (tile sizes(1)))]] /* { dg-error {nesting depth left after this 
transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + dummy (i); + + [[omp::sequence (directive (for collapse(2)), + directive (tile sizes(1, 2)), + directive (tile sizes(1, 2)))]] + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + dummy (i); + + [[omp::sequence (directive (for collapse(2)), + directive (tile sizes(5, 6)), + directive (tile sizes(1, 2, 3)))]] + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + dummy (i); + + [[omp::sequence (directive (for collapse(1)), + directive (tile sizes(1)), + directive (tile sizes(1)))]] + for (int i = 0; i < 100; ++i) + dummy (i); + + [[omp::sequence (directive (for collapse(2)), + directive (tile sizes(1, 2)), + directive (tile sizes(1)))]] /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (for collapse(2)), + directive (tile sizes(1, 2)), + directive (tile sizes(1, 2)))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (for collapse(2)), + directive (tile sizes(5, 6)), + directive (tile sizes(1, 2, 3)))]] + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); + + [[omp::sequence (directive (for collapse(3)), + directive (tile sizes(1, 2)), /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + directive (tile sizes(1, 2)))]] + for (int i = 0; i < 100; ++i) /* { dg-error {not enough nested loops} } */ + for (int j = 0; j < 100; ++j) + dummy (i); + + [[omp::sequence (directive (for collapse(3)), + directive (tile sizes(5, 6)), /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + directive (tile sizes(1, 2, 3)))]] + for (int i = 0; i < 100; ++i) + 
for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); +} diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-1.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-1.C new file mode 100644 index 00000000000..5b93b9fa59e --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-1.C @@ -0,0 +1,135 @@ +// { dg-do compile { target c++11 } } + +extern void dummy (int); + +void +test1 () +{ +[[omp::directive (unroll partial)]] + for (int i = 0; i < 100; ++i) + dummy (i); +} + +void +test2 () +{ +[[omp::directive (unroll partial(10))]] + for (int i = 0; i < 100; ++i) + dummy (i); +} + +void +test3 () +{ +[[omp::directive (unroll full)]] + for (int i = 0; i < 100; ++i) + dummy (i); +} + +void +test4 () +{ +[[omp::directive (unroll full)]] + for (int i = 0; i > 100; ++i) + dummy (i); +} + +void +test5 () +{ +[[omp::directive (unroll full)]] + for (int i = 1; i <= 100; ++i) + dummy (i); +} + +void +test6 () +{ +[[omp::directive (unroll full)]] + for (int i = 200; i >= 100; i--) + dummy (i); +} + +void +test7 () +{ +[[omp::directive (unroll full)]] + for (int i = -100; i > 100; ++i) + dummy (i); +} + +void +test8 () +{ +[[omp::directive (unroll full)]] + for (int i = 100; i > -200; --i) + dummy (i); +} + +void +test9 () +{ +[[omp::directive (unroll full)]] + for (int i = -300; i != 100; ++i) + dummy (i); +} + +void +test10 () +{ +[[omp::directive (unroll full)]] + for (int i = -300; i != 100; ++i) + dummy (i); +} + +void +test12 () +{ +[[omp::sequence (directive (unroll full), + directive (unroll partial), + directive (unroll partial))]] + for (int i = -300; i != 100; ++i) + dummy (i); +} + +void +test13 () +{ + for (int i = 0; i < 100; ++i) +[[omp::sequence (directive (unroll full), + directive (unroll partial), + directive (unroll partial))]] + for (int j = -300; j != 100; ++j) + dummy (i); +} + +void +test14 () +{ + [[omp::directive (for)]] + for (int i = 0; i < 100; ++i) + [[omp::sequence (directive (unroll 
full), + directive (unroll partial), + directive (unroll partial))]] + for (int j = -300; j != 100; ++j) + dummy (i); +} + +void +test15 () +{ + [[omp::directive (for)]] + for (int i = 0; i < 100; ++i) + { + + dummy (i); + + [[omp::sequence (directive (unroll full), + directive (unroll partial), + directive (unroll partial))]] + for (int j = -300; j != 100; ++j) + dummy (j); + + dummy (i); + } + } diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-2.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-2.C new file mode 100644 index 00000000000..1a45eadec64 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-2.C @@ -0,0 +1,81 @@ +/* { dg-prune-output "error: invalid controlling predicate" } */ +// { dg-do compile { target c++11 } } + +extern void dummy (int); + +void +test () +{ +[[omp::sequence (directive (unroll partial), + directive (unroll full))]] /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +[[omp::sequence (directive (for), + directive (unroll full), /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + directive (unroll partial))]] + for (int i = -300; i != 100; ++i) + dummy (i); + +[[omp::sequence (directive (for), + directive (unroll full), /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + directive (unroll full))]] + for (int i = -300; i != 100; ++i) + dummy (i); + +[[omp::sequence (directive (for), + directive (unroll partial partial))]] /* { dg-error {too many 'partial' clauses} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +[[omp::directive (unroll full full)]] /* { dg-error {too many 'full' clauses} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +[[omp::sequence (directive (unroll partial), + directive (unroll))]] /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; 
++i) + dummy (i); + +[[omp::sequence (directive (for), + directive (unroll))]] /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + + int i; + +[[omp::sequence (directive (for), + directive (unroll foo))]] /* { dg-error {expected an OpenMP clause before 'foo'} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +[[omp::directive (unroll partial(i))]] + /* { dg-error {the value of 'i' is not usable in a constant expression} "" { target c++ } .-1 } */ + /* { dg-error {partial argument needs positive constant integer expression} "" { target *-*-* } .-2 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +[[omp::directive (unroll parti)]] /* { dg-error {expected an OpenMP clause before 'parti'} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +[[omp::sequence (directive (for), + directive (unroll partial(1)), + directive (unroll parti))]] /* { dg-error {expected an OpenMP clause before 'parti'} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +[[omp::sequence (directive (for), + directive (unroll partial(1)), + directive (unroll parti))]] /* { dg-error {expected an OpenMP clause before 'parti'} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +int sum = 0; +[[omp::sequence (directive (parallel for reduction(+ : sum) collapse(2)), + directive (unroll partial(1)))]] /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + for (int i = 3; i < 10; ++i) + for (int j = -2; j < 7; ++j) + sum++; +} + diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-3.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-3.C new file mode 100644 index 00000000000..20c11c0f314 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-3.C @@ -0,0 +1,20 @@ +// { dg-do compile { target c++11 } } + +/* { dg-additional-options "-fdump-tree-omp_transform_loops" } + * { dg-additional-options 
"-fdump-tree-original" } */ + +extern void dummy (int); + +void +test1 () +{ + int i; + [[omp::directive (unroll full)]] + for (int i = 0; i < 10; i++) + dummy (i); +} + + /* Loop should be removed with 10 copies of the body remaining + * { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } + * { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } + * { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } */ diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-1.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-1.C new file mode 100644 index 00000000000..234753ad017 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-1.C @@ -0,0 +1,15 @@ +// { dg-do compile { target c++11 } } + +extern void dummy (int); + +void +test () +{ + + [[omp::directive (target parallel for collapse(2))]] + for (int i = -300; i != 100; ++i) + [[omp::directive (unroll, partial)]] + for (int j = 0; j != 100; ++j) + dummy (i); +} + diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-2.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-2.C new file mode 100644 index 00000000000..26cc665007d --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-2.C @@ -0,0 +1,29 @@ +// { dg-do compile { target c++11 } } + +extern void dummy (int); + +void +test () +{ + +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) + [[omp::directive (tile sizes (2))]] + for (int j = 0; j != 100; ++j) + dummy (i); + + [[omp::directive (target parallel for collapse(2))]] + for (int i = -300; i != 100; ++i) /* { dg-error {not enough nested loops} } */ + [[omp::directive (tile sizes(2, 3))]] + for (int j = 0; j != 100; ++j) + dummy (i); + + [[omp::directive (target parallel for, collapse(2))]] + for (int i = -300; i != 100; ++i) + [[omp::directive (tile, sizes(2, 3))]] + for (int j = 0; j != 100; ++j) + for (int k = 0; k 
!= 100; ++k) + dummy (i); +} + + diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-3.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-3.C new file mode 100644 index 00000000000..46970b84a24 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/attrs-unroll-inner-3.C @@ -0,0 +1,71 @@ +// { dg-do compile { target c++11 } } + +// Test that omp::sequence is handled properly in a loop nest, but that +// invalid attribute specifiers are rejected. + +extern void dummy (int); + +void +test1 () +{ + [[omp::directive (target parallel for collapse(2))]] + for (int i = -300; i != 100; ++i) + [[omp::sequence (directive (unroll, partial))]] // OK + for (int j = 0; j != 100; ++j) + dummy (i); +} + +void +test2 () +{ + [[omp::directive (target parallel for collapse(2))]] + for (int i = -300; i != 100; ++i) + [[omp::directive (masked)]] // { dg-error "loop nest expected" } + for (int j = 0; j != 100; ++j) + dummy (i); +} + +void +test3 () +{ + [[omp::directive (target parallel for collapse(2))]] + for (int i = -300; i != 100; ++i) + [[omp::directive (unroll, partial)]] // { dg-error "attributes on the same statement" } + [[omp::directive (masked)]] + for (int j = 0; j != 100; ++j) + dummy (i); +} + +void +test4 () +{ + [[omp::directive (target parallel for collapse(2))]] + for (int i = -300; i != 100; ++i) + [[omp::sequence (directive (unroll, partial), + directive (masked))]] // { dg-error "loop nest expected" } + for (int j = 0; j != 100; ++j) + dummy (i); +} + +void +test5 () +{ + [[omp::directive (target parallel for collapse(2))]] + for (int i = -300; i != 100; ++i) + [[omp::sequence (directive (masked), // { dg-error "loop nest expected" } + directive (unroll, partial))]] + for (int j = 0; j != 100; ++j) + dummy (i); +} + +void +test6 () +{ + [[omp::directive (target parallel for collapse(2))]] + for (int i = -300; i != 100; ++i) + [[omp::directive (unroll, partial), // { dg-error "attributes on the same statement" } + 
omp::directive (masked)]] + for (int j = 0; j != 100; ++j) + dummy (i); +} + diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1.h b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1.h new file mode 100644 index 00000000000..166d1d48677 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1.h @@ -0,0 +1,27 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } + +#include + +extern void dummy (int); + +template void +test1_template () +{ + std::vector v; + + for (unsigned i = 0; i < 10; i++) + v.push_back (i); + +#pragma omp for + for (int i : v) + dummy (i); + +#pragma omp tile sizes (U, 10, V) + for (T i : v) + for (T j : v) + for (T k : v) + dummy (i); +} + +void test () { test1_template (); }; diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1a.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1a.C new file mode 100644 index 00000000000..1ee76da3d4a --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1a.C @@ -0,0 +1,27 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } + +#include + +extern void dummy (int); + +template void +test1_template () +{ + std::vector v; + + for (unsigned i = 0; i < 10; i++) + v.push_back (i); + +#pragma omp teams distribute parallel for num_teams(V) + for (int i : v) + dummy (i); + +#pragma omp tile sizes (V, U) + for (T i : v) + for (T j : v) + for (T k : v) + dummy (i); +} + +void test () { test1_template (); }; diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1b.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1b.C new file mode 100644 index 00000000000..263c9b301c6 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1b.C @@ -0,0 +1,27 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } + +#include + +extern void dummy (int); + +template void +test1_template () +{ + std::vector v; + + for (unsigned i = 0; i < 10; i++) + v.push_back (i); + +#pragma omp for + for (int i : v) + dummy (i); + +#pragma omp tile 
sizes (U, 10, V) // { dg-error {'tile sizes' argument needs positive integral constant} } + for (T i : v) + for (T j : v) + for (T k : v) + dummy (i); +} + +void test () { test1_template (); }; diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-1.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-1.C new file mode 100644 index 00000000000..cba37c88ebe --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-1.C @@ -0,0 +1,42 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } +#include + +extern void dummy (int); + +void +test1 () +{ + std::vector v; + + for (unsigned i = 0; i < 1000; i++) + v.push_back (i); + +#pragma omp for + for (int i : v) + dummy (i); + +#pragma omp unroll partial(5) + for (int i : v) + dummy (i); +} + +void +test2 () +{ + std::vector> v; + + for (unsigned i = 0; i < 10; i++) + { + std::vector u; + for (unsigned j = 0; j < 10; j++) + u.push_back (j); + v.push_back (u); + } + +#pragma omp for +#pragma omp unroll partial(5) + for (auto u : v) + for (int i : u) + dummy (i); +} diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-2.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-2.C new file mode 100644 index 00000000000..f606f3de757 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-2.C @@ -0,0 +1,47 @@ +// { dg-do link } +// { dg-additional-options "-std=c++11" } +#include + +extern void dummy (int); + +template void +test_template () +{ + std::vector v; + + for (unsigned i = 0; i < 1000; i++) + v.push_back (i); + +#pragma omp for + for (int i : v) + dummy (i); + +#pragma omp unroll partial(U1) + for (T i : v) + dummy (i); + +#pragma omp unroll partial(U2) // { dg-error {partial argument needs positive constant integer expression} } + for (T i : v) + dummy (i); + +#pragma omp unroll partial(U3) // { dg-error {partial argument needs positive constant integer expression} } + for (T i : v) + dummy (i); + +#pragma omp for +#pragma omp unroll partial(U1) + for (T i : v) + dummy 
(i); + +#pragma omp for +#pragma omp unroll partial(U2) // { dg-error {partial argument needs positive constant integer expression} } + for (T i : v) + dummy (i); + +#pragma omp for +#pragma omp unroll partial(U3) // { dg-error {partial argument needs positive constant integer expression} } + for (T i : v) + dummy (i); +} + +void test () { test_template (); }; diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-3.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-3.C new file mode 100644 index 00000000000..ae9f5500360 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-3.C @@ -0,0 +1,37 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } +// { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +// { dg-additional-options "-fdump-tree-original" } +#include + +extern void dummy (int); + +constexpr unsigned fib (unsigned n) +{ + return n <= 2 ? 1 : fib (n-1) + fib (n-2); +} + +void +test1 () +{ + std::vector v; + + for (unsigned i = 0; i < 1000; i++) + v.push_back (i); + +#pragma omp unroll partial(fib(10)) + for (int i : v) + dummy (i); +} + + +// Loop should be unrolled fib(10) = 55 times +// ! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial\(55\)} "original" } } +// ! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +// ! { dg-final { scan-tree-dump-times "dummy" 55 "omp_transform_loops" } } + +// There should be one loop that fills the vector ... +// ! { dg-final { scan-tree-dump-times {if \(i.*? <= .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } + +// ... and one resulting from the lowering of the unrolled loop +// ! 
{ dg-final { scan-tree-dump-times {if \(D\.[0-9]+ < retval.+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/g++.dg/gomp/pr94512.C b/gcc/testsuite/g++.dg/gomp/pr94512.C index 8ba0e65795f..1d5cf150987 100644 --- a/gcc/testsuite/g++.dg/gomp/pr94512.C +++ b/gcc/testsuite/g++.dg/gomp/pr94512.C @@ -8,7 +8,7 @@ void bar () { #pragma omp parallel master taskloop - foo (); // { dg-error "for statement expected before" } + foo (); // { dg-error "loop nest expected before" } } void diff --git a/gcc/testsuite/gcc.dg/gomp/for-1.c b/gcc/testsuite/gcc.dg/gomp/for-1.c index 80e0d0be844..ecaf0c55796 100644 --- a/gcc/testsuite/gcc.dg/gomp/for-1.c +++ b/gcc/testsuite/gcc.dg/gomp/for-1.c @@ -26,7 +26,7 @@ void foo (int j, int k) /* Malformed parallel loops. */ #pragma omp for - i = 0; /* { dg-error "3:for statement expected" } */ + i = 0; /* { dg-error "3:loop nest expected" } */ for ( ; i < 10; ) { baz (i); diff --git a/gcc/testsuite/gcc.dg/gomp/for-11.c b/gcc/testsuite/gcc.dg/gomp/for-11.c index 8c747cdb981..abafa487283 100644 --- a/gcc/testsuite/gcc.dg/gomp/for-11.c +++ b/gcc/testsuite/gcc.dg/gomp/for-11.c @@ -30,7 +30,7 @@ void foo (int j, int k) /* Malformed parallel loops. 
*/ #pragma omp for - i = 0; /* { dg-error "for statement expected" } */ + i = 0; /* { dg-error "loop nest expected" } */ for ( ; i < 10; ) { baz (i); diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/matrix-no-directive-unroll-full-1.C b/libgomp/testsuite/libgomp.c++/loop-transforms/matrix-no-directive-unroll-full-1.C new file mode 100644 index 00000000000..3a684219627 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/matrix-no-directive-unroll-full-1.C @@ -0,0 +1,13 @@ +/* { dg-additional-options { -O0 -fdump-tree-original -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE +#define COMMON_TOP_TRANSFORM omp unroll full +#define COLLAPSE_1 +#define COLLAPSE_2 +#define COLLAPSE_3 +#define IMPLEMENTATION_FILE "../../libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h" + +#include "../../libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h" + +/* A consistency check to prevent broken macro usage. */ +/* { dg-final { scan-tree-dump-times "unroll_full" 13 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/tile-2.C b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-2.C new file mode 100644 index 00000000000..780421fa4c7 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-2.C @@ -0,0 +1,69 @@ +// { dg-additional-options "-std=c++11" } +// { dg-additional-options "-O0" } + +#include +#include + +constexpr unsigned fib (unsigned n) +{ + return n <= 2 ? 
1 : fib (n-1) + fib (n-2); +} + +int +test1 () +{ + std::vector v; + + for (unsigned i = 0; i <= 9; i++) + v.push_back (1); + + int sum = 0; + for (int k = 0; k < 10; k++) + #pragma omp tile sizes(fib(4)) + for (int i : v) { + for (int j = 8; j != -2; --j) + sum = sum + i; + } + + return sum; +} + +int +test2 () +{ + std::vector v; + + for (unsigned i = 0; i <= 10; i++) + v.push_back (i); + + int sum = 0; + for (int k = 0; k < 10; k++) +#pragma omp parallel for collapse(2) reduction(+:sum) +#pragma omp tile sizes(fib(4), 1) + for (int i : v) + for (int j = 8; j > -2; --j) + sum = sum + i; + + return sum; +} + +int +main () +{ + int result = test1 (); + + if (result != 1000) + { + fprintf (stderr, "%d: Wrong result: %d\n", __LINE__, result); + __builtin_abort (); + } + + result = test2 (); + if (result != 5500) + { + fprintf (stderr, "%d: Wrong result: %d\n", __LINE__, result); + __builtin_abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/tile-3.C b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-3.C new file mode 100644 index 00000000000..91ec8f5c137 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-3.C @@ -0,0 +1,28 @@ +// { dg-additional-options "-std=c++11" } +// { dg-additional-options "-O0" } + +#include + +int +main () +{ + std::vector v; + std::vector w; + + for (unsigned i = 0; i <= 9; i++) + v.push_back (i); + + int iter = 0; +#pragma omp for +#pragma omp tile sizes(5) + for (int i : v) + { + w.push_back (iter); + iter++; + } + + for (int i = 0; i < w.size (); i++) + if (w[i] != i) + __builtin_abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-1.C b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-1.C new file mode 100644 index 00000000000..004eef91649 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-1.C @@ -0,0 +1,73 @@ +// { dg-additional-options "-std=c++11" } +// { dg-additional-options "-O0" } + +#include +#include + 
+constexpr unsigned fib (unsigned n) +{ + return n <= 2 ? 1 : fib (n-1) + fib (n-2); +} + +int +test1 () +{ + std::vector v; + + for (unsigned i = 0; i <= 9; i++) + v.push_back (1); + + int sum = 0; + for (int k = 0; k < 10; k++) +#pragma omp unroll partial(fib(3)) + for (int i : v) { + for (int j = 8; j != -2; --j) + sum = sum + i; + } + + return sum; +} + +int +test2 () +{ + std::vector v; + + for (unsigned i = 0; i <= 10; i++) + v.push_back (i); + + int sum = 0; +#pragma omp parallel for reduction(+:sum) + for (int k = 0; k < 10; k++) +#pragma omp unroll +#pragma omp unroll partial(fib(4)) + for (int i : v) + { + #pragma omp unroll full + for (int j = 8; j != -2; --j) + sum = sum + i; + } + + return sum; +} + +int +main () +{ + int result = test1 (); + + if (result != 1000) + { + fprintf (stderr, "Wrong result: %d\n", result); + __builtin_abort (); + } + + result = test2 (); + if (result != 5500) + { + fprintf (stderr, "Wrong result: %d\n", result); + __builtin_abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-2.C b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-2.C new file mode 100644 index 00000000000..90d2775c95b --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-2.C @@ -0,0 +1,34 @@ +// { dg-do run } +// { dg-additional-options "-std=c++11" } +#include +#include + +int +main () +{ + std::vector> v; + std::vector w; + + for (unsigned i = 0; i < 10; i++) + { + std::vector u; + for (unsigned j = 0; j < 10; j++) + u.push_back (j); + v.push_back (u); + } + +#pragma omp for +#pragma omp unroll partial(7) + for (auto u : v) + for (int x : u) + w.push_back (x); + + std::size_t l = w.size (); + for (std::size_t i = 0; i < l; i++) + { + if (w[i] != i % 10) + __builtin_abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-full-tile.C b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-full-tile.C new file mode 100644 index 00000000000..8970bfa7fd8 
--- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-full-tile.C @@ -0,0 +1,84 @@ +#include +#include + +template +int sum () +{ + int sum = 0; +#pragma omp unroll full +#pragma omp tile sizes(dim0, dim1) + for (unsigned i = 0; i < 4; i++) + for (unsigned j = 0; j < 5; j++) + sum++; + + return sum; +} + +int main () +{ + if (sum <1,1> () != 20) + __builtin_abort (); + if (sum <1,2> () != 20) + __builtin_abort (); + if (sum <1,3> () != 20) + __builtin_abort (); + if (sum <1,4> () != 20) + __builtin_abort (); + if (sum <1,5> () != 20) + __builtin_abort (); + + if (sum <2,1> () != 20) + __builtin_abort (); + if (sum <2,2> () != 20) + __builtin_abort (); + if (sum <2,3> () != 20) + __builtin_abort (); + if (sum <2,4> () != 20) + __builtin_abort (); + if (sum <2,5> () != 20) + __builtin_abort (); + + if (sum <3,1> () != 20) + __builtin_abort (); + if (sum <3,2> () != 20) + __builtin_abort (); + if (sum <3,3> () != 20) + __builtin_abort (); + if (sum <3,4> () != 20) + __builtin_abort (); + if (sum <3,5> () != 20) + __builtin_abort (); + + if (sum <4,1> () != 20) + __builtin_abort (); + if (sum <4,2> () != 20) + __builtin_abort (); + if (sum <4,3> () != 20) + __builtin_abort (); + if (sum <4,4> () != 20) + __builtin_abort (); + if (sum <4,5> () != 20) + __builtin_abort (); + + if (sum <5,1> () != 20) + __builtin_abort (); + if (sum <5,2> () != 20) + __builtin_abort (); + if (sum <5,3> () != 20) + __builtin_abort (); + if (sum <5,4> () != 20) + __builtin_abort (); + if (sum <5,5> () != 20) + __builtin_abort (); + + if (sum <6,1> () != 20) + __builtin_abort (); + if (sum <6,2> () != 20) + __builtin_abort (); + if (sum <6,3> () != 20) + __builtin_abort (); + if (sum <6,4> () != 20) + __builtin_abort (); + if (sum <6,5> () != 20) + __builtin_abort (); +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/imperfect-transform-1.c b/libgomp/testsuite/libgomp.c-c++-common/imperfect-transform-1.c new file mode 100644 index 00000000000..6743594b2eb --- 
/dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/imperfect-transform-1.c @@ -0,0 +1,79 @@ +/* { dg-do run } */ + +/* Like imperfect1.c, but also includes loop transforms. */ + +static int f1count[3], f2count[3]; + +#ifndef __cplusplus +extern void abort (void); +#else +extern "C" void abort (void); +#endif + +int f1 (int depth, int iter) +{ + f1count[depth]++; + return iter; +} + +int f2 (int depth, int iter) +{ + f2count[depth]++; + return iter; +} + +void s1 (int a1, int a2, int a3) +{ + int i, j, k; + +#pragma omp for collapse(2) + for (i = 0; i < a1; i++) + { + f1 (0, i); + for (j = 0; j < a2; j++) + { + f1 (1, j); +#pragma omp unroll partial + for (k = 0; k < a3; k++) + { + f1 (2, k); + f2 (2, k); + } + f2 (1, j); + } + f2 (0, i); + } +} + +int +main (void) +{ + f1count[0] = 0; + f1count[1] = 0; + f1count[2] = 0; + f2count[0] = 0; + f2count[1] = 0; + f2count[2] = 0; + + s1 (3, 4, 5); + + /* All intervening code at the same depth must be executed the same + number of times. */ + if (f1count[0] != f2count[0]) abort (); + if (f1count[1] != f2count[1]) abort (); + if (f1count[2] != f2count[2]) abort (); + + /* Intervening code must be executed at least as many times as the loop + that encloses it. */ + if (f1count[0] < 3) abort (); + if (f1count[1] < 3 * 4) abort (); + + /* Intervening code must not be executed more times than the number + of logical iterations. */ + if (f1count[0] > 3 * 4 * 5) abort (); + if (f1count[1] > 3 * 4 * 5) abort (); + + /* Check that the innermost loop body is executed exactly the number + of logical iterations expected. 
*/ + if (f1count[2] != 3 * 4 * 5) abort (); +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/imperfect-transform-2.c b/libgomp/testsuite/libgomp.c-c++-common/imperfect-transform-2.c new file mode 100644 index 00000000000..e7d6a9941b4 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/imperfect-transform-2.c @@ -0,0 +1,79 @@ +/* { dg-do run } */ + +/* Like imperfect1.c, but also includes loop transforms. */ + +static int f1count[3], f2count[3]; + +#ifndef __cplusplus +extern void abort (void); +#else +extern "C" void abort (void); +#endif + +int f1 (int depth, int iter) +{ + f1count[depth]++; + return iter; +} + +int f2 (int depth, int iter) +{ + f2count[depth]++; + return iter; +} + +void s1 (int a1, int a2, int a3) +{ + int i, j, k; + +#pragma omp for collapse(2) + for (i = 0; i < a1; i++) + { + f1 (0, i); + for (j = 0; j < a2; j++) + { + f1 (1, j); +#pragma omp tile sizes(5) + for (k = 0; k < a3; k++) + { + f1 (2, k); + f2 (2, k); + } + f2 (1, j); + } + f2 (0, i); + } +} + +int +main (void) +{ + f1count[0] = 0; + f1count[1] = 0; + f1count[2] = 0; + f2count[0] = 0; + f2count[1] = 0; + f2count[2] = 0; + + s1 (3, 4, 5); + + /* All intervening code at the same depth must be executed the same + number of times. */ + if (f1count[0] != f2count[0]) abort (); + if (f1count[1] != f2count[1]) abort (); + if (f1count[2] != f2count[2]) abort (); + + /* Intervening code must be executed at least as many times as the loop + that encloses it. */ + if (f1count[0] < 3) abort (); + if (f1count[1] < 3 * 4) abort (); + + /* Intervening code must not be executed more times than the number + of logical iterations. */ + if (f1count[0] > 3 * 4 * 5) abort (); + if (f1count[1] > 3 * 4 * 5) abort (); + + /* Check that the innermost loop body is executed exactly the number + of logical iterations expected. 
*/ + if (f1count[2] != 3 * 4 * 5) abort (); +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h new file mode 100644 index 00000000000..b9b865cf554 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h @@ -0,0 +1,70 @@ +#include +#include +#include +#include + +#ifndef FUN_NAME_SUFFIX +#define FUN_NAME_SUFFIX +#endif + +#ifdef MULT +#undef MULT +#endif +#define MULT CAT(mult, FUN_NAME_SUFFIX) + +#ifdef MAIN +#undef MAIN +#endif +#define MAIN CAT(main, FUN_NAME_SUFFIX) + +void MULT (float *matrix1, float *matrix2, float *result, + unsigned dim0, unsigned dim1) +{ + unsigned i; + + memset (result, 0, sizeof (float) * dim0 * dim1); + DIRECTIVE + TRANSFORMATION1 + for (i = 0; i < dim0; i++) + TRANSFORMATION2 + for (unsigned j = 0; j < dim1; j++) + TRANSFORMATION3 + for (unsigned k = 0; k < dim1; k++) + result[i * dim1 + j] += matrix1[i * dim1 + k] * matrix2[k * dim0 + j]; +} + +int MAIN () +{ + unsigned dim0 = 20; + unsigned dim1 = 20; + + float *result = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix1 = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix2 = (float *)malloc (sizeof (float) * dim0 * dim1); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + matrix1[i * dim1 + j] = j; + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + if (i == j) + matrix2[i * dim1 + j] = 1; + else + matrix2[i * dim1 + j] = 0; + + MULT (matrix1, matrix2, result, dim0, dim1); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) { + if (matrix1[i * dim1 + j] != result[i * dim1 + j]) { + print_matrix (matrix1, dim0, dim1); + print_matrix (matrix2, dim0, dim1); + print_matrix (result, dim0, dim1); + fprintf(stderr, "%s: ERROR at %d, %d\n", __FUNCTION__, i, j); + abort(); + } + } + + return 0; +} diff --git 
a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h new file mode 100644 index 00000000000..769c04044c3 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h @@ -0,0 +1,71 @@ +#include +#include +#include +#include + +#ifndef FUN_NAME_SUFFIX +#define FUN_NAME_SUFFIX +#endif + +#ifdef MULT +#undef MULT +#endif +#define MULT CAT(mult, FUN_NAME_SUFFIX) + +#ifdef MAIN +#undef MAIN +#endif +#define MAIN CAT(main, FUN_NAME_SUFFIX) + +void MULT (float *matrix1, float *matrix2, float *result) +{ + const unsigned dim0 = 20; + const unsigned dim1 = 20; + + memset (result, 0, sizeof (float) * dim0 * dim1); + DIRECTIVE + TRANSFORMATION1 + for (unsigned i = 0; i < dim0; i++) + TRANSFORMATION2 + for (unsigned j = 0; j < dim1; j++) + TRANSFORMATION3 + for (unsigned k = 0; k < dim1; k++) + result[i * dim1 + j] += matrix1[i * dim1 + k] * matrix2[k * dim0 + j]; +} + +int MAIN () +{ + const unsigned dim0 = 20; + const unsigned dim1 = 20; + + float *result = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix1 = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix2 = (float *)malloc (sizeof (float) * dim0 * dim1); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + matrix1[i * dim1 + j] = j; + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + if (i == j) + matrix2[i * dim1 + j] = 1; + else + matrix2[i * dim1 + j] = 0; + + MULT (matrix1, matrix2, result); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) { + if (matrix1[i * dim1 + j] != result[i * dim1 + j]) { + __builtin_printf("%s: error at %d, %d\n", __FUNCTION__, i, j); + print_matrix (matrix1, dim0, dim1); + print_matrix (matrix2, dim0, dim1); + print_matrix (result, dim0, dim1); + __builtin_printf("\n"); + __builtin_abort(); + } + } + + return 0; +} diff --git 
a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h new file mode 100644 index 00000000000..4f69463d9dd --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h @@ -0,0 +1,19 @@ +#include +#include + +#define CAT(x,y) XCAT(x,y) +#define XCAT(x,y) x ## y +#define DO_PRAGMA(x) XDO_PRAGMA(x) +#define XDO_PRAGMA(x) _Pragma (#x) + + +void print_matrix (float *matrix, unsigned dim0, unsigned dim1) +{ + for (unsigned i = 0; i < dim0; i++) + { + for (unsigned j = 0; j < dim1; j++) + fprintf (stderr, "%f ", matrix[i * dim1 + j]); + fprintf (stderr, "\n"); + } + fprintf (stderr, "\n"); +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c new file mode 100644 index 00000000000..7904a5617f3 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c @@ -0,0 +1,11 @@ +/* { dg-additional-options { -fdump-tree-original -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 collapse(3) + +#include "matrix-transform-variants-1.h" + +/* A consistency check to prevent broken macro usage. 
*/ +/* { dg-final { scan-tree-dump-times "unroll_partial" 12 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c new file mode 100644 index 00000000000..bd431a25102 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c @@ -0,0 +1,13 @@ +/* { dg-additional-options { -O2 -fdump-tree-original -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE +#define COMMON_TOP_TRANSFORM omp unroll full +#define COLLAPSE_1 +#define COLLAPSE_2 +#define COLLAPSE_3 +#define IMPLEMENTATION_FILE "matrix-constant-iter.h" + +#include "matrix-transform-variants-1.h" + +/* A consistency check to prevent broken macro usage. */ +/* { dg-final { scan-tree-dump-times "unroll_full" 13 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c new file mode 100644 index 00000000000..3875014dc96 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c @@ -0,0 +1,8 @@ +/* { dg-additional-options { -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE "omp teams distribute parallel for" +#define COLLAPSE_1 "collapse(1)" +#define COLLAPSE_2 "collapse(2)" +#define COLLAPSE_3 "collapse(3)" + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c new file mode 100644 index 00000000000..671396cd533 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c @@ -0,0 +1,13 @@ +/* { dg-additional-options { -fdump-tree-original -Wall -Wno-unknown-pragmas } } */ + +#define 
COMMON_DIRECTIVE omp for +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 collapse(3) + +#include "matrix-transform-variants-1.h" + + +/* A consistency check to prevent broken macro usage. */ +/* { dg-final { scan-tree-dump-times "omp for" 13 "original" } } */ +/* { dg-final { scan-tree-dump-times "collapse" 12 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c new file mode 100644 index 00000000000..cc66df42679 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c @@ -0,0 +1,13 @@ +/* { dg-additional-options { -fdump-tree-original -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE omp parallel for +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" + + +/* A consistency check to prevent broken macro usage. 
*/ +/* { dg-final { scan-tree-dump-times "omp parallel" 13 "original" } } */ +/* { dg-final { scan-tree-dump-times "collapse" 9 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c new file mode 100644 index 00000000000..890b460f374 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c @@ -0,0 +1,8 @@ +/* { dg-additional-options { -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE omp parallel masked taskloop +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c new file mode 100644 index 00000000000..74f6271504a --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c @@ -0,0 +1,8 @@ +/* { dg-additional-options { -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE omp parallel masked taskloop simd +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c new file mode 100644 index 00000000000..4abeda73b48 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c @@ -0,0 +1,15 @@ +/* This test appears to have too much parallelism to run without a GPU. 
*/ +/* { dg-do run { target { offload_device } } } */ +/* { dg-additional-options { -fdump-tree-original -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE omp target parallel for map(tofrom:result[0:dim0*dim1]) map(to:matrix1[0:dim0*dim1], matrix2[0:dim0*dim1]) +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" + +/* A consistency check to prevent broken macro usage. */ +/* { dg-final { scan-tree-dump-times "omp target" 13 "original" } } */ +/* { dg-final { scan-tree-dump-times "collapse" 9 "original" } } */ +/* { dg-final { scan-tree-dump-times "unroll_partial" 12 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c new file mode 100644 index 00000000000..f836707c43b --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c @@ -0,0 +1,10 @@ +/* This test appears to have too much parallelism to run without a GPU. 
*/ +/* { dg-do run { target { offload_device } } } */ +/* { dg-additional-options { -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE omp target teams distribute parallel for map(tofrom:result[:dim0*dim1]) map(to:matrix1[0:dim0*dim1], matrix2[0:dim0*dim1]) +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c new file mode 100644 index 00000000000..28edb6ce83e --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c @@ -0,0 +1,8 @@ +/* { dg-additional-options { -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE omp taskloop +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 collapse(3) + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c new file mode 100644 index 00000000000..481a20a18d0 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c @@ -0,0 +1,8 @@ +/* { dg-additional-options { -Wall -Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE omp teams distribute parallel for +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c new file mode 100644 index 00000000000..200ddd859f5 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c @@ -0,0 +1,8 @@ +/* { dg-additional-options { -Wall 
-Wno-unknown-pragmas } } */ + +#define COMMON_DIRECTIVE omp simd +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 collapse(3) + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h new file mode 100644 index 00000000000..24c3d073024 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h @@ -0,0 +1,191 @@ +#include "matrix-helper.h" + +#ifndef COMMON_TOP_TRANSFORM +#define COMMON_TOP_TRANSFORM +#endif + +#ifndef IMPLEMENTATION_FILE +#define IMPLEMENTATION_FILE "matrix-1.h" +#endif + +#define FUN_NAME_SUFFIX 1 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp unroll partial(2)") _Pragma("omp tile sizes(10)") +#define TRANSFORMATION2 +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 2 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_3) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(8,16,4)") +#define TRANSFORMATION2 +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 3 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(8, 8)") +#define TRANSFORMATION2 +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 4 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_1) +#define TRANSFORMATION1 
DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(8, 8)") +#define TRANSFORMATION2 +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 5 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_1) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(8, 8, 8)") +#define TRANSFORMATION2 +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 6 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_1) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(10)") _Pragma("omp unroll partial(2)") +#define TRANSFORMATION2 +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 7 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(7, 11)") +#define TRANSFORMATION2 _Pragma("omp unroll partial(7)") +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 8 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(7, 11)") +#define TRANSFORMATION2 _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(7)") +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 9 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2) +#define TRANSFORMATION1 
DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(7, 11)") +#define TRANSFORMATION2 _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(3)") _Pragma("omp tile sizes(7)") +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 10 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_1) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp unroll partial(5)") _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(3)") _Pragma("omp tile sizes(7)") +#define TRANSFORMATION2 +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 11 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) +#define TRANSFORMATION2 _Pragma("omp unroll partial(5)") _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(3)") _Pragma("omp tile sizes(7)") +#define TRANSFORMATION3 +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 12 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_3) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) +#define TRANSFORMATION2 +#define TRANSFORMATION3 _Pragma("omp unroll partial(5)") _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(3)") _Pragma("omp tile sizes(7)") +#include IMPLEMENTATION_FILE + +#undef DIRECTIVE +#undef TRANSFORMATION1 +#undef TRANSFORMATION2 +#undef TRANSFORMATION3 +#undef FUN_NAME_SUFFIX + +#define FUN_NAME_SUFFIX 13 +#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_3) +#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) +#define TRANSFORMATION2 _Pragma("omp tile sizes(7,8)") +#define TRANSFORMATION3 _Pragma("omp unroll 
partial(3)") _Pragma("omp tile sizes(7)") +#include IMPLEMENTATION_FILE + +int main () +{ + main1 (); + main2 (); + main3 (); + main4 (); + main5 (); + main6 (); + main7 (); + main8 (); + main9 (); + main10 (); + main11 (); + main12 (); + main13 (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c new file mode 100644 index 00000000000..eb5d3d77eb8 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c @@ -0,0 +1,78 @@ +/* { dg-additional-options { -Wall -Wno-unknown-pragmas } } */ + +#include + +int compute_sum1 () +{ + int sum = 0; + int i,j; +#pragma omp parallel for reduction(+:sum) lastprivate(j) +#pragma omp unroll partial + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort (); + + return sum; +} + +int compute_sum2() +{ + int sum = 0; + int i,j; +#pragma omp parallel for reduction(+:sum) lastprivate(j) +#pragma omp unroll partial(5) + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort (); + + return sum; +} + +int compute_sum3() +{ + int sum = 0; + int i,j; +#pragma omp parallel for reduction(+:sum) +#pragma omp unroll partial(1) + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort (); + + return sum; +} + +int main () +{ + int result; + result = compute_sum1 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + result = compute_sum2 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + result = compute_sum3 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c 
b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c
new file mode 100644
index 00000000000..7bd9b906235
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c
@@ -0,0 +1,131 @@
+/* { dg-additional-options { -Wall -Wno-unknown-pragmas } } */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+void test1 ()
+{
+  int sum = 0;
+  for (int i = -3; i != 1; ++i)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test2 ()
+{
+  int sum = 0;
+  #pragma omp unroll partial
+  for (int i = -3; i != 1; ++i)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test3 ()
+{
+  int sum = 0;
+  #pragma omp unroll partial
+  for (int i = -3; i != 1; ++i)
+  #pragma omp unroll partial
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test4 ()
+{
+  int sum = 0;
+#pragma omp for
+#pragma omp unroll partial(5)
+  for (int i = -3; i != 1; ++i)
+#pragma omp unroll partial(2)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test5 ()
+{
+  int sum = 0;
+#pragma omp parallel for reduction(+:sum)
+#pragma omp unroll partial(2)
+  for (int i = -3; i != 1; ++i)
+#pragma omp unroll partial(2)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test6 ()
+{
+  int sum = 0;
+#pragma omp target parallel for reduction(+:sum)
+#pragma omp unroll partial(7)
+  for (int i = -3; i != 1; ++i)
+#pragma omp unroll partial(2)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test7 ()
+{
+  int sum = 0;
+#pragma omp target teams distribute parallel for reduction(+:sum)
+#pragma omp unroll partial(7)
+  for (int i = -3; i != 1; ++i)
+#pragma omp unroll partial(2)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+int
+main ()
+{
+  test1 ();
+  test2 ();
+  test3 ();
+  test4 ();
+  test5 ();
+  test6 ();
+  test7 ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-imperfect-transform-1.c b/libgomp/testsuite/libgomp.c-c++-common/target-imperfect-transform-1.c
new file mode 100644
index 00000000000..0e33e028ac2
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-imperfect-transform-1.c
@@ -0,0 +1,82 @@
+/* { dg-do run } */
+
+/* Like imperfect-transform.c, but enables offloading. */
+
+static int f1count[3], f2count[3];
+#pragma omp declare target enter (f1count, f2count)
+
+#ifndef __cplusplus
+extern void abort (void);
+#else
+extern "C" void abort (void);
+#endif
+
+int f1 (int depth, int iter)
+{
+  #pragma omp atomic
+  f1count[depth]++;
+  return iter;
+}
+
+int f2 (int depth, int iter)
+{
+  #pragma omp atomic
+  f2count[depth]++;
+  return iter;
+}
+
+void s1 (int a1, int a2, int a3)
+{
+  int i, j, k;
+
+#pragma omp target parallel for collapse(2) map(always, tofrom:f1count, f2count)
+  for (i = 0; i < a1; i++)
+    {
+      f1 (0, i);
+      for (j = 0; j < a2; j++)
+        {
+          f1 (1, j);
+#pragma omp unroll partial
+          for (k = 0; k < a3; k++)
+            {
+              f1 (2, k);
+              f2 (2, k);
+            }
+          f2 (1, j);
+        }
+      f2 (0, i);
+    }
+}
+
+int
+main (void)
+{
+  f1count[0] = 0;
+  f1count[1] = 0;
+  f1count[2] = 0;
+  f2count[0] = 0;
+  f2count[1] = 0;
+  f2count[2] = 0;
+
+  s1 (3, 4, 5);
+
+  /* All intervening code at the same depth must be executed the same
+     number of times.
*/
+  if (f1count[0] != f2count[0]) abort ();
+  if (f1count[1] != f2count[1]) abort ();
+  if (f1count[2] != f2count[2]) abort ();
+
+  /* Intervening code must be executed at least as many times as the loop
+     that encloses it. */
+  if (f1count[0] < 3) abort ();
+  if (f1count[1] < 3 * 4) abort ();
+
+  /* Intervening code must not be executed more times than the number
+     of logical iterations. */
+  if (f1count[0] > 3 * 4 * 5) abort ();
+  if (f1count[1] > 3 * 4 * 5) abort ();
+
+  /* Check that the innermost loop body is executed exactly the number
+     of logical iterations expected. */
+  if (f1count[2] != 3 * 4 * 5) abort ();
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-imperfect-transform-2.c b/libgomp/testsuite/libgomp.c-c++-common/target-imperfect-transform-2.c
new file mode 100644
index 00000000000..78986e8d3ae
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-imperfect-transform-2.c
@@ -0,0 +1,82 @@
+/* { dg-do run } */
+
+/* Like imperfect-transform.c, but enables offloading. */
+
+static int f1count[3], f2count[3];
+#pragma omp declare target enter (f1count, f2count)
+
+#ifndef __cplusplus
+extern void abort (void);
+#else
+extern "C" void abort (void);
+#endif
+
+int f1 (int depth, int iter)
+{
+  #pragma omp atomic
+  f1count[depth]++;
+  return iter;
+}
+
+int f2 (int depth, int iter)
+{
+  #pragma omp atomic
+  f2count[depth]++;
+  return iter;
+}
+
+void s1 (int a1, int a2, int a3)
+{
+  int i, j, k;
+
+#pragma omp target parallel for collapse(2) map(always, tofrom:f1count, f2count)
+  for (i = 0; i < a1; i++)
+    {
+      f1 (0, i);
+      for (j = 0; j < a2; j++)
+        {
+          f1 (1, j);
+#pragma omp tile sizes(5)
+          for (k = 0; k < a3; k++)
+            {
+              f1 (2, k);
+              f2 (2, k);
+            }
+          f2 (1, j);
+        }
+      f2 (0, i);
+    }
+}
+
+int
+main (void)
+{
+  f1count[0] = 0;
+  f1count[1] = 0;
+  f1count[2] = 0;
+  f2count[0] = 0;
+  f2count[1] = 0;
+  f2count[2] = 0;
+
+  s1 (3, 4, 5);
+
+  /* All intervening code at the same depth must be executed the same
+     number of times.
*/
+  if (f1count[0] != f2count[0]) abort ();
+  if (f1count[1] != f2count[1]) abort ();
+  if (f1count[2] != f2count[2]) abort ();
+
+  /* Intervening code must be executed at least as many times as the loop
+     that encloses it. */
+  if (f1count[0] < 3) abort ();
+  if (f1count[1] < 3 * 4) abort ();
+
+  /* Intervening code must not be executed more times than the number
+     of logical iterations. */
+  if (f1count[0] > 3 * 4 * 5) abort ();
+  if (f1count[1] > 3 * 4 * 5) abort ();
+
+  /* Check that the innermost loop body is executed exactly the number
+     of logical iterations expected. */
+  if (f1count[2] != 3 * 4 * 5) abort ();
+}