From patchwork Thu Sep 8 11:38:33 2022
From: Jakub Jelinek
Date: Thu, 8 Sep 2022 13:38:33 +0200
To: gcc-patches@gcc.gnu.org
Cc: Tobias Burnus
Subject: [committed] openmp: Implement doacross(sink: omp_cur_iteration - 1)

Hi!

This patch implements doacross(sink: omp_cur_iteration - 1), for which
the previous patchset emitted a sorry during omp expansion.
It can be implemented with existing library functions.

To recap, depend(source)/doacross(source:)/doacross(source:omp_cur_iteration)
is implemented by calling GOMP_doacross_post or GOMP_doacross_ull_post
with an array of long or unsigned long long elements, one for all
collapsed loops together and one for each further ordered loop, if any.
We initialize that array in each thread when grabbing a further set of
iterations and update it at the end of loops, so that it represents the
current iteration (as 0-based counters).  When the worksharing loop is
created, we tell the library the counts in each dimension through
another similar array (the loop needs to be rectangular); the first
element is the count of all logical iterations in the collapsed loops.

depend(sink:v1 op N1, v2 op N2, ...) is then implemented by
conditionally calling GOMP_doacross_wait/GOMP_doacross_ull_wait.
For N? of 0 there is no check.  Otherwise, if it wants to wait in a
particular dimension for a previous iteration, we check that the
corresponding iterator isn't the first one (or one of the first few),
where the previous iteration in that dimension would be out of range.
Similarly, when it wants to wait for a later iteration in a dimension,
we check that the iterator isn't the last one (or one of the last few),
where it would likewise be out of bounds.  Then the collapsed loop
counters are folded into a single 0-based counter (the first argument),
followed by the other 0-based iteration counters of the iteration it
should wait for.
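To make the sink recap concrete, here is a minimal sketch in plain C of
the two steps it describes: the per-dimension range check and the
folding of the collapsed counters into one 0-based counter.  This is
only a model, not the code that omp expansion emits; the helper names
sink_in_range and fold_collapsed are hypothetical, and all counters are
assumed to be already normalized to 0-based values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not part of GCC or libgomp): return true if the
   iteration idx[d] + off[d] exists in every dimension of a rectangular
   loop nest with cnt[d] iterations in dimension d.  A false result
   models the case where no wait call is made at all.  */
bool
sink_in_range (const long *idx, const long *off, const long *cnt, int n)
{
  assert (n > 0);
  for (int d = 0; d < n; d++)
    {
      long target = idx[d] + off[d];
      if (target < 0 || target >= cnt[d])
        return false;
    }
  return true;
}

/* Hypothetical helper: fold the 0-based counters of the first COLLAPSE
   loops into a single 0-based logical iteration number, outermost loop
   varying slowest.  */
long
fold_collapsed (const long *idx, const long *cnt, int collapse)
{
  assert (collapse > 0);
  long folded = 0;
  for (int d = 0; d < collapse; d++)
    folded = folded * cnt[d] + idx[d];
  return folded;
}
```

For a collapse(2) ordered(3) nest with 4 x 3 x 5 iterations, a sink on
the iteration with 0-based counters (1, 2, 0) would, under these
assumptions, wait with the arguments fold_collapsed = 1 * 3 + 2 = 5 for
the collapsed loops and 0 for the remaining ordered loop.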
Now, doacross(sink: omp_cur_iteration - 1) is supposed to wait for the
previous logical iteration in the combined iteration space of all
ordered loops.  For the very first iteration in that combined iteration
space it does nothing, as there is no previous iteration.  Similarly,
it does nothing if there are more ordered loops than collapsed loops
and it isn't the first logical iteration of the combined loops inside
of the collapsed loops, because as implemented we know the previous
iteration in that case is always executed by the same thread as the
current one.

In the implementation, we use the same value as is stored in the first
element of the array for GOMP_doacross_post/GOMP_doacross_ull_post; if
that value is 0, we do nothing.  The rest differs based on whether the
ordered argument is equal to collapse or not.  If it is, we call
GOMP_doacross_wait/GOMP_doacross_ull_wait with a single argument, one
less than the counter we compare against 0.

If the ordered argument is bigger than collapse, we add a per-thread
boolean variable .first.N, which we set to true at the start of the
outermost ordered loop inside of the collapsed set of loops and set to
false at the end of the innermost ordered loop.  If .first.N is false,
we don't do anything (we know the previous iteration was handled by the
current thread and by my reading of the spec we don't need to emit even
a memory barrier in that case, because it is just synchronization with
the same thread).  Otherwise we call
GOMP_doacross_wait/GOMP_doacross_ull_wait with the first argument one
less than the counter we compare against 0, followed by one less than
the 2nd and following counts of iterations we pass to the workshare
initialization.  If, say, .counts.N passed to the workshare
initialization is { 256, 13, 5, 2 } for a collapse(3) ordered(6) loop,
then GOMP_doacross_wait/GOMP_doacross_ull_wait is called with arguments
equal to .orditera.N[0] - 1, 12, 4, 1.
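The decision described above can be modelled in a few lines of plain C.
This is a hedged sketch of the logic only, not the GIMPLE the patch
generates; the function name cur_iteration_wait_args and its parameters
are hypothetical.  first_counter stands for the value stored in the
first element of the array passed to GOMP_doacross_post, first_flag for
.first.N, and counts for the array passed to the workshare
initialization.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: compute the arguments a wait call would receive
   for doacross(sink: omp_cur_iteration - 1).  Returns the number of
   arguments stored in ARGS, or 0 if no wait call is needed.  ARGS must
   have room for 1 + (ordered - collapse) elements.  */
int
cur_iteration_wait_args (long first_counter, bool first_flag,
                         const long *counts, int collapse, int ordered,
                         long *args)
{
  assert (collapse >= 1 && ordered >= collapse);

  /* Very first logical iteration: there is no previous iteration.  */
  if (first_counter == 0)
    return 0;

  /* More ordered loops than collapsed ones and not the first inner
     iteration: the previous iteration ran on this same thread.  */
  if (ordered > collapse && !first_flag)
    return 0;

  int n = 0;
  /* Previous value of the folded counter of the collapsed loops.  */
  args[n++] = first_counter - 1;
  /* Last iteration of each further ordered loop; counts[0] is the
     total for the collapsed loops and is not used here.  */
  for (int i = 1; i <= ordered - collapse; i++)
    args[n++] = counts[i] - 1;
  return n;
}
```

With counts = { 256, 13, 5, 2 }, collapse 3, ordered 6, a counter of 7
and the flag set, the model yields the four arguments 6, 12, 4, 1,
which matches the shape of the example in the text.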
Bootstrapped/regtested on x86_64-linux and i686-linux, committed to
trunk.

For Tobias: The new libgomp.c/ testcases are modified copies of
existing testcases.  doacross-4.c is a copy of doacross-2.c that just
uses the OpenMP 5.2 syntax (i.e. doacross clauses instead of depend,
including the corresponding slight syntax differences), while
doacross-{5,6,7}.c are copies of doacross-{1,2,3}.c which use the new
syntax and use doacross(sink:omp_cur_iteration - 1) as much as possible
(but not more than once in each loop), with corresponding adjustments
to the checking.  doacross-7.c is actually the same as doacross-6.c
with differences in schedule clauses only.

2022-09-08  Jakub Jelinek

gcc/
	* omp-expand.cc (expand_omp_ordered_sink): Add CONT_BB argument.
	Add doacross(sink:omp_cur_iteration-1) support.
	(expand_omp_ordered_source_sink): Clear counts[fd->ordered + 1].
	Adjust expand_omp_ordered_sink caller.
	(expand_omp_for_ordered_loops): If counts[fd->ordered + 1] is
	non-NULL, set that variable to true at the start of outermost
	non-collapsed loop and set it to false at the end of innermost
	ordered loop.
	(expand_omp_for_generic): If fd->ordered, allocate
	1 + (fd->ordered - fd->collapse) further elements in counts array.
	Copy to counts + 2 + fd->ordered the counts of fd->collapse ..
	fd->ordered - 1 loop if any.
gcc/testsuite/
	* c-c++-common/gomp/doacross-7.c: New test.
libgomp/
	* libgomp.texi (OpenMP 5.2): Mention that omp_cur_iteration is now
	fully supported.
	* testsuite/libgomp.c/doacross-4.c: New test.
	* testsuite/libgomp.c/doacross-5.c: New test.
	* testsuite/libgomp.c/doacross-6.c: New test.
	* testsuite/libgomp.c/doacross-7.c: New test.
	Jakub

--- gcc/omp-expand.cc.jj	2022-09-06 09:19:14.735561333 +0200
+++ gcc/omp-expand.cc	2022-09-07 21:08:18.649782291 +0200
@@ -3287,7 +3287,8 @@ expand_omp_ordered_source (gimple_stmt_i
 
 static void
 expand_omp_ordered_sink (gimple_stmt_iterator *gsi, struct omp_for_data *fd,
-                         tree *counts, tree c, location_t loc)
+                         tree *counts, tree c, location_t loc,
+                         basic_block cont_bb)
 {
   auto_vec args;
   enum built_in_function sink_ix
@@ -3300,7 +3301,93 @@ expand_omp_ordered_sink (gimple_stmt_ite
 
   if (deps == NULL)
     {
-      sorry_at (loc, "% not supported yet");
+      /* Handle doacross(sink: omp_cur_iteration - 1).  */
+      gsi_prev (&gsi2);
+      edge e1 = split_block (gsi_bb (gsi2), gsi_stmt (gsi2));
+      edge e2 = split_block_after_labels (e1->dest);
+      gsi2 = gsi_after_labels (e1->dest);
+      *gsi = gsi_last_bb (e1->src);
+      gimple_stmt_iterator gsi3 = *gsi;
+
+      if (counts[fd->collapse - 1])
+        {
+          gcc_assert (fd->collapse == 1);
+          t = counts[fd->collapse - 1];
+        }
+      else if (fd->collapse > 1)
+        t = fd->loop.v;
+      else
+        {
+          t = fold_build2 (MINUS_EXPR, TREE_TYPE (fd->loops[0].v),
+                           fd->loops[0].v, fd->loops[0].n1);
+          t = fold_convert (fd->iter_type, t);
+        }
+
+      t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE,
+                                    false, GSI_CONTINUE_LINKING);
+      gsi_insert_after (gsi, gimple_build_cond (NE_EXPR, t,
+                                                build_zero_cst (TREE_TYPE (t)),
+                                                NULL_TREE, NULL_TREE),
+                        GSI_NEW_STMT);
+
+      t = fold_build2 (PLUS_EXPR, TREE_TYPE (t), t,
+                       build_minus_one_cst (TREE_TYPE (t)));
+      t = force_gimple_operand_gsi (&gsi2, t, true, NULL_TREE,
+                                    true, GSI_SAME_STMT);
+      args.safe_push (t);
+      for (i = fd->collapse; i < fd->ordered; i++)
+        {
+          t = counts[fd->ordered + 2 + (i - fd->collapse)];
+          t = fold_build2 (PLUS_EXPR, TREE_TYPE (t), t,
+                           build_minus_one_cst (TREE_TYPE (t)));
+          t = fold_convert (fd->iter_type, t);
+          t = force_gimple_operand_gsi (&gsi2, t, true, NULL_TREE,
+                                        true, GSI_SAME_STMT);
+          args.safe_push (t);
+        }
+
+      gimple *g = gimple_build_call_vec (builtin_decl_explicit (sink_ix),
+                                         args);
+      gimple_set_location (g, loc);
+      gsi_insert_before (&gsi2, g, GSI_SAME_STMT);
+
+      edge e3 = make_edge (e1->src, e2->dest, EDGE_FALSE_VALUE);
+      e3->probability = profile_probability::guessed_always () / 8;
+      e1->probability = e3->probability.invert ();
+      e1->flags = EDGE_TRUE_VALUE;
+      set_immediate_dominator (CDI_DOMINATORS, e2->dest, e1->src);
+
+      if (fd->ordered > fd->collapse && cont_bb)
+        {
+          if (counts[fd->ordered + 1] == NULL_TREE)
+            counts[fd->ordered + 1]
+              = create_tmp_var (boolean_type_node, ".first");
+
+          edge e4;
+          if (gsi_end_p (gsi3))
+            e4 = split_block_after_labels (e1->src);
+          else
+            {
+              gsi_prev (&gsi3);
+              e4 = split_block (gsi_bb (gsi3), gsi_stmt (gsi3));
+            }
+          gsi3 = gsi_last_bb (e4->src);
+
+          gsi_insert_after (&gsi3,
+                            gimple_build_cond (NE_EXPR,
+                                               counts[fd->ordered + 1],
+                                               boolean_false_node,
+                                               NULL_TREE, NULL_TREE),
+                            GSI_NEW_STMT);
+
+          edge e5 = make_edge (e4->src, e2->dest, EDGE_FALSE_VALUE);
+          e4->probability = profile_probability::guessed_always () / 8;
+          e5->probability = e4->probability.invert ();
+          e4->flags = EDGE_TRUE_VALUE;
+          set_immediate_dominator (CDI_DOMINATORS, e2->dest, e4->src);
+        }
+
+      *gsi = gsi_after_labels (e2->dest);
       return;
     }
   for (i = 0; i < fd->ordered; i++)
@@ -3558,6 +3645,7 @@ expand_omp_ordered_source_sink (struct o
     = build_array_type_nelts (fd->iter_type, fd->ordered - fd->collapse + 1);
   counts[fd->ordered] = create_tmp_var (atype, ".orditera");
   TREE_ADDRESSABLE (counts[fd->ordered]) = 1;
+  counts[fd->ordered + 1] = NULL_TREE;
 
   for (inner = region->inner; inner; inner = inner->next)
     if (inner->type == GIMPLE_OMP_ORDERED)
@@ -3575,7 +3663,7 @@ expand_omp_ordered_source_sink (struct o
         for (c = gimple_omp_ordered_clauses (ord_stmt);
              c; c = OMP_CLAUSE_CHAIN (c))
           if (OMP_CLAUSE_DOACROSS_KIND (c) == OMP_CLAUSE_DOACROSS_SINK)
-            expand_omp_ordered_sink (&gsi, fd, counts, c, loc);
+            expand_omp_ordered_sink (&gsi, fd, counts, c, loc, cont_bb);
         gsi_remove (&gsi, true);
       }
 }
@@ -3611,6 +3699,9 @@ expand_omp_for_ordered_loops (struct omp
     {
       tree t, type = TREE_TYPE (fd->loops[i].v);
       gimple_stmt_iterator gsi = gsi_after_labels (body_bb);
+      if (counts[fd->ordered + 1] && i == fd->collapse)
+        expand_omp_build_assign (&gsi, counts[fd->ordered + 1],
+                                 boolean_true_node);
       expand_omp_build_assign (&gsi, fd->loops[i].v,
                                fold_convert (type, fd->loops[i].n1));
       if (counts[i])
@@ -3658,6 +3749,9 @@ expand_omp_for_ordered_loops (struct omp
                          size_int (i - fd->collapse + 1),
                          NULL_TREE, NULL_TREE);
           expand_omp_build_assign (&gsi, aref, t);
+          if (counts[fd->ordered + 1] && i == fd->ordered - 1)
+            expand_omp_build_assign (&gsi, counts[fd->ordered + 1],
+                                     boolean_false_node);
           gsi_prev (&gsi);
           e2 = split_block (cont_bb, gsi_stmt (gsi));
           new_header = e2->dest;
@@ -3915,7 +4009,10 @@ expand_omp_for_generic (struct omp_regio
       int first_zero_iter1 = -1, first_zero_iter2 = -1;
       basic_block zero_iter1_bb = NULL, zero_iter2_bb = NULL, l2_dom_bb = NULL;
 
-      counts = XALLOCAVEC (tree, fd->ordered ? fd->ordered + 1 : fd->collapse);
+      counts = XALLOCAVEC (tree, fd->ordered
+                                 ? fd->ordered + 2
+                                   + (fd->ordered - fd->collapse)
+                                 : fd->collapse);
       expand_omp_for_init_counts (fd, &gsi, entry_bb, counts,
                                   zero_iter1_bb, first_zero_iter1,
                                   zero_iter2_bb, first_zero_iter2, l2_dom_bb);
@@ -4352,13 +4449,21 @@ expand_omp_for_generic (struct omp_regio
       if (fd->ordered)
         {
           /* Until now, counts array contained number of iterations or
-             variable containing it for ith loop.  From now on, we need
+             variable containing it for ith loop.  From now on, we usually need
              those counts only for collapsed loops, and only for the 2nd
              till the last collapsed one.  Move those one element earlier,
              we'll use counts[fd->collapse - 1] for the first source/sink
              iteration counter and so on and counts[fd->ordered]
              as the array holding the current counter values for
-             depend(source).  */
+             depend(source).  For doacross(sink:omp_cur_iteration - 1) we need
+             the counts from fd->collapse to fd->ordered - 1; make a copy of
+             those to counts[fd->ordered + 2] and onwards.
+             counts[fd->ordered + 1] can be a flag whether it is the first
+             iteration with a new collapsed counter (used only if
+             fd->ordered > fd->collapse).  */
+          if (fd->ordered > fd->collapse)
+            memcpy (counts + fd->ordered + 2, counts + fd->collapse,
+                    (fd->ordered - fd->collapse) * sizeof (counts[0]));
           if (fd->collapse > 1)
             memmove (counts, counts + 1, (fd->collapse - 1) * sizeof (counts[0]));
           if (broken_loop)
--- gcc/testsuite/c-c++-common/gomp/doacross-7.c.jj	2022-09-08 11:01:50.295306390 +0200
+++ gcc/testsuite/c-c++-common/gomp/doacross-7.c	2022-09-08 10:33:37.589881715 +0200
@@ -0,0 +1,78 @@
+void
+foo (int l)
+{
+  int i, j, k;
+  #pragma omp parallel
+  {
+    #pragma omp for schedule(static) ordered (3)
+    for (i = 2; i < 256 / 16 - 1; i++)
+      for (j = 0; j < 8; j += 2)
+        for (k = 1; k <= 3; k++)
+          {
+            #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+            #pragma omp ordered doacross(source:)
+          }
+    #pragma omp for schedule(static) ordered (3) collapse(2)
+    for (i = 2; i < 256 / 16 - 1; i++)
+      for (j = 0; j < 8; j += 2)
+        for (k = 1; k <= 3; k++)
+          {
+            #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+            #pragma omp ordered doacross(source:)
+          }
+    #pragma omp for schedule(static) ordered (3) collapse(3)
+    for (i = 2; i < 256 / 16 - 1; i++)
+      for (j = 0; j < 8; j += 2)
+        for (k = 1; k <= 3; k++)
+          {
+            #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+            #pragma omp ordered doacross(source: omp_cur_iteration)
+          }
+    #pragma omp for schedule(static) ordered (1) nowait
+    for (i = 2; i < 256 / 16 - 1; i += l)
+      {
+        #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+        #pragma omp ordered doacross(source:)
+      }
+  }
+}
+
+void
+bar (int l, int m, int n, int o)
+{
+  int i, j, k;
+  #pragma omp for schedule(static) ordered (3)
+  for (i = 2; i < 256 / 16 - 1; i++)
+    for (j = 0; j < m; j += n)
+      for (k = o; k <= 3; k++)
+        {
+          foo (l);
+          #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+          #pragma omp ordered doacross(source:omp_cur_iteration)
+        }
+  #pragma omp for schedule(static) ordered (3) collapse(2)
+  for (i = 2; i < 256 / 16 - m; i += n)
+    for (j = 0; j < 8; j += o)
+      for (k = 1; k <= 3; k++)
+        {
+          foo (l);
+          #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+          #pragma omp ordered doacross(source : omp_cur_iteration)
+        }
+  #pragma omp for schedule(static) ordered (3) collapse(3)
+  for (i = m; i < 256 / 16 - 1; i++)
+    for (j = 0; j < n; j += 2)
+      for (k = 1; k <= o; k++)
+        {
+          foo (l);
+          #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+          #pragma omp ordered doacross(source :)
+        }
+  #pragma omp for schedule(static) ordered
+  for (i = m; i < n / 16 - 1; i += l)
+    {
+      foo (l);
+      #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+      #pragma omp ordered doacross(source: omp_cur_iteration)
+    }
+}
--- libgomp/libgomp.texi.jj	2022-09-05 23:25:28.671018459 +0200
+++ libgomp/libgomp.texi	2022-09-08 12:49:17.575393448 +0200
@@ -397,8 +397,7 @@ to address of matching mapped list item
       @code{source}/@code{sink} modifier @tab Y @tab
 @item Deprecation of @code{depend} with @code{source}/@code{sink} modifier
       @tab N @tab
-@item @code{omp_cur_iteration} keyword @tab P
-      @tab @code{sink: omp_cur_iteration - 1} unsupported
+@item @code{omp_cur_iteration} keyword @tab Y @tab
 @end multitable
 
 @unnumberedsubsec Other new OpenMP 5.2 features
--- libgomp/testsuite/libgomp.c/doacross-4.c.jj	2022-09-08 11:13:20.274112539 +0200
+++ libgomp/testsuite/libgomp.c/doacross-4.c	2022-09-08 11:51:27.187644498 +0200
@@ -0,0 +1,228 @@
+extern void abort (void);
+
+#define N 256
+int a[N], b[N / 16][8][4], c[N / 32][8][8], g[N / 16][8][6];
+volatile int d, e;
+volatile unsigned long long f;
+
+int
+main ()
+{
+  unsigned long long i;
+  int j, k, l, m;
+  #pragma omp parallel private (l)
+  {
+    #pragma omp for schedule(static, 1) ordered nowait
+    for (i = 1; i < N + f; i++)
+      {
+        #pragma omp atomic write
+        a[i] = 1;
+        #pragma omp ordered doacross(sink: i - 1)
+        if (i > 1)
+          {
+            #pragma omp atomic read
+            l = a[i - 1];
+            if (l < 2)
+              abort ();
+          }
+        #pragma omp atomic write
+        a[i] = 2;
+        if (i < N - 1)
+          {
+            #pragma omp atomic read
+            l = a[i + 1];
+            if (l == 3)
+              abort ();
+          }
+        #pragma omp ordered doacross(source : omp_cur_iteration)
+        #pragma omp atomic write
+        a[i] = 3;
+      }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (i = 3; i < N / 16 - 1 + f; i++)
+      for (j = 0; j < 8; j += 2)
+        for (k = 1; k <= 3; k++)
+          {
+            #pragma omp atomic write
+            b[i][j][k] = 1;
+            #pragma omp ordered doacross(sink: i, j - 2, k - 1) \
+                                doacross(sink: i - 2, j - 2, k + 1)
+            #pragma omp ordered doacross(sink: i - 3, j + 2, k - 2)
+            if (j >= 2 && k > 1)
+              {
+                #pragma omp atomic read
+                l = b[i][j - 2][k - 1];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp atomic write
+            b[i][j][k] = 2;
+            if (i >= 5 && j >= 2 && k < 3)
+              {
+                #pragma omp atomic read
+                l = b[i - 2][j - 2][k + 1];
+                if (l < 2)
+                  abort ();
+              }
+            if (i >= 6 && j < N / 16 - 3 && k == 3)
+              {
+                #pragma omp atomic read
+                l = b[i - 3][j + 2][k - 2];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp ordered doacross(source : )
+            #pragma omp atomic write
+            b[i][j][k] = 3;
+          }
+#define A(n) int n;
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3)
+    D(m)
+#undef A
+    #pragma omp for collapse (2) ordered(61) schedule(dynamic, 15)
+    for (i = 2; i < N / 32 + f; i++)
+      for (j = 7; j > 1; j--)
+        for (k = 6; k >= 0; k -= 2)
+#define A(n) for (n = 4; n < 5; n++)
+          D(m)
+#undef A
+            {
+              #pragma omp atomic write
+              c[i][j][k] = 1;
+#define A(n) ,n
+#define E(n) C(n##0) C(n##1) C(n##2) B(n##30) B(n##31) A(n##320) A(n##321)
+              #pragma omp ordered doacross (sink: i, j, k + 2 E(m)) \
+                                  doacross (sink:i - 2, j + 1, k - 4 E(m)) \
+                                  doacross(sink: i - 1, j - 2, k - 2 E(m))
+              if (k <= 4)
+                {
+                  #pragma omp atomic read
+                  l = c[i][j][k + 2];
+                  if (l < 2)
+                    abort ();
+                }
+              #pragma omp atomic write
+              c[i][j][k] = 2;
+              if (i >= 4 && j < 7 && k >= 4)
+                {
+                  #pragma omp atomic read
+                  l = c[i - 2][j + 1][k - 4];
+                  if (l < 2)
+                    abort ();
+                }
+              if (i >= 3 && j >= 4 && k >= 2)
+                {
+                  #pragma omp atomic read
+                  l = c[i - 1][j - 2][k - 2];
+                  if (l < 2)
+                    abort ();
+                }
+              #pragma omp ordered doacross (source : omp_cur_iteration)
+              #pragma omp atomic write
+              c[i][j][k] = 3;
+            }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (j = 0; j < N / 16 - 1; j++)
+      for (k = 0; k < 8; k += 2)
+        for (i = 3; i <= 5 + f; i++)
+          {
+            #pragma omp atomic write
+            g[j][k][i] = 1;
+            #pragma omp ordered doacross(sink: j, k - 2, i - 1) \
+                                doacross(sink: j - 2, k - 2, i + 1)
+            #pragma omp ordered doacross(sink: j - 3, k + 2, i - 2)
+            if (k >= 2 && i > 3)
+              {
+                #pragma omp atomic read
+                l = g[j][k - 2][i - 1];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp atomic write
+            g[j][k][i] = 2;
+            if (j >= 2 && k >= 2 && i < 5)
+              {
+                #pragma omp atomic read
+                l = g[j - 2][k - 2][i + 1];
+                if (l < 2)
+                  abort ();
+              }
+            if (j >= 3 && k < N / 16 - 3 && i == 5)
+              {
+                #pragma omp atomic read
+                l = g[j - 3][k + 2][i - 2];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp ordered doacross(source :)
+            #pragma omp atomic write
+            g[j][k][i] = 3;
+          }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+        for (k = 0; k < d; k++)
+          for (l = 0; l < d + 2; l++)
+            {
+              #pragma omp ordered doacross (source : omp_cur_iteration)
+              #pragma omp ordered doacross (sink:i - 2, j + 2, k - 2, l)
+              if (!e)
+                abort ();
+            }
+    #pragma omp single
+    {
+      if (i != 3 || j != -1 || k != 0)
+        abort ();
+      i = 8; j = 9; k = 10;
+    }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k, m)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+        for (k = 0; k < d + 2; k++)
+          for (m = 0; m < d; m++)
+            {
+              #pragma omp ordered doacross (source:)
+              #pragma omp ordered doacross (sink:i - 2, j + 2, k - 2, m)
+              abort ();
+            }
+    #pragma omp single
+    if (i != 3 || j != -1 || k != 2 || m != 0)
+      abort ();
+    #pragma omp for collapse(2) ordered(4) nowait
+    for (i = 2; i < f + 3; i++)
+      for (j = d; j > 0; j--)
+        for (k = 0; k < d + 2; k++)
+          for (l = 0; l < d + 4; l++)
+            {
+              #pragma omp ordered doacross (source : omp_cur_iteration)
+              #pragma omp ordered doacross (sink:i - 2, j + 2, k - 2, l)
+              if (!e)
+                abort ();
+            }
+    #pragma omp for nowait
+    for (i = 0; i < N; i++)
+      if (a[i] != 3 * (i >= 1))
+        abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+        for (k = 0; k < 4; k++)
+          if (b[i][j][k] != 3 * (i >= 3 && i < N / 16 - 1 && (j & 1) == 0 && k >= 1))
+            abort ();
+    #pragma omp for collapse(3) nowait
+    for (i = 0; i < N / 32; i++)
+      for (j = 0; j < 8; j++)
+        for (k = 0; k < 8; k++)
+          if (c[i][j][k] != 3 * (i >= 2 && j >= 2 && (k & 1) == 0))
+            abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+        for (k = 0; k < 6; k++)
+          if (g[i][j][k] != 3 * (i < N / 16 - 1 && (j & 1) == 0 && k >= 3))
+            abort ();
+  }
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/doacross-5.c.jj	2022-09-08 11:24:26.787237389 +0200
+++ libgomp/testsuite/libgomp.c/doacross-5.c	2022-09-08 11:47:42.248642381 +0200
@@ -0,0 +1,198 @@
+extern void abort (void);
+
+#define N 256
+int a[N], b[N / 16][8][4], c[N / 32][8][8];
+volatile int d, e;
+
+int
+main ()
+{
+  int i, j, k, l, m;
+  #pragma omp parallel private (l)
+  {
+    #pragma omp for schedule(static, 1) ordered (1) nowait
+    for (i = 0; i < N; i++)
+      {
+        #pragma omp atomic write
+        a[i] = 1;
+        #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+        if (i)
+          {
+            #pragma omp atomic read
+            l = a[i - 1];
+            if (l < 2)
+              abort ();
+          }
+        #pragma omp atomic write
+        a[i] = 2;
+        if (i < N - 1)
+          {
+            #pragma omp atomic read
+            l = a[i + 1];
+            if (l == 3)
+              abort ();
+          }
+        #pragma omp ordered doacross(source :)
+        #pragma omp atomic write
+        a[i] = 3;
+      }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (i = 2; i < N / 16 - 1; i++)
+      for (j = 0; j < 8; j += 2)
+        for (k = 1; k <= 3; k++)
+          {
+            #pragma omp atomic write
+            b[i][j][k] = 1;
+            #pragma omp ordered doacross(sink: omp_cur_iteration - 1) \
+                                doacross(sink: i - 2, j - 2, k + 1)
+            #pragma omp ordered doacross(sink: i - 3, j + 2, k - 2)
+            if (i != 2 || j || k != 1)
+              {
+                if (k != 1)
+                  #pragma omp atomic read
+                  l = b[i][j][k - 1];
+                else if (j)
+                  #pragma omp atomic read
+                  l = b[i][j - 2][3];
+                else
+                  #pragma omp atomic read
+                  l = b[i - 1][6][3];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp atomic write
+            b[i][j][k] = 2;
+            if (i >= 4 && j >= 2 && k < 3)
+              {
+                #pragma omp atomic read
+                l = b[i - 2][j - 2][k + 1];
+                if (l < 2)
+                  abort ();
+              }
+            if (i >= 5 && j < N / 16 - 3 && k == 3)
+              {
+                #pragma omp atomic read
+                l = b[i - 3][j + 2][k - 2];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp ordered doacross(source : omp_cur_iteration)
+            #pragma omp atomic write
+            b[i][j][k] = 3;
+          }
+#define A(n) int n;
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3)
+    D(m)
+#undef A
+    #pragma omp for collapse (2) ordered(61) schedule(dynamic, 15)
+    for (i = 0; i < N / 32; i++)
+      for (j = 7; j > 1; j--)
+        for (k = 6; k >= 0; k -= 2)
+#define A(n) for (n = 4; n < 5; n++)
+          D(m)
+#undef A
+            {
+              #pragma omp atomic write
+              c[i][j][k] = 1;
+#define A(n) ,n
+#define E(n) C(n##0) C(n##1) C(n##2) B(n##30) B(n##31) A(n##320) A(n##321)
+              #pragma omp ordered doacross (sink: i, j, k + 2 E(m)) \
+                                  doacross (sink:omp_cur_iteration - 1) \
+                                  doacross(sink: i - 1, j - 2, k - 2 E(m))
+              if (k <= 4)
+                {
+                  #pragma omp atomic read
+                  l = c[i][j][k + 2];
+                  if (l < 2)
+                    abort ();
+                }
+              #pragma omp atomic write
+              c[i][j][k] = 2;
+              if (i || j != 7 && k != 6)
+                {
+                  if (k != 6)
+                    #pragma omp atomic read
+                    l = c[i][j][k + 2];
+                  else if (j != 7)
+                    #pragma omp atomic read
+                    l = c[i][j + 1][0];
+                  else
+                    #pragma omp atomic read
+                    l = c[i - 1][2][0];
+                  if (l < 2)
+                    abort ();
+                }
+              if (i >= 1 && j >= 4 && k >= 2)
+                {
+                  #pragma omp atomic read
+                  l = c[i - 1][j - 2][k - 2];
+                  if (l < 2)
+                    abort ();
+                }
+              #pragma omp ordered doacross (source: )
+              #pragma omp atomic write
+              c[i][j][k] = 3;
+            }
+
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k)
+    for (i = 0; i < d + 1; i++)
+      for (j = d + 1; j >= 0; j--)
+        for (k = 0; k < d; k++)
+          for (l = 0; l < d + 2; l++)
+            {
+              #pragma omp ordered doacross (source : omp_cur_iteration)
+              #pragma omp ordered doacross (sink: omp_cur_iteration - 1)
+              if (!e)
+                abort ();
+            }
+    #pragma omp single
+    {
+      if (i != 1 || j != -1 || k != 0)
+        abort ();
+      i = 8; j = 9; k = 10;
+    }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k, m)
+    for (i = 0; i < d + 1; i++)
+      for (j = d + 1; j >= 0; j--)
+        for (k = 0; k < d + 2; k++)
+          for (m = 0; m < d; m++)
+            {
+              #pragma omp ordered doacross (source : )
+              #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+              abort ();
+            }
+    #pragma omp single
+    if (i != 1 || j != -1 || k != 2 || m != 0)
+      abort ();
+    #pragma omp for collapse(2) ordered(4) nowait
+    for (i = 0; i < d + 1; i++)
+      for (j = d; j > 0; j--)
+        for (k = 0; k < d + 2; k++)
+          for (l = 0; l < d + 4; l++)
+            {
+              #pragma omp ordered doacross (source : omp_cur_iteration)
+              #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+              if (!e)
+                abort ();
+            }
+    #pragma omp for nowait
+    for (i = 0; i < N; i++)
+      if (a[i] != 3)
+        abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+        for (k = 0; k < 4; k++)
+          if (b[i][j][k] != 3 * (i >= 2 && i < N / 16 - 1 && (j & 1) == 0 && k >= 1))
+            abort ();
+    #pragma omp for collapse(3) nowait
+    for (i = 0; i < N / 32; i++)
+      for (j = 0; j < 8; j++)
+        for (k = 0; k < 8; k++)
+          if (c[i][j][k] != 3 * (j >= 2 && (k & 1) == 0))
+            abort ();
+  }
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/doacross-6.c.jj	2022-09-08 12:12:09.894080951 +0200
+++ libgomp/testsuite/libgomp.c/doacross-6.c	2022-09-08 12:36:18.902768322 +0200
@@ -0,0 +1,231 @@
+extern void abort (void);
+
+#define N 256
+int a[N], b[N / 16][8][4], c[N / 32][8][8], g[N / 16][8][6];
+volatile int d, e;
+volatile unsigned long long f;
+
+int
+main ()
+{
+  unsigned long long i;
+  int j, k, l, m;
+  #pragma omp parallel private (l)
+  {
+    #pragma omp for schedule(static, 1) ordered (1) nowait
+    for (i = 1; i < N + f; i++)
+      {
+        #pragma omp atomic write
+        a[i] = 1;
+        #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+        if (i > 1)
+          {
+            #pragma omp atomic read
+            l = a[i - 1];
+            if (l < 2)
+              abort ();
+          }
+        #pragma omp atomic write
+        a[i] = 2;
+        if (i < N - 1)
+          {
+            #pragma omp atomic read
+            l = a[i + 1];
+            if (l == 3)
+              abort ();
+          }
+        #pragma omp ordered doacross(source : omp_cur_iteration)
+        #pragma omp atomic write
+        a[i] = 3;
+      }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (i = 3; i < N / 16 - 1 + f; i++)
+      for (j = 0; j < 8; j += 2)
+        for (k = 1; k <= 3; k++)
+          {
+            #pragma omp atomic write
+            b[i][j][k] = 1;
+            #pragma omp ordered doacross(sink: i, j - 2, k - 1) \
+                                doacross(sink: i - 2, j - 2, k + 1)
+            #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+            if (j >= 2 && k > 1)
+              {
+                #pragma omp atomic read
+                l = b[i][j - 2][k - 1];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp atomic write
+            b[i][j][k] = 2;
+            if (i >= 5 && j >= 2 && k < 3)
+              {
+                #pragma omp atomic read
+                l = b[i - 2][j - 2][k + 1];
+                if (l < 2)
+                  abort ();
+              }
+            if (i != 3 || j || k != 1)
+              {
+                if (k != 1)
+                  #pragma omp atomic read
+                  l = b[i][j][k - 1];
+                else if (j)
+                  #pragma omp atomic read
+                  l = b[i][j - 2][3];
+                else
+                  #pragma omp atomic read
+                  l = b[i - 1][6][3];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp ordered doacross(source:)
+            #pragma omp atomic write
+            b[i][j][k] = 3;
+          }
+#define A(n) int n;
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3)
+    D(m)
+#undef A
+    #pragma omp for collapse (2) ordered(61) schedule(dynamic, 15)
+    for (i = 2; i < N / 32 + f; i++)
+      for (j = 7; j > 1; j--)
+        for (k = 6; k >= 0; k -= 2)
+#define A(n) for (n = 4; n < 5; n++)
+          D(m)
+#undef A
+            {
+              #pragma omp atomic write
+              c[i][j][k] = 1;
+              #pragma omp ordered doacross (sink: omp_cur_iteration - 1)
+              if (i != 2 || j != 7 || k != 6)
+                {
+                  if (k != 6)
+                    #pragma omp atomic read
+                    l = c[i][j][k + 2];
+                  else if (j != 7)
+                    #pragma omp atomic read
+                    l = c[i][j + 1][0];
+                  else
+                    #pragma omp atomic read
+                    l = c[i - 1][2][0];
+                  if (l < 2)
+                    abort ();
+                }
+              #pragma omp atomic write
+              c[i][j][k] = 2;
+              #pragma omp ordered doacross (source:)
+              #pragma omp atomic write
+              c[i][j][k] = 3;
+            }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (j = 0; j < N / 16 - 1; j++)
+      for (k = 0; k < 8; k += 2)
+        for (i = 3; i <= 5 + f; i++)
+          {
+            #pragma omp atomic write
+            g[j][k][i] = 1;
+            #pragma omp ordered doacross(sink: j, k - 2, i - 1) \
+                                doacross(sink: omp_cur_iteration - 1)
+            #pragma omp ordered doacross(sink: j - 3, k + 2, i - 2)
+            if (k >= 2 && i > 3)
+              {
+                #pragma omp atomic read
+                l = g[j][k - 2][i - 1];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp atomic write
+            g[j][k][i] = 2;
+            if (j || k || i != 3)
+              {
+                if (i != 3)
+                  #pragma omp atomic read
+                  l = g[j][k][i - 1];
+                else if (k)
+                  #pragma omp atomic read
+                  l = g[j][k - 2][5 + f];
+                else
+                  #pragma omp atomic read
+                  l = g[j - 1][6][5 + f];
+                if (l < 2)
+                  abort ();
+              }
+            if (j >= 3 && k < N / 16 - 3 && i == 5)
+              {
+                #pragma omp atomic read
+                l = g[j - 3][k + 2][i - 2];
+                if (l < 2)
+                  abort ();
+              }
+            #pragma omp ordered doacross(source : omp_cur_iteration)
+            #pragma omp atomic write
+            g[j][k][i] = 3;
+          }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+        for (k = 0; k < d; k++)
+          for (l = 0; l < d + 2; l++)
+            {
+              #pragma omp ordered doacross (source : omp_cur_iteration)
+              #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+              if (!e)
+                abort ();
+            }
+    #pragma omp single
+    {
+      if (i != 3 || j != -1 || k != 0)
+        abort ();
+      i = 8; j = 9; k = 10;
+    }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k, m)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+        for (k = 0; k < d + 2; k++)
+          for (m = 0; m < d; m++)
+            {
+              #pragma omp ordered doacross (source : omp_cur_iteration)
+              #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+              abort ();
+            }
+    #pragma omp single
+    if (i != 3 || j != -1 || k != 2 || m != 0)
+      abort ();
+    #pragma omp for collapse(2) ordered(4) nowait
+    for (i = 2; i < f + 3; i++)
+      for (j = d; j > 0; j--)
+        for (k = 0; k < d + 2; k++)
+          for (l = 0; l < d + 4; l++)
+            {
+              #pragma omp ordered doacross (source:)
+              #pragma omp ordered doacross (sink:omp_cur_iteration-1)
+              if (!e)
+                abort ();
+            }
+    #pragma omp for nowait
+    for (i = 0; i < N; i++)
+      if (a[i] != 3 * (i >= 1))
+        abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+        for (k = 0; k < 4; k++)
+          if (b[i][j][k] != 3 * (i >= 3 && i < N / 16 - 1 && (j & 1) == 0 && k >= 1))
+            abort ();
+    #pragma omp for collapse(3) nowait
+    for (i = 0; i < N / 32; i++)
+      for (j = 0; j < 8; j++)
+        for (k = 0; k < 8; k++)
+          if (c[i][j][k] != 3 * (i >= 2 && j >= 2 && (k & 1) == 0))
+            abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+        for (k = 0; k < 6; k++)
+          if (g[i][j][k] != 3 * (i < N / 16 - 1 && (j & 1) == 0 && k >= 3))
+            abort ();
+  }
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/doacross-7.c.jj	2022-09-08 12:37:05.964141298 +0200
+++ libgomp/testsuite/libgomp.c/doacross-7.c	2022-09-08 12:37:52.923515613 +0200
@@ -0,0 +1,231 @@
+extern void abort (void);
+
+#define N 256
+int a[N], b[N / 16][8][4], c[N / 32][8][8], g[N / 16][8][6];
+volatile int d, e;
+volatile unsigned long long f;
+
+int
+main ()
+{
+  unsigned long long i;
+  int j, k, l, m;
+  #pragma omp parallel private (l)
+  {
+    #pragma omp for schedule(guided, 3) ordered (1) nowait
+    for (i = 1; i < N + f; i++)
+      {
+        #pragma omp atomic write
+        a[i] = 1;
+        #pragma omp ordered
doacross(sink: omp_cur_iteration - 1) + if (i > 1) + { + #pragma omp atomic read + l = a[i - 1]; + if (l < 2) + abort (); + } + #pragma omp atomic write + a[i] = 2; + if (i < N - 1) + { + #pragma omp atomic read + l = a[i + 1]; + if (l == 3) + abort (); + } + #pragma omp ordered doacross(source : omp_cur_iteration) + #pragma omp atomic write + a[i] = 3; + } + #pragma omp for schedule(guided) ordered (3) nowait + for (i = 3; i < N / 16 - 1 + f; i++) + for (j = 0; j < 8; j += 2) + for (k = 1; k <= 3; k++) + { + #pragma omp atomic write + b[i][j][k] = 1; + #pragma omp ordered doacross(sink: i, j - 2, k - 1) \ + doacross(sink: i - 2, j - 2, k + 1) + #pragma omp ordered doacross(sink: omp_cur_iteration - 1) + if (j >= 2 && k > 1) + { + #pragma omp atomic read + l = b[i][j - 2][k - 1]; + if (l < 2) + abort (); + } + #pragma omp atomic write + b[i][j][k] = 2; + if (i >= 5 && j >= 2 && k < 3) + { + #pragma omp atomic read + l = b[i - 2][j - 2][k + 1]; + if (l < 2) + abort (); + } + if (i != 3 || j || k != 1) + { + if (k != 1) + #pragma omp atomic read + l = b[i][j][k - 1]; + else if (j) + #pragma omp atomic read + l = b[i][j - 2][3]; + else + #pragma omp atomic read + l = b[i - 1][6][3]; + if (l < 2) + abort (); + } + #pragma omp ordered doacross(source:) + #pragma omp atomic write + b[i][j][k] = 3; + } +#define A(n) int n; +#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) +#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) +#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) + D(m) +#undef A + #pragma omp for collapse (2) ordered(61) schedule(guided, 15) + for (i = 2; i < N / 32 + f; i++) + for (j = 7; j > 1; j--) + for (k = 6; k >= 0; k -= 2) +#define A(n) for (n = 4; n < 5; n++) + D(m) +#undef A + { + #pragma omp atomic write + c[i][j][k] = 1; + #pragma omp ordered doacross (sink: omp_cur_iteration - 1) + if (i != 2 || j != 7 || k != 6) + { + if (k != 6) + #pragma omp atomic read + l = c[i][j][k + 2]; + else if (j != 7) + #pragma omp atomic read + l = c[i][j + 1][0]; + else + #pragma 
omp atomic read + l = c[i - 1][2][0]; + if (l < 2) + abort (); + } + #pragma omp atomic write + c[i][j][k] = 2; + #pragma omp ordered doacross (source:) + #pragma omp atomic write + c[i][j][k] = 3; + } + #pragma omp for schedule(guided, 5) ordered (3) nowait + for (j = 0; j < N / 16 - 1; j++) + for (k = 0; k < 8; k += 2) + for (i = 3; i <= 5 + f; i++) + { + #pragma omp atomic write + g[j][k][i] = 1; + #pragma omp ordered doacross(sink: j, k - 2, i - 1) \ + doacross(sink: omp_cur_iteration - 1) + #pragma omp ordered doacross(sink: j - 3, k + 2, i - 2) + if (k >= 2 && i > 3) + { + #pragma omp atomic read + l = g[j][k - 2][i - 1]; + if (l < 2) + abort (); + } + #pragma omp atomic write + g[j][k][i] = 2; + if (j || k || i != 3) + { + if (i != 3) + #pragma omp atomic read + l = g[j][k][i - 1]; + else if (k) + #pragma omp atomic read + l = g[j][k - 2][5 + f]; + else + #pragma omp atomic read + l = g[j - 1][6][5 + f]; + if (l < 2) + abort (); + } + if (j >= 3 && k < N / 16 - 3 && i == 5) + { + #pragma omp atomic read + l = g[j - 3][k + 2][i - 2]; + if (l < 2) + abort (); + } + #pragma omp ordered doacross(source : omp_cur_iteration) + #pragma omp atomic write + g[j][k][i] = 3; + } + #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k) + for (i = 2; i < f + 3; i++) + for (j = d + 1; j >= 0; j--) + for (k = 0; k < d; k++) + for (l = 0; l < d + 2; l++) + { + #pragma omp ordered doacross (source : omp_cur_iteration) + #pragma omp ordered doacross (sink:omp_cur_iteration - 1) + if (!e) + abort (); + } + #pragma omp single + { + if (i != 3 || j != -1 || k != 0) + abort (); + i = 8; j = 9; k = 10; + } + #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k, m) + for (i = 2; i < f + 3; i++) + for (j = d + 1; j >= 0; j--) + for (k = 0; k < d + 2; k++) + for (m = 0; m < d; m++) + { + #pragma omp ordered doacross (source : omp_cur_iteration) + #pragma omp ordered doacross (sink:omp_cur_iteration - 1) + abort (); + } + #pragma omp single + if (i != 3 || j != -1 || k != 
2 || m != 0) + abort (); + #pragma omp for collapse(2) ordered(4) nowait + for (i = 2; i < f + 3; i++) + for (j = d; j > 0; j--) + for (k = 0; k < d + 2; k++) + for (l = 0; l < d + 4; l++) + { + #pragma omp ordered doacross (source:) + #pragma omp ordered doacross (sink:omp_cur_iteration-1) + if (!e) + abort (); + } + #pragma omp for nowait + for (i = 0; i < N; i++) + if (a[i] != 3 * (i >= 1)) + abort (); + #pragma omp for collapse(2) private(k) nowait + for (i = 0; i < N / 16; i++) + for (j = 0; j < 8; j++) + for (k = 0; k < 4; k++) + if (b[i][j][k] != 3 * (i >= 3 && i < N / 16 - 1 && (j & 1) == 0 && k >= 1)) + abort (); + #pragma omp for collapse(3) nowait + for (i = 0; i < N / 32; i++) + for (j = 0; j < 8; j++) + for (k = 0; k < 8; k++) + if (c[i][j][k] != 3 * (i >= 2 && j >= 2 && (k & 1) == 0)) + abort (); + #pragma omp for collapse(2) private(k) nowait + for (i = 0; i < N / 16; i++) + for (j = 0; j < 8; j++) + for (k = 0; k < 6; k++) + if (g[i][j][k] != 3 * (i < N / 16 - 1 && (j & 1) == 0 && k >= 3)) + abort (); + } + return 0; +}