From patchwork Fri Mar 10 14:58:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Schwinge X-Patchwork-Id: 67420 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp919006wrd; Fri, 10 Mar 2023 06:59:26 -0800 (PST) X-Google-Smtp-Source: AK7set/qnLD3nZWmkeBOiy6adC+oDLLL2w8tpEP0vsssA98uunbbfVgcEM0ehdbYF2HVPD1giRea X-Received: by 2002:a17:906:3ac1:b0:8af:1a8c:eaa0 with SMTP id z1-20020a1709063ac100b008af1a8ceaa0mr24377579ejd.75.1678460366416; Fri, 10 Mar 2023 06:59:26 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1678460366; cv=none; d=google.com; s=arc-20160816; b=tXotZ6yDhpYrHPOYzHizrK1Yix3iw5JYT/pt9cXj4hX6XZttKco7zx7j0VGXGSD7PW 4Lal6uCxVCOXkx9o/sobBIusAYHWtXd7HehZ50QCG/iL3kfYgxWh34mXEb3YczMIdtUS kAG2nucLqcYjCmuib+1O+ZnI6jo9E/ITE3jPKl5rb8USW9tYeS3fr3K3ypqdivxexo54 2lQryFoEedPEpWRG8/CwmXzw456ARPTruQTx41eHW8RXtEmzZgKLkq9RywKHYaB/+NLs DXk2QBm6U5ADawu01p7aEoeYXcbRbkJx9csL7ySxMjXRTsRD2q3BH6MEJqlo/irNnW7Z vIzQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:mime-version:message-id:date :user-agent:subject:cc:to:from:ironport-sdr:dmarc-filter :delivered-to; bh=5YZH+5KuoETn2ureX+j3Kqcw6j9wQS3ENnAGhf0Uw2I=; b=qyWq83uJc4y7euw39+6oiKr4Jce00g9P62slXzTSp5E+1eoZMWNzLSJPwv4qmzjetF JPiQ7GC20NdiqfWajcYxvciDxWAQ37SdFZ175VYRU3rvG3Gk0gaCtmnksfdfntzGmu1F HHzN336U79w759KHhzdZuQfh+eFS0Uhf3bA9YqHg9kFrci0RSwJeTggihCsE/ocZ4bRX NCi3L3OmrMe3a9p/0wXHsGilzWvwThyX9MdigJT/pcleu+BFzap1KAlbRO4YZp/gI3/L 0USXPpN/XSOhX4/h2877xomSSiSq8qEhoudzf+WyCNbqFs3Rja7k72hkwKoagQdB/8ax zm0A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id mp22-20020a1709071b1600b008d0378ec19fsi2994815ejc.650.2023.03.10.06.59.26 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Mar 2023 06:59:26 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 85B8D3850867 for ; Fri, 10 Mar 2023 14:58:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153]) by sourceware.org (Postfix) with ESMTPS id 56584385483E for ; Fri, 10 Mar 2023 14:58:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 56584385483E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.98,250,1673942400"; d="scan'208,223";a="103874348" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa1.mentor.iphmx.com with ESMTP; 10 Mar 2023 06:58:28 -0800 IronPort-SDR: PpJnNLhOSt46+DzNpiDEMnxBb5LEmvifozcqugMNhl/GK2Zj+4T/9cd/81M16tizZ0+ByqUkiR JzcLPLd0kCLtRiYNWRlwJDLIZuu8cPUH13Nj1AeecMU/uPJKX9ZPG26Hp8q380rHPKeNoLAm70 U7CIfdOqBxNdMAaQkLQIhaa0AQ3Fw5emdcd+/pYzQHfKjOgl15wRT5r4JNwh40jPFylsp8Bck+ 2nldDAAWqq6cacUPPWHiXuOFG6iz280lnUbBLnZE+rR3s56/eJ28qGxdyfThz2GuYFrXuBXszN zLM= From: Thomas Schwinge To: CC: Julian Brown , Andrew Stubbs , Tom de Vries , Tobias Burnus Subject: Simplify OpenACC 'no_create' clause implementation User-Agent: Notmuch/0.29.3+94~g74c3f1b (https://notmuchmail.org) Emacs/28.2 (x86_64-pc-linux-gnu) Date: Fri, 10 Mar 2023 15:58:22 +0100 Message-ID: <875yb8bqlt.fsf@euler.schwinge.homeip.net> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP, UNWANTED_LANGUAGE_BODY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759993257464665237?= X-GMAIL-MSGID: =?utf-8?q?1759993257464665237?= Hi! Pushed to master branch commit 199867d07be65cb0227a318ebf42b8376ca09313 "Simplify OpenACC 'no_create' clause implementation", see attached. Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 From 199867d07be65cb0227a318ebf42b8376ca09313 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Mon, 27 Feb 2023 12:02:02 +0100 Subject: [PATCH] Simplify OpenACC 'no_create' clause implementation For 'OFFSET_INLINED', 'gomp_map_val' does the right thing, and we may then simplify the device plugins accordingly. This is a follow-up to Subversion r279551 (Git commit a6163563f2ce502bd4ef444bd5de33570bb8eeb1) "Add OpenACC 2.6's no_create", Subversion r279622 (Git commit 5bcd470bf0749e1f56d05dd43aa9584ff2e3a090) "Use gomp_map_val for OpenACC host-to-device address translation". libgomp/ * target.c (gomp_map_vars_internal): Use 'OFFSET_INLINED' for 'GOMP_MAP_IF_PRESENT'. * plugin/plugin-gcn.c (gcn_exec, GOMP_OFFLOAD_openacc_exec) (GOMP_OFFLOAD_openacc_async_exec): Adjust. * plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec) (GOMP_OFFLOAD_openacc_async_exec): Likewise. * testsuite/libgomp.oacc-c-c++-common/no_create-1.c: Add 'async' testing. * testsuite/libgomp.oacc-c-c++-common/no_create-2.c: Likewise. --- libgomp/plugin/plugin-gcn.c | 18 +++++------ libgomp/plugin/plugin-nvptx.c | 19 ++++++------ libgomp/target.c | 2 +- .../libgomp.oacc-c-c++-common/no_create-1.c | 30 +++++++++++++++---- .../libgomp.oacc-c-c++-common/no_create-2.c | 12 +++++++- 5 files changed, 54 insertions(+), 27 deletions(-) diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c index 96920a48d2e..954a140ba5e 100644 --- a/libgomp/plugin/plugin-gcn.c +++ b/libgomp/plugin/plugin-gcn.c @@ -3064,7 +3064,7 @@ wait_queue (struct goacc_asyncqueue *aq) /* Execute an OpenACC kernel, synchronously or asynchronously. */ static void -gcn_exec (struct kernel_info *kernel, size_t mapnum, void **hostaddrs, +gcn_exec (struct kernel_info *kernel, size_t mapnum, void **devaddrs, unsigned *dims, void *targ_mem_desc, bool async, struct goacc_asyncqueue *aq) { @@ -3077,9 +3077,7 @@ gcn_exec (struct kernel_info *kernel, size_t mapnum, void **hostaddrs, /* devaddrs must be double-indirect on the target. */ void **ind_da = alloc_by_agent (kernel->agent, sizeof (void*) * mapnum); for (size_t i = 0; i < mapnum; i++) - hsa_fns.hsa_memory_copy_fn (&ind_da[i], - devaddrs[i] ? &devaddrs[i] : &hostaddrs[i], - sizeof (void *)); + hsa_fns.hsa_memory_copy_fn (&ind_da[i], &devaddrs[i], sizeof (void *)); struct hsa_kernel_description *hsa_kernel_desc = NULL; for (unsigned i = 0; i < kernel->module->image_desc->kernel_count; i++) @@ -3887,27 +3885,27 @@ GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void *tgt_vars, void GOMP_OFFLOAD_openacc_exec (void (*fn_ptr) (void *), size_t mapnum, - void **hostaddrs, void **devaddrs, unsigned *dims, + void **hostaddrs __attribute__((unused)), + void **devaddrs, unsigned *dims, void *targ_mem_desc) { struct kernel_info *kernel = (struct kernel_info *) fn_ptr; - gcn_exec (kernel, mapnum, hostaddrs, devaddrs, dims, targ_mem_desc, false, - NULL); + gcn_exec (kernel, mapnum, devaddrs, dims, targ_mem_desc, false, NULL); } /* Run an asynchronous OpenACC kernel on the specified queue. */ void GOMP_OFFLOAD_openacc_async_exec (void (*fn_ptr) (void *), size_t mapnum, - void **hostaddrs, void **devaddrs, + void **hostaddrs __attribute__((unused)), + void **devaddrs, unsigned *dims, void *targ_mem_desc, struct goacc_asyncqueue *aq) { struct kernel_info *kernel = (struct kernel_info *) fn_ptr; - gcn_exec (kernel, mapnum, hostaddrs, devaddrs, dims, targ_mem_desc, true, - aq); + gcn_exec (kernel, mapnum, devaddrs, dims, targ_mem_desc, true, aq); } /* Create a new asynchronous thread and queue for running future kernels. */ diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index 1166807f68f..13e31156d36 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -742,8 +742,7 @@ link_ptx (CUmodule *module, const struct targ_ptx_obj *ptx_objs, } static void -nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, - unsigned *dims, void *targ_mem_desc, +nvptx_exec (void (*fn), size_t mapnum, unsigned *dims, void *targ_mem_desc, CUdeviceptr dp, CUstream stream) { struct targ_fn_descriptor *targ_fn = (struct targ_fn_descriptor *) fn; @@ -1530,7 +1529,8 @@ GOMP_OFFLOAD_free (int ord, void *ptr) void GOMP_OFFLOAD_openacc_exec (void (*fn) (void *), size_t mapnum, - void **hostaddrs, void **devaddrs, + void **hostaddrs __attribute__((unused)), + void **devaddrs, unsigned *dims, void *targ_mem_desc) { GOMP_PLUGIN_debug (0, " %s: prepare mappings\n", __FUNCTION__); @@ -1549,7 +1549,7 @@ GOMP_OFFLOAD_openacc_exec (void (*fn) (void *), size_t mapnum, size_t s = mapnum * sizeof (void *); hp = alloca (s); for (int i = 0; i < mapnum; i++) - hp[i] = (devaddrs[i] ? devaddrs[i] : hostaddrs[i]); + hp[i] = devaddrs[i]; CUDA_CALL_ASSERT (cuMemAlloc, &dp, s); if (profiling_p) goacc_profiling_acc_ev_alloc (thr, (void *) dp, s); @@ -1591,8 +1591,7 @@ GOMP_OFFLOAD_openacc_exec (void (*fn) (void *), size_t mapnum, } } - nvptx_exec (fn, mapnum, hostaddrs, devaddrs, dims, targ_mem_desc, - dp, NULL); + nvptx_exec (fn, mapnum, dims, targ_mem_desc, dp, NULL); CUresult r = CUDA_CALL_NOCHECK (cuStreamSynchronize, NULL); const char *maybe_abort_msg = "(perhaps abort was called)"; @@ -1617,7 +1616,8 @@ cuda_free_argmem (void *ptr) void GOMP_OFFLOAD_openacc_async_exec (void (*fn) (void *), size_t mapnum, - void **hostaddrs, void **devaddrs, + void **hostaddrs __attribute__((unused)), + void **devaddrs, unsigned *dims, void *targ_mem_desc, struct goacc_asyncqueue *aq) { @@ -1639,7 +1639,7 @@ GOMP_OFFLOAD_openacc_async_exec (void (*fn) (void *), size_t mapnum, block = (void **) GOMP_PLUGIN_malloc (2 * sizeof (void *) + s); hp = block + 2; for (int i = 0; i < mapnum; i++) - hp[i] = (devaddrs[i] ? devaddrs[i] : hostaddrs[i]); + hp[i] = devaddrs[i]; CUDA_CALL_ASSERT (cuMemAlloc, &dp, s); if (profiling_p) goacc_profiling_acc_ev_alloc (thr, (void *) dp, s); @@ -1688,8 +1688,7 @@ GOMP_OFFLOAD_openacc_async_exec (void (*fn) (void *), size_t mapnum, } } - nvptx_exec (fn, mapnum, hostaddrs, devaddrs, dims, targ_mem_desc, - dp, aq->cuda_stream); + nvptx_exec (fn, mapnum, dims, targ_mem_desc, dp, aq->cuda_stream); if (mapnum > 0) GOMP_OFFLOAD_openacc_async_queue_callback (aq, cuda_free_argmem, block); diff --git a/libgomp/target.c b/libgomp/target.c index 483851c95ac..0344f68a936 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -1207,7 +1207,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep, { /* Not present, hence, skip entry - including its MAP_POINTER, when existing. */ - tgt->list[i].offset = OFFSET_POINTER; + tgt->list[i].offset = OFFSET_INLINED; if (i + 1 < mapnum && ((typemask & get_kind (short_mapkind, kinds, i + 1)) == GOMP_MAP_POINTER)) diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1.c index 22e0c20cce9..05297d3a280 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-1.c @@ -22,15 +22,10 @@ main (int argc, char *argv[]) devptr[0] = &var; devptr[1] = &arr[2]; } - if (acc_hostptr (devptr[0]) != (void *) &var) __builtin_abort (); if (acc_hostptr (devptr[1]) != (void *) &arr[2]) __builtin_abort (); - - acc_delete (&var, sizeof (var)); - acc_delete (arr, N * sizeof (*arr)); - #if ACC_MEM_SHARED if (devptr[0] != &var) __builtin_abort (); @@ -43,6 +38,31 @@ main (int argc, char *argv[]) __builtin_abort (); #endif +#pragma acc parallel no_create(var, arr[0:N]) copyout(devptr) async + { + devptr[0] = &arr[N - 2]; + devptr[1] = &var; + } +#pragma acc wait + if (acc_hostptr (devptr[0]) != (void *) &arr[N - 2]) + __builtin_abort (); + if (acc_hostptr (devptr[1]) != (void *) &var) + __builtin_abort (); +#if ACC_MEM_SHARED + if (devptr[0] != &arr[N - 2]) + __builtin_abort (); + if (devptr[1] != &var) + __builtin_abort (); +#else + if (devptr[0] == &arr[N - 2]) + __builtin_abort (); + if (devptr[1] == &var) + __builtin_abort (); +#endif + + acc_delete (&var, sizeof (var)); + acc_delete (arr, N * sizeof (*arr)); + free (arr); return 0; diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2.c index fbd01a25956..202092fe8a8 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/no_create-2.c @@ -18,12 +18,22 @@ main (int argc, char *argv[]) devptr[0] = &var; devptr[1] = &arr[2]; } - if (devptr[0] != &var) __builtin_abort (); if (devptr[1] != &arr[2]) __builtin_abort (); +#pragma acc parallel no_create(var, arr[0:N]) copyout(devptr) async + { + devptr[0] = &arr[N - 2]; + devptr[1] = &var; + } +#pragma acc wait + if (devptr[0] != &arr[N - 2]) + __builtin_abort (); + if (devptr[1] != &var) + __builtin_abort (); + free (arr); return 0; -- 2.25.1