From patchwork Thu Oct 20 10:19:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Schwinge X-Patchwork-Id: 6140 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp18921wrr; Thu, 20 Oct 2022 03:20:21 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4hlARwtCQbkPNUqoQKUJ9IB+gZuqKmTZyLbc5/7sYQ/tlL2pmyhPwLoEhbb9z8n7Y6rTfX X-Received: by 2002:a17:906:5dda:b0:78d:e7d2:7499 with SMTP id p26-20020a1709065dda00b0078de7d27499mr10428435ejv.588.1666261221508; Thu, 20 Oct 2022 03:20:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666261221; cv=none; d=google.com; s=arc-20160816; b=DQFEf+z5ZaxTiWezwA6+82nwEp61YezVAysn/32u1MLAHP2lL1cRKrASOB+CUfMBqz nysAuXDchNQXFmusUXknUaMCRLM4h/D4eaqZxeexRvN8uAaWN2iiYhhOIkMJ7TcvyNEF hmFzcu5+560gKqBeF7QLV3JIgOkjGZJZuYxM4ejWVW7+7ieorta4AzLz/hVP6bIv61tb dKui/rsUp7pWoHTHT+3eEjJaZj9ZjEVxtkvqT/zGUdwy3/dM2cbg/2Qu/9rnb925scMP 4UM4u7ZkolxoqRAGardVilsQpLNez1sZN4tnlfT2uzJSzOWZozytcUhvFmAODjzBMGHE XB5g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:mime-version:message-id:date :user-agent:references:in-reply-to:subject:to:from:ironport-sdr :dmarc-filter:delivered-to; bh=QQob7x/U5q4dXHZPet2VHqlGnN1z/yEXukLQTxPt/mA=; b=XltSO59BqMdDmdRNUKjBkHMHdontIEoIGqFfgjw11pzjrePXSYEoEBFACDtO/htxTM X4pD34SOGX1NLkSmtwHl7OBWfMRmUzs7m1ldlL99BcIBHL2Url3N8UiRFPbsnP48TeSP nsili1Fot31xCo/cemiIXHWAE9U/mtCuJXMOePjpVrIIF/vvdeWXrIYtUNLhgfPrXefz 2B7nW2qE0zOOvJaryKdAbr18+ra3yKNofu0tb4wJLV6P1ToAsArqiGQC7Olx2fxGAlVB xgN4pIMhFKnLvW76s4dWNDMgjxqPLbSlEb3EDlPVxEzfhpALl2f3my0qfLEfdBl7N045 13pw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id et3-20020a056402378300b004534c7d4ebfsi15443156edb.434.2022.10.20.03.20.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 20 Oct 2022 03:20:21 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C8B7A38515C5 for ; Thu, 20 Oct 2022 10:20:06 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa2.mentor.iphmx.com (esa2.mentor.iphmx.com [68.232.141.98]) by sourceware.org (Postfix) with ESMTPS id 2ED1D38561B5; Thu, 20 Oct 2022 10:19:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 2ED1D38561B5 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.95,198,1661846400"; d="scan'208,223";a="85202633" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa2.mentor.iphmx.com with ESMTP; 20 Oct 2022 02:19:35 -0800 IronPort-SDR: W9gIUmvzN6vYSN6XN96/wI1kgcPS8wepuSt08+EC8TUCDmbrdFgm+Q0XEUGZXaj03uJ7JsJmkQ L0/cKZtMXS817LVis6M94M1SYURpgtpFfeeocNBY2EPhA80yqPe2i0Tje/2B8YEuQwsOLjyChl z/OAtQDJMjuvg/HC0NGtERJPYASyit/ZO5av+vIvzuLM+MqKfCM37dzpHsCyfAeBoqAq4tKOzh ddeH4qVGUgpTNod2o8lKRoKfs/RFEGgUMYVW9li4RMq3NisWUgVx38TVQ10Keke4Jhku1+JsrZ H4c= From: Thomas Schwinge To: Julian Brown , Subject: Add 'libgomp.oacc-c-c++-common/private-big-1.c' [PR105421] (was: amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421]) In-Reply-To: <87lepahkt3.fsf@euler.schwinge.homeip.net> References: <20221014133856.3388109-1-julian@codesourcery.com> <87lepahkt3.fsf@euler.schwinge.homeip.net> User-Agent: Notmuch/0.29.3+94~g74c3f1b (https://notmuchmail.org) Emacs/27.1 (x86_64-pc-linux-gnu) Date: Thu, 20 Oct 2022 12:19:28 +0200 Message-ID: <87h6zyhk5r.fsf@euler.schwinge.homeip.net> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-15.mgc.mentorg.com (139.181.222.15) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Andrew Stubbs , fortran@gcc.gnu.org Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747201526491664139?= X-GMAIL-MSGID: =?utf-8?q?1747201526491664139?= Hi! On 2022-10-20T12:05:28+0200, I wrote: > On 2022-10-14T13:38:55+0000, Julian Brown wrote: >> The GCN backend uses a heuristic to determine whether to use FLAT or >> GLOBAL addressing in a particular (offload) function: namely, if a >> function takes a pointer-to-scalar parameter, it is assumed that the >> pointer may refer to "flat scratch" space, and thus FLAT addressing must >> be used instead of GLOBAL. >> >> I came up with this heuristic initially whilst working on support for >> moving OpenACC gang-private variables into local-data share (scratch) >> memory. The assumption that only scalar variables would be transformed in >> that way turned out to be wrong. For example, [...] >> Fortran compiler-generated temporary structures were treated >> as gang private and moved to LDS space, typically overflowing the region >> allocated for such variables. [...] >> there may be other cases of structs moving to LDS >> space now or in the future that this patch may be needed for. When I (back then) had looked into PR105421 "GCN offloading, raised '-mgang-private-size': 'HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION'", I had been experimenting with different test codes, that all didn't exhibit this problem. Now I understand that 'struct' (as implied by PR105421's Fortran 'write', for example) was the crucial thing there (that is, 'AGGREGATE_TYPE_P (TREE_TYPE (TREE_VALUE (arg)))' in context of the previous code). With... > pushed to master branch commit 7c55755d4c760de326809636531478fd7419e1e5 > "amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421]" ... that addressed, I've now pushed to master branch commit c7ebee2378426eeca425ca5406af213a926f154c "Add 'libgomp.oacc-c-c++-common/private-big-1.c' [PR105421]", see attached. Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 From c7ebee2378426eeca425ca5406af213a926f154c Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Tue, 18 Oct 2022 00:13:47 +0200 Subject: [PATCH] Add 'libgomp.oacc-c-c++-common/private-big-1.c' [PR105421] After commit r13-3404-g7c55755d4c760de326809636531478fd7419e1e5 "amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421]", "big" private data now works for GCN offloading, too. PR target/105421 libgomp/ * testsuite/libgomp.oacc-c-c++-common/private-big-1.c: New. --- .../libgomp.oacc-c-c++-common/private-big-1.c | 100 ++++++++++++++++++ 1 file changed, 100 insertions(+) create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/private-big-1.c diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-big-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-big-1.c new file mode 100644 index 00000000000..c0e8db0c894 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-big-1.c @@ -0,0 +1,100 @@ +/* Test "big" private data. */ + +/* { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'. */ + +/* { dg-additional-options -fopt-info-all-omp } + { dg-additional-options --param=openacc-privatization=noisy } + { dg-additional-options -foffload=-fopt-info-all-omp } + { dg-additional-options -foffload=--param=openacc-privatization=noisy } + for testing/documenting aspects of that functionality. */ + +/* { dg-additional-options -Wopenacc-parallelism } for testing/documenting + aspects of that functionality. */ + +/* For GCN offloading compilation, we (expectedly) run into a + 'gang-private data-share memory exhausted' error: the default + '-mgang-private-size' is too small. Raise it so that 'uint32_t x[344]' plus + some internal-use data fits in: + { dg-additional-options -foffload-options=amdgcn-amdhsa=-mgang-private-size=1555 { target openacc_radeon_accel_selected } } */ + +/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName' + passed to 'incr' may be unset, and in that case, it will be set to [...]", + so to maintain compatibility with earlier Tcl releases, we manually + initialize counter variables: + { dg-line l_dummy[variable c_compute 0 c_loop 0] } + { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid + "WARNING: dg-line var l_dummy defined, but not used". */ + +#include +#include + + +/* Based on 'private-variables.c:loop_g_5'. */ + +/* To demonstrate PR105421 "GCN offloading, raised '-mgang-private-size': + 'HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION'", a 'struct' indirection, for + example, has been necessary in combination with a separate routine. */ + +struct data +{ + uint32_t *x; + uint32_t *arr; + uint32_t i; +}; + +#pragma acc routine worker +static void +loop_g_5_r(struct data *data) +{ + uint32_t *x = data->x; + uint32_t *arr = data->arr; + uint32_t i = data->i; + +#pragma acc loop /* { dg-line l_loop[incr c_loop] } */ + /* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */ + /* { dg-optimized {assigned OpenACC worker vector loop parallelism} {} { target *-*-* } l_loop$c_loop } */ + for (int j = 0; j < 320; j++) + arr[i * 320 + j] += x[(i * 320 + j) % 344]; +} + +void loop_g_5() +{ + uint32_t x[344], i, arr[320 * 320]; + + for (i = 0; i < 320 * 320; i++) + arr[i] = i; + + #pragma acc parallel copy(arr) + { + #pragma acc loop gang private(x) /* { dg-line l_loop[incr c_loop] } */ + /* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop$c_loop } + { dg-note {variable 'x' ought to be adjusted for OpenACC privatization level: 'gang'} {} { target *-*-* } l_loop$c_loop } + { dg-note {variable 'x' adjusted for OpenACC privatization level: 'gang'} {} { target { ! openacc_host_selected } } l_loop$c_loop } */ + /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */ + /* { dg-note {variable 'data' declared in block is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop$c_loop } + { dg-note {variable 'data' ought to be adjusted for OpenACC privatization level: 'gang'} {} { target *-*-* } l_loop$c_loop } + { dg-note {variable 'data' adjusted for OpenACC privatization level: 'gang'} {} { target { ! openacc_host_selected } } l_loop$c_loop } */ + /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */ + /* { dg-optimized {assigned OpenACC gang loop parallelism} {} { target *-*-* } l_loop$c_loop } */ + for (i = 0; i < 320; i++) + { + for (int j = 0; j < 344; j++) + x[j] = j * (2 + i); + + struct data data = { x, arr, i }; + loop_g_5_r(&data); /* { dg-line l_compute[incr c_compute] } */ + /* { dg-optimized {assigned OpenACC worker vector loop parallelism} {} { target *-*-* } l_compute$c_compute } */ + } + } + + for (i = 0; i < 320 * 320; i++) + assert(arr[i] == i + (i % 344) * (2 + (i / 320))); +} + + +int main () +{ + loop_g_5(); + + return 0; +} -- 2.35.1