Hi!
On 2023-02-28T11:56:01+0100, I wrote:
> I'm currently reviewing 'gomp_copy_host2dev', 'ephemeral' in a different
> context, and a question came up here;
> commit r13-706-g49d1a2f91325fa8cc011149e27e5093a988b3a49
> "OpenMP: Handle descriptors in target's firstprivate [PR104949]":
It doesn't seem as if we're going to address this question anytime soon,
but it also isn't necessary; the worrysome condition currently cannot
arise. I've therefore now documented that via 'assert (!aq)', and pushed
to master branch commit e8fec6998b656dac02d4bc6c69b35a0fb5611e87
"Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949]",
see attached.
Grüße
Thomas
> On 2022-05-11T19:33:00+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
>> this patch handles (for target regions)
>> firstprivate(array_descriptor)
>> by not only firstprivatizing the descriptor but also the data
>> it points to. This is done by turning it in omp-low.cc the clause
>> into
>> firstprivate(descr) firstprivate(descr.data)
>> and then attaching the latter to the former. That works by
>> adding an 'attach' after the last firstprivate (and checking
>> for it in libgomp). The attached-to device address for a
>> previous (here: the first) firstprivate is obtained by returning
>> the device address inside the hostaddrs[i] alias omp_arr array,
>> i.e. the compiler generates:
>> omp_arr.1 = &descr; /* firstprivate */
>> omp_arr.2 = descr.data; /* firstprivate */
>> omp_arr.3 = &omp_arr.1; /* attach; bias: &desc.data-&desc */
>> and libgomp then knows that the device address is in the
>> pointer.
>
>> Note: The code is not active for OpenACC. The existing code uses, e.g.,
>> 'goto oacc_firstprivate' – thus, the new code would be
>> partially active. I went for making it completely inactive for OpenACC
>> by adding one '!is_gimple_omp_oacc'.
>
> ACK.
>
>> I bet that a deep copy would be
>> also useful for OpenACC, but I have neither checked what the current
>> code does nor what the OpenACC spec says about this.
>
> Instead of adding corresponding handling to the OpenACC 'firstprivate'
> special code paths later on, I suggest that we first address known issues
> with OpenACC 'firstprivate' -- which probably may largely be achieved by
> in fact removing those 'goto oacc_firstprivate's and other special code
> paths? For example, see <https://gcc.gnu.org/PR92036>
> "OpenACC 'firstprivate' clause: initial value".
>
> That means, the following code currently isn't active for OpenACC, and
> given that OpenMP 'target' doesn't do asynchronous device execution
> (meaning: not in the way/implementation of OpenACC 'async'), it thus
> doesn't care about the 'ephemeral' argument to 'gomp_copy_host2dev', but
> still, for correctness (and once that code gets used for OpenACC):
>
>> OpenMP: Handle descriptors in target's firstprivate [PR104949]
>>
>> For allocatable/pointer arrays, a firstprivate to a device
>> not only needs to privatize the descriptor but also the actual
>> data. This is implemented as:
>> firstprivate(x) firstprivate(x.data) attach(x [bias: &x.data-&x)
>> where the address of x in device memory is saved in hostaddrs[i]
>> by libgomp and the middle end actually passes hostaddrs[i]' to
>> attach.
>
>> --- a/libgomp/target.c
>> +++ b/libgomp/target.c
>> @@ -1350,7 +1350,24 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
>> gomp_copy_host2dev (devicep, aq,
>> (void *) (tgt->tgt_start + tgt_size),
>> (void *) hostaddrs[i], len, false, cbufp);
>
> Here, passing 'ephemeral <- false' is correct, as 'h <- hostaddrs[i]'
> points to non-ephemeral data.
>
>> + /* Save device address in hostaddr to permit latter availablity
>> + when doing a deep-firstprivate with pointer attach. */
>> + hostaddrs[i] = (void *) (tgt->tgt_start + tgt_size);
>
> Here, we modify 'hostaddrs[i]' (itself -- *not* the data that the
> original 'hostaddrs[i]' points to), so the above 'gomp_copy_host2dev'
> with 'ephemeral <- false' is still correct, right?
>
>> tgt_size += len;
>> +
>> + /* If followed by GOMP_MAP_ATTACH, pointer assign this
>> + firstprivate to hostaddrs[i+1], which is assumed to contain a
>> + device address. */
>> + if (i + 1 < mapnum
>> + && (GOMP_MAP_ATTACH
>> + == (typemask & get_kind (short_mapkind, kinds, i+1))))
>> + {
>> + uintptr_t target = (uintptr_t) hostaddrs[i];
>> + void *devptr = *(void**) hostaddrs[i+1] + sizes[i+1];
>> + gomp_copy_host2dev (devicep, aq, devptr, &target,
>> + sizeof (void *), false, cbufp);
>
> However, 'h <- &target' here points to data in the local frame
> ('target'), which potentially goes out of scope before an asynchronous
> 'gomp_copy_host2dev' has completed. Thus, don't we have to pass here
> 'ephemeral <- true' instead of 'ephemeral <- false'? Or, actually
> instead of '&target', pass '&hostaddrs[i]', which then again points to
> non-ephemeral data? Is the latter safe to do, or are we potentially
> further down the line modifying the data that '&hostaddrs[i]' points to?
> (I got a bit lost in the use of 'hostaddrs[i]' here.)
>
>> + ++i;
>> + }
>> continue;
>> case GOMP_MAP_FIRSTPRIVATE_INT:
>> case GOMP_MAP_ZERO_LEN_ARRAY_SECTION:
>> @@ -2517,6 +2534,11 @@ copy_firstprivate_data (char *tgt, size_t mapnum, void **hostaddrs,
>> memcpy (tgt + tgt_size, hostaddrs[i], sizes[i]);
>> hostaddrs[i] = tgt + tgt_size;
>> tgt_size = tgt_size + sizes[i];
>> + if (i + 1 < mapnum && (kinds[i+1] & 0xff) == GOMP_MAP_ATTACH)
>> + {
>> + *(*(uintptr_t**) hostaddrs[i+1] + sizes[i+1]) = (uintptr_t) hostaddrs[i];
>> + ++i;
>> + }
>> }
>> }
>
> For reference, the 'gomp_copy_host2dev' code that I highlighted above
> still triggers for the following test cases only:
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-1.f90
>> @@ -0,0 +1,33 @@
>> +! PR fortran/104949
>> +
>> +implicit none (type,external)
>> +integer, allocatable :: A(:)
>> +A = [1,2,3,4,5,6]
>> +
>> +!$omp parallel firstprivate(A)
>> +!$omp master
>> + if (any (A /= [1,2,3,4,5])) error stop
>> + A(:) = [99,88,77,66,55]
>> +!$omp end master
>> +!$omp end parallel
>> +
>> +!$omp target firstprivate(A)
>> + if (any (A /= [1,2,3,4,5])) error stop
>> + A(:) = [99,88,77,66,55]
>> +!$omp end target
>> +if (any (A /= [1,2,3,4,5])) error stop
>> +
>> +!$omp parallel default(firstprivate)
>> +!$omp master
>> + if (any (A /= [1,2,3,4,5])) error stop
>> + A(:) = [99,88,77,66,55]
>> +!$omp end master
>> +!$omp end parallel
>> +if (any (A /= [1,2,3,4,5])) error stop
>> +
>> +!$omp target defaultmap(firstprivate)
>> + if (any (A /= [1,2,3,4,5])) error stop
>> + A(:) = [99,88,77,66,55]
>> +!$omp end target
>> +if (any (A /= [1,2,3,4,5])) error stop
>> +end
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-2.f90
>> @@ -0,0 +1,113 @@
>> +! PR fortran/104949
>> +
>> +module m
>> +use omp_lib
>> +implicit none (type, external)
>> +
>> +contains
>> +subroutine one
>> + integer, allocatable :: x(:)
>> + integer :: i
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x)
>> + if (allocated(x)) error stop
>> + !$omp end target
>> + if (allocated(x)) error stop
>> + end do
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x, i)
>> + if (allocated(x)) error stop
>> + x = [10,20,30,40] + i
>> + if (any (x /= [10,20,30,40] + i)) error stop
>> + ! This leaks memory!
>> + ! deallocate(x)
>> + !$omp end target
>> + if (allocated(x)) error stop
>> + end do
>> +
>> + x = [1,2,3,4]
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x, i)
>> + if (i <= 0) error stop
>> + if (.not.allocated(x)) error stop
>> + if (size(x) /= 4) error stop
>> + if (lbound(x,1) /= 1) error stop
>> + if (any (x /= [1,2,3,4])) error stop
>> + ! no reallocation, just malloced + assignment
>> + x = [10,20,30,40] + i
>> + if (any (x /= [10,20,30,40] + i)) error stop
>> + ! This leaks memory!
>> + ! deallocate(x)
>> + !$omp end target
>> + if (.not.allocated(x)) error stop
>> + if (size(x) /= 4) error stop
>> + if (lbound(x,1) /= 1) error stop
>> + if (any (x /= [1,2,3,4])) error stop
>> + end do
>> + deallocate(x)
>> +end
>> +
>> +subroutine two
>> + character(len=:), allocatable :: x(:)
>> + character(len=5) :: str
>> + integer :: i
>> +
>> + str = "abcde" ! work around for PR fortran/91544
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x)
>> + if (allocated(x)) error stop
>> + !$omp end target
>> + if (allocated(x)) error stop
>> + end do
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x, i)
>> + if (allocated(x)) error stop
>> + ! no reallocation, just malloced + assignment
>> + x = [character(len=2+i) :: str,"fhji","klmno"]
>> + if (len(x) /= 2+i) error stop
>> + if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop
>> + ! This leaks memory!
>> + ! deallocate(x)
>> + !$omp end target
>> + if (allocated(x)) error stop
>> + end do
>> +
>> + x = [character(len=4) :: "ABCDE","FHJI","KLMNO"]
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x, i)
>> + if (i <= 0) error stop
>> + if (.not.allocated(x)) error stop
>> + if (size(x) /= 3) error stop
>> + if (lbound(x,1) /= 1) error stop
>> + if (len(x) /= 4) error stop
>> + if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop
>> + !! Reallocation runs into the issue PR fortran/105538
>> + !!
>> + !!x = [character(len=2+i) :: str,"fhji","klmno"]
>> + !!if (len(x) /= 2+i) error stop
>> + !!if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop
>> + !! This leaks memory!
>> + !! deallocate(x)
>> + ! Just assign:
>> + x = [character(len=4) :: "abcde","fhji","klmno"]
>> + if (any (x /= [character(len=4) :: "abcde","fhji","klmno"])) error stop
>> + !$omp end target
>> + if (.not.allocated(x)) error stop
>> + if (lbound(x,1) /= 1) error stop
>> + if (size(x) /= 3) error stop
>> + if (len(x) /= 4) error stop
>> + if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop
>> + end do
>> + deallocate(x)
>> +end
>> +end module m
>> +
>> +use m
>> +call one
>> +call two
>> +end
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-3.f90
>> @@ -0,0 +1,24 @@
>> +implicit none
>> + integer, allocatable :: x(:)
>> + x = [1,2,3,4]
>> + call foo(x)
>> + if (any (x /= [1,2,3,4])) error stop
>> + call foo()
>> +contains
>> +subroutine foo(c)
>> + integer, allocatable, optional :: c(:)
>> + logical :: is_present
>> + is_present = present (c)
>> + !$omp target firstprivate(c)
>> + if (is_present) then
>> + if (.not. allocated(c)) error stop
>> + if (any (c /= [1,2,3,4])) error stop
>> + c = [99,88,77,66]
>> + if (any (c /= [99,88,77,66])) error stop
>> + end if
>> + !$omp end target
>> + if (is_present) then
>> + if (any (c /= [1,2,3,4])) error stop
>> + end if
>> +end
>> +end
>
>
> Grüße
> Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955