OpenMP: Fix reverse offload GOMP_TARGET_REV IFN corner cases [PR107236]
Commit Message
Found when playing around with reverse offload once I used 'omp target parallel'.
The other issue showed up when running the testsuite (which is done with -O2).
In all cases, the ICE is in expand_GOMP_TARGET_REV, the expand function of
this IFN, which should be unreachable.
Note: ENABLE_OFFLOADING inside the compiler must evaluate to true for this to
show up as an ICE; otherwise, the IFN is not even generated.
I did not see a good reason for DECL_CONTEXT = NULL; thus, I now set it to
the same value as for child_fn, admittedly without a strong reason either.
Tested on x86-64 with ENABLE_OFFLOADING albeit without true offloading.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: München, HRB 106955
Comments
Ping this patch – and also "Re: [Patch][v5] libgomp/nvptx: Prepare for
reverse-offload callback handling".
For the latter cf. Alexander's code approval
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603908.html – and
his concerns regarding the generic feature in
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601959.html (I
think 'target nowait' permits what he thinks is the better way for GPUs.)
Tobias
On Tue, Oct 18, 2022 at 09:27:04PM +0200, Tobias Burnus wrote:
> The cgraph_node::create_clone issue is exposed with -O2 for the existing
> libgomp.fortran/reverse-offload-1.f90.
>
> omp-offload.cc
>
> PR middle-end/107236
>
> gcc/ChangeLog:
> * omp-expand.cc (expand_omp_target): Set calls_declare_variant_alt
> in DECL_CONTEXT and not to cfun->decl.
> * cgraphclones.cc (cgraph_node::create_clone): Copy also the
> node's calls_declare_variant_alt value.
>
> gcc/testsuite/ChangeLog:
> * gfortran.dg/gomp/target-device-ancestor-6.f90: New test.
LGTM, thanks.
Jakub
OpenMP: Fix reverse offload GOMP_TARGET_REV IFN corner cases [PR107236]
For 'target parallel' and similarly nested directives, cgraph_node's
calls_declare_variant_alt was not set in the parent region node but in
cfun->decl. Hence, pass_omp_device_lower did not process the internal
function GOMP_TARGET_REV. The solution is to set it in the node of
child_fn's DECL_CONTEXT, which is set in adjust_context_and_scope.
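
The flag-placement problem can be illustrated with a small toy model. This is only a sketch and not GCC code: the struct Fn, the helpers lowering_pass_runs and would_ice, and the function nested_case_ices are all hypothetical names. It also assumes that, while the inner region is expanded, cfun->decl still refers to the original host function and that the internal function ends up in the body outlined for the enclosing region.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a function in the call graph. This is NOT
// GCC's cgraph_node; it only carries what the toy model needs.
struct Fn {
  std::string name;
  Fn *context = nullptr;                   // plays the role of DECL_CONTEXT
  bool calls_declare_variant_alt = false;  // gate checked by the lowering pass
  bool contains_target_rev = false;        // body still holds the internal fn
};

// Toy gate: the lowering pass only visits flagged functions.
bool lowering_pass_runs(const Fn &f) { return f.calls_declare_variant_alt; }

// In this model an "ICE" means the internal function survives in a
// function that the lowering pass skipped.
bool would_ice(const Fn &f) {
  return f.contains_target_rev && !lowering_pass_runs(f);
}

// Models 'target parallel' enclosing 'target device(ancestor:1)'.
// flag_on_context == false: flag the current function (old behaviour).
// flag_on_context == true:  flag the context of the inner outlined function.
bool nested_case_ices(bool flag_on_context) {
  Fn host{"host_fn"};
  Fn outer{"outer_child_fn", &host};   // outlined body of the enclosing region
  Fn inner{"inner_child_fn", &outer};  // outlined body of the ancestor target
  Fn *current = &host;                 // assumed value of cfun->decl

  // Assumption: the reverse-offload call lands in the enclosing region's body.
  outer.contains_target_rev = true;

  Fn *flagged = flag_on_context ? inner.context : current;
  flagged->calls_declare_variant_alt = true;

  return would_ice(host) || would_ice(outer) || would_ice(inner);
}
```

In this model, nested_case_ices(false) reports the failure because only host_fn is flagged, while nested_case_ices(true) flags outer_child_fn, the function that actually holds the internal function.
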
The cgraph_node::create_clone issue is exposed with -O2 for the existing
libgomp.fortran/reverse-offload-1.f90.
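
Why a clone needs the flag can be sketched with another toy model. Again this is not GCC code: Node, clone_dropping_flag, clone_keeping_flag and lowering_pass_runs are hypothetical names, and the sketch assumes that the lowering pass is gated solely on the flag of the node it looks at.

```cpp
#include <cassert>

// Hypothetical stand-in for a call-graph node; not GCC's cgraph_node.
struct Node {
  long count = 0;                          // stands in for the profile count
  bool calls_declare_variant_alt = false;  // gate checked by the lowering pass
};

// Clone that copies the count but forgets the flag (the problem case).
Node clone_dropping_flag(const Node &orig) {
  Node n;
  n.count = orig.count;
  return n;
}

// Clone that also copies the flag, mirroring the one-line fix in spirit.
Node clone_keeping_flag(const Node &orig) {
  Node n = clone_dropping_flag(orig);
  n.calls_declare_variant_alt = orig.calls_declare_variant_alt;
  return n;
}

// Toy gate: the lowering pass only visits flagged nodes.
bool lowering_pass_runs(const Node &n) { return n.calls_declare_variant_alt; }
```

With a flagged original, the clone from clone_dropping_flag is skipped by the toy gate, whereas the clone from clone_keeping_flag is still processed.
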
omp-offload.cc
PR middle-end/107236
gcc/ChangeLog:
* omp-expand.cc (expand_omp_target): Set calls_declare_variant_alt
in DECL_CONTEXT and not to cfun->decl.
* cgraphclones.cc (cgraph_node::create_clone): Copy also the
node's calls_declare_variant_alt value.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/target-device-ancestor-6.f90: New test.
gcc/cgraphclones.cc | 1 +
gcc/omp-expand.cc | 13 ++++++-------
.../gfortran.dg/gomp/target-device-ancestor-6.f90 | 17 +++++++++++++++++
3 files changed, 24 insertions(+), 7 deletions(-)
@@ -375,6 +375,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
if (!new_inlined_to)
prof_count = count.combine_with_ipa_count (prof_count);
new_node->count = prof_count;
+ new_node->calls_declare_variant_alt = this->calls_declare_variant_alt;
/* Update IPA profile. Local profiles need no updating in original. */
if (update_original)
@@ -10054,13 +10054,8 @@ expand_omp_target (struct omp_region *region)
/* Handle the case that an inner ancestor:1 target is called by an outer
target region. */
- if (!is_ancestor)
- cgraph_node::get (child_fn)->calls_declare_variant_alt
- |= cgraph_node::get (cfun->decl)->calls_declare_variant_alt;
- else /* Duplicate function to create empty nonhost variant. */
+ if (is_ancestor)
{
- /* Enable pass_omp_device_lower pass. */
- cgraph_node::get (cfun->decl)->calls_declare_variant_alt = 1;
cgraph_node *fn2_node;
child_fn2 = build_decl (DECL_SOURCE_LOCATION (child_fn),
FUNCTION_DECL,
@@ -10074,7 +10069,7 @@ expand_omp_target (struct omp_region *region)
TREE_PUBLIC (child_fn2) = 0;
DECL_UNINLINABLE (child_fn2) = 1;
DECL_EXTERNAL (child_fn2) = 0;
- DECL_CONTEXT (child_fn2) = NULL_TREE;
+ DECL_CONTEXT (child_fn2) = DECL_CONTEXT (child_fn);
DECL_INITIAL (child_fn2) = make_node (BLOCK);
BLOCK_SUPERCONTEXT (DECL_INITIAL (child_fn2)) = child_fn2;
DECL_ATTRIBUTES (child_fn)
@@ -10098,6 +10093,10 @@ expand_omp_target (struct omp_region *region)
fn2_node->force_output = 1;
node->offloadable = 0;
+ /* Enable pass_omp_device_lower pass. */
+ fn2_node = cgraph_node::get (DECL_CONTEXT (child_fn));
+ fn2_node->calls_declare_variant_alt = 1;
+
t = build_decl (DECL_SOURCE_LOCATION (child_fn),
RESULT_DECL, NULL_TREE, void_type_node);
DECL_ARTIFICIAL (t) = 1;
new file mode 100644
@@ -0,0 +1,17 @@
+! PR middle-end/107236
+
+! Did ICE before because IFN .GOMP_TARGET_REV was not
+! processed in omp-offload.cc.
+! Note: Test required ENABLE_OFFLOADING being true inside GCC.
+
+implicit none
+!$omp requires reverse_offload
+!$omp target parallel num_threads(4)
+ !$omp target device(ancestor:1)
+ call foo()
+ !$omp end target
+!$omp end target parallel
+contains
+ subroutine foo
+ end
+end