libgomp/nvptx: Prepare for reverse-offload callback handling, resolve spurious SIGSEGVs (was: [Patch][v5] libgomp/nvptx: Prepare for reverse-offload callback handling)

Message ID 87mt9lvw3y.fsf@euler.schwinge.homeip.net
State Repeat Merge

Checks

snail/gcc-patch-check: warning (Git am fail log)

Commit Message

Thomas Schwinge Oct. 24, 2022, 7:51 p.m. UTC
  Hi!

On 2022-10-24T21:11:04+0200, I wrote:
> On 2022-10-24T21:05:46+0200, I wrote:
>> On 2022-10-24T16:07:25+0200, Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>> On Wed, Oct 12, 2022 at 10:55:26AM +0200, Tobias Burnus wrote:
>>>> libgomp/nvptx: Prepare for reverse-offload callback handling
>>
>>> Ok, thanks.
>>
>> Per commit r13-3460-g131d18e928a3ea1ab2d3bf61aa92d68a8a254609
>> "libgomp/nvptx: Prepare for reverse-offload callback handling",
>> I'm seeing a lot of libgomp execution test regressions.  Random
>> example, 'libgomp.c-c++-common/error-1.c':
>>
>>     [...]
>>       GOMP_OFFLOAD_run: kernel main$_omp_fn$0: launch [(teams: 1), 1, 1] [(lanes: 32), (threads: 8), 1]
>>
>>     Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
>>     0x00007ffff793b87d in GOMP_OFFLOAD_run (ord=<optimized out>, tgt_fn=<optimized out>, tgt_vars=<optimized out>, args=<optimized out>) at [...]/source-gcc/libgomp/plugin/plugin-nvptx.c:2127
>>     2127            if (__atomic_load_n (&ptx_dev->rev_data->fn, __ATOMIC_ACQUIRE) != 0)
>>     (gdb) print ptx_dev
>>     $1 = (struct ptx_device *) 0x6a55a0
>>     (gdb) print ptx_dev->rev_data
>>     $2 = (struct rev_offload *) 0xffffffff00000000
>>     (gdb) print ptx_dev->rev_data->fn
>>     Cannot access memory at address 0xffffffff00000000
>>
>> Why is it even taking this 'if (reverse_offload)' code path, which isn't
>> applicable to this test case (as far as I understand)?  (Well, the answer
>> is 'bool reverse_offload = ptx_dev->rev_data != NULL;', but why is that?)
>
> Well.
>
>     --- a/libgomp/plugin/plugin-nvptx.c
>     +++ b/libgomp/plugin/plugin-nvptx.c
>
>     @@ -329,6 +332,7 @@ struct ptx_device
>            pthread_mutex_t lock;
>          } omp_stacks;
>
>     +  struct rev_offload *rev_data;
>        struct ptx_device *next;
>      };
>
> ... but as far as I can tell, this is never initialized in
> 'nvptx_open_device', which does 'ptx_dev = GOMP_PLUGIN_malloc ([...]);'.
> Would the following be the correct fix (currently testing)?
>
>     --- libgomp/plugin/plugin-nvptx.c
>     +++ libgomp/plugin/plugin-nvptx.c
>     @@ -546,6 +546,8 @@ nvptx_open_device (int n)
>        ptx_dev->omp_stacks.size = 0;
>        pthread_mutex_init (&ptx_dev->omp_stacks.lock, NULL);
>
>     +  ptx_dev->rev_data = NULL;
>     +
>        return ptx_dev;
>      }

That did clean up the libgomp execution test regressions; pushed to the
master branch as commit 205538832b7033699047900cf25928f5920d8b93
"libgomp/nvptx: Prepare for reverse-offload callback handling, resolve spurious SIGSEGVs",
see attached.
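
For the archives, a minimal standalone sketch of the failure mode, not the plugin code itself. All names here ('device_sketch', 'rev_offload_sketch', 'open_device_sketch', 'takes_reverse_offload_path') are hypothetical stand-ins, and I'm assuming that 'GOMP_PLUGIN_malloc' behaves like plain 'malloc' in not zeroing the returned memory. The 'memset' to 0xff only simulates stale heap contents, so that the effect is reproducible; on common platforms, an all-ones pointer compares unequal to 'NULL'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the plugin's types; not the real definitions.  */
struct rev_offload_sketch
{
  unsigned long fn;
};

struct device_sketch
{
  size_t stacks_size;
  struct rev_offload_sketch *rev_data;
};

/* Allocate a device record.  The memset simulates recycled heap memory,
   so that any field that is not explicitly set is visibly garbage.  */
static struct device_sketch *
open_device_sketch (bool init_rev_data)
{
  struct device_sketch *dev = malloc (sizeof *dev);
  if (dev == NULL)
    abort ();
  memset (dev, 0xff, sizeof *dev);

  dev->stacks_size = 0;
  if (init_rev_data)
    dev->rev_data = NULL;  /* The analogue of the two-line fix.  */

  return dev;
}

/* Same shape as 'bool reverse_offload = ptx_dev->rev_data != NULL;'.  */
static bool
takes_reverse_offload_path (const struct device_sketch *dev)
{
  assert (dev != NULL);
  return dev->rev_data != NULL;
}
```

With 'open_device_sketch (false)', the predicate comes out true even though nothing ever set up reverse offload, which is how the real code ended up dereferencing a garbage 'rev_data'; with 'open_device_sketch (true)', it is false as intended.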


Regards
 Thomas



Patch

From 205538832b7033699047900cf25928f5920d8b93 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Mon, 24 Oct 2022 21:11:47 +0200
Subject: [PATCH] libgomp/nvptx: Prepare for reverse-offload callback handling,
 resolve spurious SIGSEGVs

Per commit r13-3460-g131d18e928a3ea1ab2d3bf61aa92d68a8a254609
"libgomp/nvptx: Prepare for reverse-offload callback handling",
I'm seeing a lot of libgomp execution test regressions.  Random
example, 'libgomp.c-c++-common/error-1.c':

    [...]
      GOMP_OFFLOAD_run: kernel main$_omp_fn$0: launch [(teams: 1), 1, 1] [(lanes: 32), (threads: 8), 1]

    Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
    0x00007ffff793b87d in GOMP_OFFLOAD_run (ord=<optimized out>, tgt_fn=<optimized out>, tgt_vars=<optimized out>, args=<optimized out>) at [...]/source-gcc/libgomp/plugin/plugin-nvptx.c:2127
    2127            if (__atomic_load_n (&ptx_dev->rev_data->fn, __ATOMIC_ACQUIRE) != 0)
    (gdb) print ptx_dev
    $1 = (struct ptx_device *) 0x6a55a0
    (gdb) print ptx_dev->rev_data
    $2 = (struct rev_offload *) 0xffffffff00000000
    (gdb) print ptx_dev->rev_data->fn
    Cannot access memory at address 0xffffffff00000000

	libgomp/
	* plugin/plugin-nvptx.c (nvptx_open_device): Initialize
	'ptx_dev->rev_data'.
---
 libgomp/plugin/plugin-nvptx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index ad057edabec..0768fca350b 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -546,6 +546,8 @@  nvptx_open_device (int n)
   ptx_dev->omp_stacks.size = 0;
   pthread_mutex_init (&ptx_dev->omp_stacks.lock, NULL);
 
+  ptx_dev->rev_data = NULL;
+
   return ptx_dev;
 }
 
-- 
2.35.1