invoke.texi: Add note that -foffload= does not affect device detection
Commit Message
Not very often, but I do keep running into issues (fails, segfaults)
related to testing programs compiled with a GCC without offload
configured and then using the system libraries. - That's equivalent
to having the system compiler (or any offload compiler) and
compiling with -foffload=disable.
The problem is that while the program only contains host code,
the run-time library still initializes devices when an API
routine - such as omp_get_num_devices - is invoked. This can
lead to odd bugs as target regions, obviously, will use host
fallback (for any device number) but the API routines will
happily operate on the actual devices, which can lead to odd
errors.
(The same issue arises when compiling for one offload target type
and running on a system which has devices of another type.)
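
The mismatch described above can be made visible with a small probe. This is only a sketch: the helper names offload_mismatch and probe_offload_mismatch are hypothetical, and the stand-ins in the #else branch are an assumption added so that the file also builds without -fopenmp. It compares what the API routine reports with where a 'target' region on the default device actually ran.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Stand-ins so the sketch also builds without -fopenmp (assumption).  */
static int omp_get_num_devices (void) { return 0; }
static int omp_is_initial_device (void) { return 1; }
#endif

/* Hypothetical helper: nonzero if the run-time library reports devices
   although the target region ran on the host.  */
static int
offload_mismatch (int num_devices, int ran_on_host)
{
  return num_devices > 0 && ran_on_host;
}

/* Hypothetical helper: run one target region on the default device and
   compare the result with the device count reported by the API.  */
static int
probe_offload_mismatch (void)
{
  int ran_on_host = 1;

  #pragma omp target map(tofrom: ran_on_host)
  ran_on_host = omp_is_initial_device ();

  int num_devices = omp_get_num_devices ();
  printf ("num devs = %d, target region ran on the %s\n",
          num_devices, ran_on_host ? "host" : "device");
  return offload_mismatch (num_devices, ran_on_host);
}
```

Calling probe_offload_mismatch () from main in a program built with 'gcc -fopenmp -foffload=disable' on a machine with a supported device is expected to return nonzero; in the two consistent setups (no device, or device plus matching offload code) it should return zero.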
I assume that that's not a very common problem, but it can be
rather confusing when hitting this issue.
Maybe the proposed wording will help others to avoid this pitfall.
(Or is this superfluous as -foffload= is not much used and, even if,
no one then remembers or finds this note?)
Thoughts?
* * *
It was not clear to me how to refer to libgomp.texi
- Should it be 'libgomp', as 'info libgomp', the URL
https://gcc.gnu.org/onlinedocs/libgomp/ and the filename of the PDF imply?
- Or 'GNU Offloading and Multi Processing Runtime Library Manual',
the name used for the link at https://gcc.gnu.org/onlinedocs and on the title
page of the PDF - although that name is not repeated in the info file or the
HTML file?
- Or even 'GNU libgomp' to mirror a substring in the <title> of the HTML file.
I have now ended up referring to that document only implicitly.
Aside: Shouldn't all the HTML documents start with a <h1> and <title> before
the table of contents? Currently, it has:
<title>Top (GNU libgomp)</title>
and the body starts with
<h2>Short Table of Contents</h2>
Tobias
PS: In the testsuite, it mostly happens when iterating over
omp_get_num_devices() or when mixing calls to API routines with
device code ('omp target', compute constructs).
Comments
On 3/1/24 08:23, Tobias Burnus wrote:
> Not very often, but I do keep running into issues (fails, segfaults)
> related to testing programs compiled with a GCC without offload
> configured and then using the system libraries. - That's equivalent
> to having the system compiler (or any offload compiler) and
> compiling with -foffload=disable.
>
> The problem is that while the program only contains host code,
> the run-time library still initializes devices when an API
> routine - such as omp_get_num_devices - is invoked. This can
> lead to odd bugs as target regions, obviously, will use host
> fallback (for any device number) but the API routines will
> happily operate on the actual devices, which can lead to odd
> errors.
>
> (Likewise issue when compiling for one offload target type
> and running on a system which has devices of an other type.)
>
> I assume that that's not a very common problem, but it can be
> rather confusing when hitting this issue.
>
> Maybe the proposed wording will help others to avoid this pitfall.
> (Or is this superfluous as -foffload= is not much used and, even if,
> no one then remembers or finds this note?)
>
> Thoughts?
Well, I spent a long time looking at this, and my only conclusion is
that I don't really understand what the problem you're trying to solve
is. If it's problematical to have the runtime know about offload
devices the compiled code isn't using, don't users also need to know how
to restrict the runtime to a particular set of devices the same way
-foffload= lets you do, and not just how to disable offloading in the
runtime entirely?
It's pretty clearly documented already how -foffload affects the
compiler's behavior, and the library's behavior is already documented in
its own manual. Maybe what we don't have is a tutorial on how to
build/link/run programs using a specific offload device, or on the host?
Anyway, I don't really object to the text you want to add, but it makes
me more confused instead of less so. :-S
>
> * * *
>
> It was not clear to me how to refer to libgomp.texi
> - Should it be 'libgomp' as in 'info libgomp' or the URL
> https://gcc.gnu.org/onlinedocs/libgomp/ (or filename of the PDF)
> implies?
> - Or as 'GNU Offloading and Multi Processing Runtime Library Manual'
> as named linked to at https://gcc.gnu.org/onlinedocs or on the title
> page
> of the PDF - but that name is not repeated in the info file or
> the HTML
> file.
> - Or even 'GNU libgomp' to mirror a substring in the <title> of the HTML
> file.
> I have now ended up referring to that document only implicitly.
The Texinfo input file has "@settitle GNU libgomp".
> Aside: Shouldn't all the HTML documents start with a <h1> and <title>
> before
> the table of content? Currently, it has:
> <title>Top (GNU libgomp)</title>
> and the body starts with
> <h2>Short Table of Contents</h2>
I think this is a bug in the version of texinfo used to produce the HTML
content for the GCC web site. Looking at a recent build of my own using
Texinfo 6.7, I do see
<body lang="en">
<h1 class="settitle" align="center">GNU libgomp</h1>
The manual on the web site says it was produced by "GNU Texinfo 7.0dev".
-Sandra
On 3/1/24 17:29, Sandra Loosemore wrote:
> On 3/1/24 08:23, Tobias Burnus wrote:
>> Aside: Shouldn't all the HTML documents start with a <h1> and <title>
>> before
>> the table of content? Currently, it has:
>> <title>Top (GNU libgomp)</title>
>> and the body starts with
>> <h2>Short Table of Contents</h2>
>
> I think this is a bug in the version of texinfo used to produce the HTML
> content for the GCC web site. Looking at a recent build of my own using
> Texinfo 6.7, I do see
>
> <body lang="en">
> <h1 class="settitle" align="center">GNU libgomp</h1>
>
> The manual on the web site says it was produced by "GNU Texinfo 7.0dev".
I poked at this a little and apparently you need to fiddle with the
SHOW_TITLE or NO_TOP_NODE_OUTPUT customization variables in recent
versions of Texinfo in order to get the document title to show up in
HTML output.
https://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#index-SHOW_005fTITLE
Probably this has to be controlled by a configure check since older
Texinfo versions may barf on unknown options.
I'm not at a good point to fiddle with this myself right now (I'm deep
inside more metadirective/declare variant hacking), also I have no idea
how to re-do the HTML manuals linked from the GCC web site to tweak the
formatting in this way. I'd think that if we were going to do that,
we'd also want to use an official release version of Texinfo instead of
a "dev" snapshot.
-Sandra
Hi,
Sandra Loosemore wrote:
> On 3/1/24 17:29, Sandra Loosemore wrote:
>> On 3/1/24 08:23, Tobias Burnus wrote:
>>> Aside: Shouldn't all the HTML documents start with a <h1> and
>>> <title> before
>>> the table of content? Currently, it has:
>>> <title>Top (GNU libgomp)</title>
>>> and the body starts with
>>> <h2>Short Table of Contents</h2>
I note that the 'Top(...)' in <title> already appears in the GCC 8.5
docs (created with Texinfo 6.5; while GCC 7.5, created with texinfo 6.3,
is okay). And the <h1> disappears in the GCC 10.5 doc, created with
Texinfo 7.0dev.
I have no idea why the 'Top(...)' appears with Texinfo 6.5, but the
missing <h1> is because of Texinfo 7.0, cf.
https://git.savannah.gnu.org/cgit/texinfo.git/plain/NEWS
I think it would be useful to remove the 'Top()' in <title> and add the
<h1> in general.
For the GCC website, we might want to set TOP_NODE_UP_URL.
>> I think this is a bug in the version of texinfo used to produce the
>> HTML content for the GCC web site. Looking at a recent build of my
>> own using Texinfo 6.7, I do see
>>
>> <body lang="en">
>> <h1 class="settitle" align="center">GNU libgomp</h1>
>>
>> The manual on the web site says it was produced by "GNU Texinfo 7.0dev".
>
> I poked at this a little and apparently you need to fiddle with the
> SHOW_TITLE or NO_TOP_NODE_OUTPUT customization variables in recent
> versions of Texinfo in order to get the document title to show up in
> HTML output.
>
> https://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#index-SHOW_005fTITLE
>
>
> Probably this has to be controlled by a configure check since older
> Texinfo versions may barf on unknown options.
...
> I'd think that if we were going to do that, we'd also want to use an
> official release version of Texinfo instead of a "dev" snapshot.
(I concur that we should update 7.0dev to 7.0.3 or 7.1 on the server to
have a defined version.)
Thanks,
Tobias
Hi Sandra,
Sandra Loosemore wrote:
> On 3/1/24 08:23, Tobias Burnus wrote:
>> Maybe the proposed wording will help others to avoid this pitfall.
>> (Or is this superfluous as -foffload= is not much used and, even if,
>> no one then remembers or finds this note?)
>
> Well, I spent a long time looking at this, and my only conclusion is
> that I don't really understand what the problem you're trying to solve
> is. If it's problematical to have the runtime know about offload
> devices the compiled code isn't using, don't users also need to know
> how to restrict the runtime to a particular set of devices the same
> way -foffload= lets you do, and not just how to disable offloading in
> the runtime entirely?
> It's pretty clearly documented already how -foffload affects the
> compiler's behavior, and the library's behavior is already documented
> in its own manual. Maybe what we don't have is a tutorial on how to
> build/link/run programs using a specific offload device, or on the host?
The problem is for code like the following, which is perfectly valid
and works
(A) If you don't have any offload device
(independent of the compiler options)
(B) If you have an offload device (supported by your libgomp)
and compiled with offloading support (for that device)
But (C) if you have an offload device and compile as:
  gcc -fopenmp -foffload=disable
it will fail at runtime with:
  dev = 0 / num devs = 1
  Segmentation fault (core dumped)
The problem is that there is a mismatch between the code (assumes no
offload code + always host fallback) and the run-time library (which
detects offload devices), such that the API routines use a different
device than the 'target' code:
--------------------
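#include <omp.h>
#include <stdio.h>
#define N 2064
int
main ()
{
  /* In case (C), this allocates on the device that the run-time
     library detected.  */
  int *x = (int*) omp_target_alloc (sizeof(int)*N,
                                    omp_get_default_device ());
  printf ("dev = %d / num devs = %d\n",
          omp_get_default_device (), omp_get_num_devices ());
  /* In case (C), this region runs as host fallback and therefore
     dereferences the device pointer on the host.  */
  #pragma omp target is_device_ptr(x)
  for (int i = 0; i < N; ++i)
    x[i] = i;
}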
-------------------
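
One way to make such code robust against the mismatch is to let the API routines follow the device that the 'target' region really uses. The following is a minimal sketch under stated assumptions, not a recommended fix: pick_api_device and fill_on_matching_device are hypothetical helper names, the #else branch only provides stand-ins so that the file builds without -fopenmp, and the probe assumes that a target region on the default device silently falls back to the host when no offload code exists.

```c
#include <stdlib.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Stand-ins so the sketch also builds without -fopenmp (assumption).  */
static int omp_get_default_device (void) { return 0; }
static int omp_get_initial_device (void) { return 0; }
static int omp_is_initial_device (void) { return 1; }
static void *omp_target_alloc (size_t size, int dev)
{ (void) dev; return malloc (size); }
static void omp_target_free (void *ptr, int dev)
{ (void) dev; free (ptr); }
#endif

#define N 2064

/* Hypothetical helper: device number to hand to the API routines.  */
static int
pick_api_device (int default_device, int initial_device, int target_on_host)
{
  return target_on_host ? initial_device : default_device;
}

/* Hypothetical helper: returns 0 on success, -1 if allocation failed.  */
static int
fill_on_matching_device (void)
{
  int on_host = 1;

  /* Probe where target regions on the default device actually run.  */
  #pragma omp target map(tofrom: on_host)
  on_host = omp_is_initial_device ();

  int dev = pick_api_device (omp_get_default_device (),
                             omp_get_initial_device (), on_host);

  /* Allocate on the same device that the region below will use.  */
  int *x = (int *) omp_target_alloc (sizeof (int) * N, dev);
  if (x == NULL)
    return -1;

  #pragma omp target device(dev) is_device_ptr(x)
  for (int i = 0; i < N; ++i)
    x[i] = i;

  omp_target_free (x, dev);
  return 0;
}
```

With this arrangement, case (C) allocates through the initial (host) device, so the pointer stays valid in the host-fallback region. The cost is an extra probe region and explicit device clauses, which real code may not want.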
On the technical side, it is not really surprising, but it
might still be confusing for the user. Obviously, it can
also occur if you compile, e.g., for AMD GCN and only an
Nvidia device is available - but there the solution would be
the same (disable all devices).
(OpenMP 6.0 will provide an environment variable that allows
fine-tuning of the available devices.)
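
For the "disable all devices" remedy, the relevant setting is the OMP_TARGET_OFFLOAD environment variable named in the patch. The sketch below only reports what the program sees; is_disabled_value and report_offload_env are hypothetical helper names, the #else branch is a stand-in for builds without -fopenmp, and it is an assumption here that the variable has to be set in the environment before the program starts rather than from within it.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Stand-in so the sketch also builds without -fopenmp (assumption).  */
static int omp_get_num_devices (void) { return 0; }
#endif

/* Hypothetical helper: nonzero if VALUE spells "disabled" in any case.  */
static int
is_disabled_value (const char *value)
{
  static const char expected[] = "disabled";
  size_t i = 0;

  if (value == NULL)
    return 0;
  for (; expected[i] != '\0'; ++i)
    if (tolower ((unsigned char) value[i]) != expected[i])
      return 0;
  return value[i] == '\0';
}

/* Hypothetical helper: print the setting and the device count that the
   API routines report.  */
static void
report_offload_env (void)
{
  const char *value = getenv ("OMP_TARGET_OFFLOAD");
  int num_devices = omp_get_num_devices ();

  printf ("OMP_TARGET_OFFLOAD = %s, num devs = %d\n",
          value ? value : "(unset)", num_devices);
  if (!is_disabled_value (value) && num_devices > 0)
    puts ("note: API routines can still reach real devices");
}
```

Running a program that calls report_offload_env () as 'OMP_TARGET_OFFLOAD=disabled ./a.out' should, per the proposed documentation text, show no devices, so that the API routines and the host-fallback 'target' regions agree again.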
Questions:
* Is such a usage common enough to matter?
I guess for some benchmark use it makes sense: to test whether
real offloading or host fallback is faster; and if the latter
is true, it might also get used in operational code.
* Are API routines used in such a code in a way that it breaks?
(Unfortunately not very unlikely in larger code.)
If there is enough real-world usage (= 2x yes to the questions above):
* How to word it to help users and not to confuse them?
Tobias
invoke.texi: Add note that -foffload= does not affect device detection
gcc/ChangeLog:
* doc/invoke.texi (-foffload): Add note that the flag does not
affect whether offload devices are detected.
gcc/doc/invoke.texi | 7 +++++++
1 file changed, 7 insertions(+)
@@ -2736,38 +2736,45 @@ targets using ms-abi.
@opindex foffload
@cindex Offloading targets
@cindex OpenACC offloading targets
@cindex OpenMP offloading targets
@item -foffload=disable
@itemx -foffload=default
@itemx -foffload=@var{target-list}
Specify for which OpenMP and OpenACC offload targets code should be generated.
The default behavior, equivalent to @option{-foffload=default}, is to generate
code for all supported offload targets. The @option{-foffload=disable} form
generates code only for the host fallback, while
@option{-foffload=@var{target-list}} generates code only for the specified
comma-separated list of offload targets.
Offload targets are specified in GCC's internal target-triplet format. You can
run the compiler with @option{-v} to show the list of configured offload targets
under @code{OFFLOAD_TARGET_NAMES}.
+Note that this option does not affect the available offload devices detected by
+the run-time library and, hence, the values returned by the OpenMP/OpenACC API
+routines or access to devices using those routines. The run-time library
+itself can be tuned using environment variables; in particular, to fully disable
+the device detection, set the @code{OMP_TARGET_OFFLOAD} environment variable to
+@code{disabled}.
+
@opindex foffload-options
@cindex Offloading options
@cindex OpenACC offloading options
@cindex OpenMP offloading options
@item -foffload-options=@var{options}
@itemx -foffload-options=@var{target-triplet-list}=@var{options}
With @option{-foffload-options=@var{options}}, GCC passes the specified
@var{options} to the compilers for all enabled offloading targets. You can
specify options that apply only to a specific target or targets by using
the @option{-foffload-options=@var{target-list}=@var{options}} form. The
@var{target-list} is a comma-separated list in the same format as for the
@option{-foffload=} option.
Typical command lines are
@smallexample
-foffload-options='-fno-math-errno -ffinite-math-only' -foffload-options=nvptx-none=-latomic
-foffload-options=amdgcn-amdhsa=-march=gfx906