On 10/15/23 04:39, Tobias Burnus wrote:
> @@ -905,8 +905,8 @@ For @code{omp_sched_auto} the @var{chunk_size} argument is ignored.
> @subsection @code{omp_get_schedule} -- Obtain the runtime scheduling method
> @table @asis
> @item @emph{Description}:
> -Obtain the runtime scheduling method. The @var{kind} argument will be
> -set to the value @code{omp_sched_static}, @code{omp_sched_dynamic},
> +Obtain the runtime scheduling method. The @var{kind} argument is set to
> +to @code{omp_sched_static}, @code{omp_sched_dynamic},
You've introduced an extra "to" here.
> @@ -3029,7 +3029,7 @@ OMP_ALLOCATOR=omp_low_lat_mem_space:pinned=true,partition=nearest
> Sets the format string used when displaying OpenMP thread affinity information.
> Special values are output using @code{%} followed by an optional size
> specification and then either the single-character field type or its long
> -name enclosed in curly braces; using @code{%%} will display a literal percent.
> +name enclosed in curly braces; using @code{%%} displays a literal percent.
> The size specification consists of an optional @code{0.} or @code{.} followed
> by a positive integer, specifying the minimal width of the output. With
> @code{0.} and numerical values, the output is padded with zeros on the left;
I think all the @code markups here ought to be @samp, but let's not do that in
this patch.
> -If set to @code{DISABLED}, then offloading is disabled and all code will run on
> -the host. If set to @code{DEFAULT}, the program will try offloading to the
> +If set to @code{DISABLED}, then offloading is disabled and all code runs on
> +the host. If set to @code{DEFAULT}, the program tries offloading to the
> device first, then fall back to running code on the host if it cannot.
Missed one here; s/[will] fall back/falls back/.
> @@ -3559,7 +3559,7 @@ Binds threads to specific CPUs. The variable should contain a space-separated
> or comma-separated list of CPUs. This list may contain different kinds of
> entries: either single CPU numbers in any order, a range of CPUs (M-N)
> or a range with some stride (M-N:S). CPU numbers are zero based. For example,
> -@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} will bind the initial thread
> +@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} binds the initial thread
> to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
> CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
> and 14 respectively and then start assigning back from the beginning of
Similarly, s/[will] start/starts/.
> @item @emph{C/C++}:
> @@ -3983,8 +3983,8 @@ but might be removed in a future version of GCC.
> @table @asis
> @item @emph{Description}
> This function tests for completion of the asynchronous operation specified
> -in @var{arg}. In C/C++, a non-zero value will be returned to indicate
> -the specified asynchronous operation has completed. While Fortran will return
> +in @var{arg}. In C/C++, a non-zero value is returned to indicate
> +the specified asynchronous operation has completed. While Fortran returns
> a @code{true}. If the asynchronous operation has not completed, C/C++ returns
> a zero and Fortran returns a @code{false}.
Hmmmm. How about s/, While/while/ here. And it sounds odd to me to say "a
true" or a "a zero"; I'd suggest deleting the indefinite article from all those
uses.
> @@ -4012,8 +4012,8 @@ a zero and Fortran returns a @code{false}.
> @table @asis
> @item @emph{Description}
> This function tests for completion of all asynchronous operations.
> -In C/C++, a non-zero value will be returned to indicate all asynchronous
> -operations have completed. While Fortran will return a @code{true}. If
> +In C/C++, a non-zero value is returned to indicate all asynchronous
> +operations have completed. While Fortran returns a @code{true}. If
> any asynchronous operation has not completed, C/C++ returns a zero and
> Fortran returns a @code{false}.
Ditto here.
>
> @@ -4196,9 +4196,9 @@ This function shuts down the runtime for the device type specified in
> This function returns whether the program is executing on a particular
> device specified in @var{devicetype}. In C/C++ a non-zero value is
> returned to indicate the device is executing on the specified device type.
> -In Fortran, @code{true} will be returned. If the program is not executing
> -on the specified device type C/C++ will return a zero, while Fortran will
> -return @code{false}.
> +In Fortran, @code{true} is returned. If the program is not executing
> +on the specified device type C/C++ will return a zero, while Fortran
You missed a "will return" here, and same issues with "a zero".
> @@ -5178,7 +5178,7 @@ subsequent to the calls to @code{acc_copyin()}.
> As seen in the previous use case, a call to @code{cublasCreate()}
> initializes the CUBLAS library and allocates the hardware resources on the
> host and the device. However, since the device has already been allocated,
> -@code{cublasCreate()} will only initialize the CUBLAS library and allocate
> +@code{cublasCreate()} only initializes the CUBLAS library and allocate
s/[will] allocate/allocates/
> @@ -5267,7 +5267,7 @@ possible for the (very common) case that the Profiling Interface is
> not enabled. This is relevant, as the Profiling Interface affects all
> the @emph{hot} code paths (in the target code, not in the offloaded
> code). Users of the OpenACC Profiling Interface can be expected to
> -understand that performance will be impacted to some degree once the
> +understand that performance is impacted to some degree once the
> Profiling Interface has gotten enabled: for example, because of the
While you're at it, please s/has gotten/is/ to put that in the present tense too.
OK with those changes.
-Sandra
libgomp.texi: Use present not future tense
libgomp/ChangeLog:
* libgomp.texi: Replace most future tense by present tense.
@@ -794,9 +794,9 @@ are allowed to create new teams. The function takes the language-specific
equivalent of @code{true} and @code{false}, where @code{true} enables
dynamic adjustment of team sizes and @code{false} disables it.
-Enabling nested parallel regions will also set the maximum number of
+Enabling nested parallel regions also sets the maximum number of
active nested regions to the maximum supported. Disabling nested parallel
-regions will set the maximum number of active nested regions to one.
+regions sets the maximum number of active nested regions to one.
Note that the @code{omp_set_nested} API routine was deprecated
in the OpenMP specification 5.2 in favor of @code{omp_set_max_active_levels}.
@@ -905,8 +905,8 @@ For @code{omp_sched_auto} the @var{chunk_size} argument is ignored.
@subsection @code{omp_get_schedule} -- Obtain the runtime scheduling method
@table @asis
@item @emph{Description}:
-Obtain the runtime scheduling method. The @var{kind} argument will be
-set to the value @code{omp_sched_static}, @code{omp_sched_dynamic},
+Obtain the runtime scheduling method. The @var{kind} argument is set to
+@code{omp_sched_static}, @code{omp_sched_dynamic},
@code{omp_sched_guided} or @code{omp_sched_auto}. The second argument,
@var{chunk_size}, is set to the chunk size.
@@ -934,7 +934,7 @@ set to the value @code{omp_sched_static}, @code{omp_sched_dynamic},
@subsection @code{omp_get_teams_thread_limit} -- Maximum number of threads imposed by teams
@table @asis
@item @emph{Description}:
-Return the maximum number of threads that will be able to participate in
+Return the maximum number of threads that are able to participate in
each team created by a teams construct.
@item @emph{C/C++}:
@@ -1316,7 +1316,7 @@ that does not use the clause @code{num_teams}.
@subsection @code{omp_set_teams_thread_limit} -- Set upper thread limit for teams construct
@table @asis
@item @emph{Description}:
-Specifies the upper bound for number of threads that will be available
+Specifies the upper bound for number of threads that are available
for each team created by the teams construct which does not specify a
@code{thread_limit} clause. The argument of
@code{omp_set_teams_thread_limit} shall be a positive integer.
@@ -2456,7 +2456,7 @@ may be used as trait value to specify that the default value should be used.
Releases all resources used by a memory allocator, which must not represent
a predefined memory allocator. Accessing memory after its allocator has been
destroyed has unspecified behavior. Passing @code{omp_null_allocator} to the
-routine is permitted but will have no effect.
+routine is permitted but has no effect.
@item @emph{C/C++}:
@@ -3029,7 +3029,7 @@ OMP_ALLOCATOR=omp_low_lat_mem_space:pinned=true,partition=nearest
Sets the format string used when displaying OpenMP thread affinity information.
Special values are output using @code{%} followed by an optional size
specification and then either the single-character field type or its long
-name enclosed in curly braces; using @code{%%} will display a literal percent.
+name enclosed in curly braces; using @code{%%} displays a literal percent.
The size specification consists of an optional @code{0.} or @code{.} followed
by a positive integer, specifying the minimal width of the output. With
@code{0.} and numerical values, the output is padded with zeros on the left;
@@ -3110,7 +3110,7 @@ if unset, cancellation is disabled and the @code{cancel} construct is ignored.
@item @emph{Scope:} global
@item @emph{Description}:
If set to @code{FALSE} or if unset, affinity displaying is disabled.
-If set to @code{TRUE}, the runtime will display affinity information about
+If set to @code{TRUE}, the runtime displays affinity information about
OpenMP threads in a parallel region upon entering the region and every time
any change occurs.
@@ -3135,7 +3135,7 @@ If set to @code{TRUE}, the OpenMP version number and the values
associated with the OpenMP environment variables are printed to @code{stderr}.
If set to @code{VERBOSE}, it additionally shows the value of the environment
variables which are GNU extensions. If undefined or set to @code{FALSE},
-this information will not be shown.
+this information is not shown.
@item @emph{Reference}:
@@ -3157,7 +3157,7 @@ clause. The value shall be the nonnegative device number. If no device with
the given device number exists, the code is executed on the host. If unset,
@env{OMP_TARGET_OFFLOAD} is @code{mandatory} and no non-host devices are
available, it is set to @code{omp_invalid_device}. Otherwise, if unset,
-device number 0 will be used.
+device number 0 is used.
@item @emph{See also}:
@@ -3203,8 +3203,8 @@ regions. The value of this variable shall be a positive integer.
If undefined, then if @env{OMP_NESTED} is defined and set to true, or
if @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} are defined and set to
a list with more than one item, the maximum number of nested parallel
-regions will be initialized to the largest number supported, otherwise
-it will be set to one.
+regions is initialized to the largest number supported, otherwise
+it is set to one.
@item @emph{See also}:
@ref{omp_set_max_active_levels}, @ref{OMP_NESTED}, @ref{OMP_PROC_BIND},
@@ -3250,9 +3250,9 @@ integer, and zero is allowed. If undefined, the default priority is
Enable or disable nested parallel regions, i.e., whether team members
are allowed to create new teams. The value of this environment variable
shall be @code{TRUE} or @code{FALSE}. If set to @code{TRUE}, the number
-of maximum active nested regions supported will by default be set to the
-maximum supported, otherwise it will be set to one. If
-@env{OMP_MAX_ACTIVE_LEVELS} is defined, its setting will override this
+of maximum active nested regions supported is by default set to the
+maximum supported, otherwise it is set to one. If
+@env{OMP_MAX_ACTIVE_LEVELS} is defined, its setting overrides this
setting. If both are undefined, nested parallel regions are enabled if
@env{OMP_NUM_THREADS} or @env{OMP_PROC_BINDS} are defined to a list with
more than one item, otherwise they are disabled by default.
@@ -3302,7 +3302,7 @@ implementation defined upper bound.
Specifies the default number of threads to use in parallel regions. The
value of this variable shall be a comma-separated list of positive integers;
the value specifies the number of threads to use for the corresponding nested
-level. Specifying more than one item in the list will automatically enable
+level. Specifying more than one item in the list automatically enables
nesting by default. If undefined one thread per CPU is used.
When a list with more than value is specified, it also affects the
@@ -3334,7 +3334,7 @@ same place partition as the primary thread. With @code{CLOSE} those are
kept close to the primary thread in contiguous place partitions. And
with @code{SPREAD} a sparse distribution
across the place partitions is used. Specifying more than one item in the
-list will automatically enable nesting by default.
+list automatically enables nesting by default.
When a list is specified, it also affects the @var{max-active-levels-var} ICV
as described in @ref{OMP_MAX_ACTIVE_LEVELS}.
@@ -3383,8 +3383,8 @@ specify an interval, a colon followed by the count is placed after
the hardware thread number or the place. Optionally, the length can be
followed by a colon and the stride number -- otherwise a unit stride is
assumed. Placing an exclamation mark (@code{!}) directly before a curly
-brace or numbers inside the curly braces (excluding intervals) will
-exclude those hardware threads.
+brace or numbers inside the curly braces (excluding intervals)
+excludes those hardware threads.
For instance, the following specifies the same places list:
@code{"@{0,1,2@}, @{3,4,6@}, @{7,8,9@}, @{10,11,12@}"};
@@ -3464,16 +3464,16 @@ Specifies the behavior with regard to offloading code to a device. This
variable can be set to one of three values - @code{MANDATORY}, @code{DISABLED}
or @code{DEFAULT}.
-If set to @code{MANDATORY}, the program will terminate with an error if
+If set to @code{MANDATORY}, the program terminates with an error if
any device construct or device memory routine uses a device that is unavailable
or not supported by the implementation, or uses a non-conforming device number.
-If set to @code{DISABLED}, then offloading is disabled and all code will run on
-the host. If set to @code{DEFAULT}, the program will try offloading to the
+If set to @code{DISABLED}, then offloading is disabled and all code runs on
+the host. If set to @code{DEFAULT}, the program tries offloading to the
-device first, then fall back to running code on the host if it cannot.
+device first, then falls back to running code on the host if it cannot.
-If undefined, then the program will behave as if @code{DEFAULT} was set.
+If undefined, then the program behaves as if @code{DEFAULT} was set.
-Note: Even with @code{MANDATORY}, there will be no run-time termination when
+Note: Even with @code{MANDATORY}, no run-time termination is performed when
the device number in a @code{device} clause or argument to a device memory
routine is for host, which includes using the device number in the
@var{default-device-var} ICV. However, the initial value of
@@ -3559,7 +3559,7 @@ Binds threads to specific CPUs. The variable should contain a space-separated
or comma-separated list of CPUs. This list may contain different kinds of
entries: either single CPU numbers in any order, a range of CPUs (M-N)
or a range with some stride (M-N:S). CPU numbers are zero based. For example,
-@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} will bind the initial thread
+@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} binds the initial thread
to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
-and 14 respectively and then start assigning back from the beginning of
+and 14 respectively and then starts assigning back from the beginning of
@@ -3575,7 +3575,7 @@ or disabled during the runtime of the application.
If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
@env{OMP_PROC_BIND} has a higher precedence. If neither has been set and
@env{OMP_PROC_BIND} is unset, or when @env{OMP_PROC_BIND} is set to
-@code{FALSE}, the host system will handle the assignment of threads to CPUs.
+@code{FALSE}, the host system handles the assignment of threads to CPUs.
@item @emph{See also}:
@ref{OMP_PLACES}, @ref{OMP_PROC_BIND}
@@ -3591,7 +3591,7 @@ If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
Enable debugging output. The variable should be set to @code{0}
(disabled, also the default if not set), or @code{1} (enabled).
-If enabled, some debugging output will be printed during execution.
+If enabled, some debugging output is printed during execution.
This is currently not specified in more detail, and subject to change.
@end table
@@ -3664,7 +3664,7 @@ separated by @code{:} where:
instance.
@item @code{$<priority>} is an optional priority for the worker threads of a
thread pool according to @code{pthread_setschedparam}. In case a priority
-value is omitted, then a worker thread will inherit the priority of the OpenMP
+value is omitted, then a worker thread inherits the priority of the OpenMP
primary thread that created it. The priority of the worker thread is not
changed after creation, even if a new OpenMP primary thread using the worker has
a different priority.
@@ -3672,7 +3672,7 @@ a different priority.
RTEMS application configuration.
@end itemize
In case no thread pool configuration is specified for a scheduler instance,
-then each OpenMP primary thread of this scheduler instance will use its own
+then each OpenMP primary thread of this scheduler instance uses its own
dynamically allocated thread pool. To limit the worker thread count of the
thread pools, each OpenMP primary thread must call @code{omp_set_num_threads}.
@item @emph{Example}:
@@ -3950,7 +3950,7 @@ modified the interface introduced in OpenACC 2.6. The kind-value parameter
@code{acc_device_property} has been renamed to @code{acc_device_property_kind}
for consistency and the return type of the @code{acc_get_property} function is
now a @code{c_size_t} integer instead of a @code{acc_device_property} integer.
-The parameter @code{acc_device_property} will continue to be provided,
+The parameter @code{acc_device_property} is still provided,
but might be removed in a future version of GCC.
@item @emph{C/C++}:
@@ -3983,8 +3983,8 @@ but might be removed in a future version of GCC.
@table @asis
@item @emph{Description}
This function tests for completion of the asynchronous operation specified
-in @var{arg}. In C/C++, a non-zero value will be returned to indicate
-the specified asynchronous operation has completed. While Fortran will return
+in @var{arg}. In C/C++, a non-zero value is returned to indicate
+the specified asynchronous operation has completed, while Fortran returns
a @code{true}. If the asynchronous operation has not completed, C/C++ returns
a zero and Fortran returns a @code{false}.
@@ -4012,8 +4012,8 @@ a zero and Fortran returns a @code{false}.
@table @asis
@item @emph{Description}
This function tests for completion of all asynchronous operations.
-In C/C++, a non-zero value will be returned to indicate all asynchronous
-operations have completed. While Fortran will return a @code{true}. If
+In C/C++, a non-zero value is returned to indicate all asynchronous
+operations have completed, while Fortran returns a @code{true}. If
any asynchronous operation has not completed, C/C++ returns a zero and
Fortran returns a @code{false}.
@@ -4196,9 +4196,9 @@ This function shuts down the runtime for the device type specified in
This function returns whether the program is executing on a particular
device specified in @var{devicetype}. In C/C++ a non-zero value is
returned to indicate the device is executing on the specified device type.
-In Fortran, @code{true} will be returned. If the program is not executing
-on the specified device type C/C++ will return a zero, while Fortran will
-return @code{false}.
+In Fortran, @code{true} is returned. If the program is not executing
+on the specified device type, C/C++ returns zero, while Fortran
+returns @code{false}.
@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@@ -4303,8 +4303,8 @@ variable or array element and @var{len} specifies the length in bytes.
@table @asis
@item @emph{Description}
This function tests if the host data specified by @var{a} and of length
-@var{len} is present or not. If it is not present, then device memory
-will be allocated and the host memory copied. The device address of
+@var{len} is present or not. If it is not present, device memory
+is allocated and the host memory copied. The device address of
the newly allocated device memory is returned.
In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
@@ -4387,8 +4387,8 @@ array element and @var{len} specifies the length in bytes.
@table @asis
@item @emph{Description}
This function tests if the host data specified by @var{a} and of length
-@var{len} is present or not. If it is not present, then device memory
-will be allocated and mapped to host memory. In C/C++, the device address
+@var{len} is present or not. If it is not present, device memory
+is allocated and mapped to host memory. In C/C++, the device address
of the newly allocated device memory is returned.
In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
@@ -5075,11 +5075,11 @@ completed.
Normally, the management of the streams that are created as a result of
using the @code{async} clause, is done without any intervention by the
caller. This implies the association between the @code{async-argument}
-and the CUDA stream will be maintained for the lifetime of the program.
+and the CUDA stream is maintained for the lifetime of the program.
However, this association can be changed through the use of the library
function @code{acc_set_cuda_stream}. When the function
@code{acc_set_cuda_stream} is called, the CUDA stream that was
-originally associated with the @code{async} clause will be destroyed.
+originally associated with the @code{async} clause is destroyed.
Caution should be taken when changing the association as subsequent
references to the @code{async-argument} refer to a different
CUDA stream.
@@ -5129,7 +5129,7 @@ parameters to the OpenACC library function @code{acc_set_device_num()}.
Once the call to @code{acc_set_device_num()} has completed, the OpenACC
library uses the context that was created during the call to
-@code{cublasCreate()}. In other words, both libraries will be sharing the
+@code{cublasCreate()}. In other words, both libraries share the
same context.
@smallexample
@@ -5168,7 +5168,7 @@ call parameters specify which device to use and what device
type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
is but one method to initialize the OpenACC library and allocate the
appropriate hardware resources. Other methods are available through the
-use of environment variables and these will be discussed in the next section.
+use of environment variables and these are discussed in the next section.
Once the call to @code{acc_set_device_num()} has completed, other OpenACC
functions can be called as seen with multiple calls being made to
@@ -5178,7 +5178,7 @@ subsequent to the calls to @code{acc_copyin()}.
As seen in the previous use case, a call to @code{cublasCreate()}
initializes the CUBLAS library and allocates the hardware resources on the
host and the device. However, since the device has already been allocated,
-@code{cublasCreate()} will only initialize the CUBLAS library and allocate
+@code{cublasCreate()} only initializes the CUBLAS library and allocates
the appropriate hardware resources on the host. The context that was created
as part of the OpenACC initialization is shared with the CUBLAS library,
similarly to the first use case.
@@ -5267,7 +5267,7 @@ possible for the (very common) case that the Profiling Interface is
not enabled. This is relevant, as the Profiling Interface affects all
the @emph{hot} code paths (in the target code, not in the offloaded
code). Users of the OpenACC Profiling Interface can be expected to
-understand that performance will be impacted to some degree once the
+understand that performance is impacted to some degree once the
-Profiling Interface has gotten enabled: for example, because of the
+Profiling Interface is enabled: for example, because of the
@emph{runtime} (libgomp) calling into a third-party @emph{library} for
every event that has been registered.
@@ -5289,7 +5289,7 @@ does directly calling @code{acc_prof_register},
@code{acc_prof_unregister}, @code{acc_prof_lookup}.
As currently there are no inquiry functions defined, calls to
-@code{acc_prof_lookup} will always return @code{NULL}.
+@code{acc_prof_lookup} always return @code{NULL}.
There aren't separate @emph{start}, @emph{stop} events defined for the
event types @code{acc_ev_create}, @code{acc_ev_delete},
@@ -5307,7 +5307,7 @@ It's not clear if for @emph{nested} event callbacks (for example,
construct), this should be set for the nested event
(@code{acc_ev_enqueue_launch_start}), or if the value of the parent
construct should remain (@code{acc_ev_compute_construct_start}). In
-this implementation, the value will generally correspond to the
+this implementation, the value generally corresponds to the
innermost nested event type.
@item @code{acc_prof_info.device_type}
@@ -5315,7 +5315,7 @@ innermost nested event type.
@item
For @code{acc_ev_compute_construct_start}, and in presence of an
-@code{if} clause with @emph{false} argument, this will still refer to
+@code{if} clause with @emph{false} argument, this still refers to
the offloading device type.
It's not clear if that's the expected behavior.
@@ -5340,20 +5340,20 @@ Not yet implemented correctly for
@item
In a compute construct, for host-fallback
-execution/@code{acc_device_host} it will always be
+execution/@code{acc_device_host} it is always
@code{acc_async_sync}.
-It's not clear if that's the expected behavior.
+It is unclear if that is the expected behavior.
@item
For @code{acc_ev_device_init_start} and @code{acc_ev_device_init_end},
-it will always be @code{acc_async_sync}.
+it is always @code{acc_async_sync}.
-It's not clear if that's the expected behavior.
+It is unclear if that is the expected behavior.
@end itemize
@item @code{acc_prof_info.async_queue}
There is no @cite{limited number of asynchronous queues} in libgomp.
-This will always have the same value as @code{acc_prof_info.async}.
+This always has the same value as @code{acc_prof_info.async}.
@item @code{acc_prof_info.src_file}
Always @code{NULL}; not yet implemented.