__atomic_test_and_set: Fall back to library, not non-atomic code
Checks
Commit Message
Tested cris-elf, native x86_64-pc-linux-gnu and arm-eabi.
For arm-eabi, notably lacking any atomic support for the
default multilib, with --target_board=arm-sim it regressed
29_atomics/atomic_flag/cons/value_init.cc with the expected
linker failure due to lack of __atomic_test_and_set - which
is a good thing. With this one, there are 44 unexpected
FAILs for libstdc+++ at r14-4210-g94982a6b9cf4. This number
was 206 as late as r14-3470-g721f7e2c4e5e, but mitigated by
r14-3980-g62b29347c38394, deliberately. To fix the
regression, I'll do the same and follow up with adding
dg-require-thread-fence on
29_atomics/atomic_flag/cons/value_init.cc (and if approved,
commit it before this one).
Incidentally, the fortran test-results for arm-eabi are
riddled with missing-__sync_synchronize linker errors
causing some 18134 unexpected failures, where cris-elf has
121.
Ok to commit?
-- >8 --
Make __atomic_test_and_set consistent with other __atomic_ and __sync_
builtins: call a matching library function instead of emitting
non-atomic code when the target has no direct insn support.
There's special-case code handling targetm.atomic_test_and_set_trueval
!= 1 trying a modified maybe_emit_sync_lock_test_and_set. Previously,
if that worked but its matching emit_store_flag_force returned NULL,
we'd segfault later on. Now that the caller handles NULL, gcc_assert
here instead.
While the referenced PR:s are ARM-specific, the issue is general.
PR target/107567
PR target/109166
* builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>:
Handle failure from expand_builtin_atomic_test_and_set.
* optabs.cc (expand_atomic_test_and_set): When all attempts fail to
generate atomic code through target support, return NULL
instead of emitting non-atomic code. Also, for code handling
targetm.atomic_test_and_set_trueval != 1, gcc_assert result
from calling emit_store_flag_force instead of returning NULL.
---
gcc/builtins.cc | 5 ++++-
gcc/optabs.cc | 22 +++++++---------------
2 files changed, 11 insertions(+), 16 deletions(-)
Comments
On 9/26/23 08:34, Hans-Peter Nilsson wrote:
> Tested cris-elf, native x86_64-pc-linux-gnu and arm-eabi.
>
> For arm-eabi, notably lacking any atomic support for the
> default multilib, with --target_board=arm-sim it regressed
> 29_atomics/atomic_flag/cons/value_init.cc with the expected
> linker failure due to lack of __atomic_test_and_set - which
> is a good thing. With this one, there are 44 unexpected
> FAILs for libstdc+++ at r14-4210-g94982a6b9cf4. This number
> was 206 as late as r14-3470-g721f7e2c4e5e, but mitigated by
> r14-3980-g62b29347c38394, deliberately. To fix the
> regression, I'll do the same and follow up with adding
> dg-require-thread-fence on
> 29_atomics/atomic_flag/cons/value_init.cc (and if approved,
> commit it before this one).
>
> Incidentally, the fortran test-results for arm-eabi are
> riddled with missing-__sync_synchronize linker errors
> causing some 18134 unexpected failures, where cris-elf has
> 121.
>
> Ok to commit?
>
> -- >8 --
> Make __atomic_test_and_set consistent with other __atomic_ and __sync_
> builtins: call a matching library function instead of emitting
> non-atomic code when the target has no direct insn support.
>
> There's special-case code handling targetm.atomic_test_and_set_trueval
> != 1 trying a modified maybe_emit_sync_lock_test_and_set. Previously,
> if that worked but its matching emit_store_flag_force returned NULL,
> we'd segfault later on. Now that the caller handles NULL, gcc_assert
> here instead.
>
> While the referenced PR:s are ARM-specific, the issue is general.
>
> PR target/107567
> PR target/109166
> * builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>:
> Handle failure from expand_builtin_atomic_test_and_set.
> * optabs.cc (expand_atomic_test_and_set): When all attempts fail to
> generate atomic code through target support, return NULL
> instead of emitting non-atomic code. Also, for code handling
> targetm.atomic_test_and_set_trueval != 1, gcc_assert result
> from calling emit_store_flag_force instead of returning NULL.
OK
jeff
Hi!
On Tue, 26 Sept 2023 at 16:34, Hans-Peter Nilsson <hp@axis.com> wrote:
> Tested cris-elf, native x86_64-pc-linux-gnu and arm-eabi.
>
> For arm-eabi, notably lacking any atomic support for the
> default multilib, with --target_board=arm-sim it regressed
> 29_atomics/atomic_flag/cons/value_init.cc with the expected
> linker failure due to lack of __atomic_test_and_set - which
> is a good thing. With this one, there are 44 unexpected
> FAILs for libstdc+++ at r14-4210-g94982a6b9cf4. This number
> was 206 as late as r14-3470-g721f7e2c4e5e, but mitigated by
> r14-3980-g62b29347c38394, deliberately. To fix the
> regression, I'll do the same and follow up with adding
> dg-require-thread-fence on
> 29_atomics/atomic_flag/cons/value_init.cc (and if approved,
> commit it before this one).
>
> Incidentally, the fortran test-results for arm-eabi are
> riddled with missing-__sync_synchronize linker errors
> causing some 18134 unexpected failures, where cris-elf has
> 121.
>
>
The patch passed almost all our CI configurations, except arm-eabi when
testing with
-mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto
where is causes these failures:
FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess
errors)
UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 compilation
failed to produce executable
FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20 (test for
excess errors)
UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20
compilation failed to produce executable
FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26 (test for
excess errors)
UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26
compilation failed to produce executable
FAIL: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17 (test
for excess errors)
UNRESOLVED: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17
compilation failed to produce executable
FAIL: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17 (test
for excess errors)
UNRESOLVED: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17
compilation failed to produce executable
The linker error is:
undefined reference to `__atomic_test_and_set'
Maybe we need a new variant of dg-require-thread-fence ?
Thanks,
Christophe
Ok to commit?
>
> -- >8 --
> Make __atomic_test_and_set consistent with other __atomic_ and __sync_
> builtins: call a matching library function instead of emitting
> non-atomic code when the target has no direct insn support.
>
> There's special-case code handling targetm.atomic_test_and_set_trueval
> != 1 trying a modified maybe_emit_sync_lock_test_and_set. Previously,
> if that worked but its matching emit_store_flag_force returned NULL,
> we'd segfault later on. Now that the caller handles NULL, gcc_assert
> here instead.
>
> While the referenced PR:s are ARM-specific, the issue is general.
>
> PR target/107567
> PR target/109166
> * builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>:
> Handle failure from expand_builtin_atomic_test_and_set.
> * optabs.cc (expand_atomic_test_and_set): When all attempts fail to
> generate atomic code through target support, return NULL
> instead of emitting non-atomic code. Also, for code handling
> targetm.atomic_test_and_set_trueval != 1, gcc_assert result
> from calling emit_store_flag_force instead of returning NULL.
> ---
> gcc/builtins.cc | 5 ++++-
> gcc/optabs.cc | 22 +++++++---------------
> 2 files changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 6e4274bb2a4e..40dfd36a3197 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -8387,7 +8387,10 @@ expand_builtin (tree exp, rtx target, rtx
> subtarget, machine_mode mode,
> break;
>
> case BUILT_IN_ATOMIC_TEST_AND_SET:
> - return expand_builtin_atomic_test_and_set (exp, target);
> + target = expand_builtin_atomic_test_and_set (exp, target);
> + if (target)
> + return target;
> + break;
>
> case BUILT_IN_ATOMIC_CLEAR:
> return expand_builtin_atomic_clear (exp);
> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> index 8b96f23aec05..e1898da22808 100644
> --- a/gcc/optabs.cc
> +++ b/gcc/optabs.cc
> @@ -7080,25 +7080,17 @@ expand_atomic_test_and_set (rtx target, rtx mem,
> enum memmodel model)
> /* Recall that the legacy lock_test_and_set optab was allowed to do
> magic
> things with the value 1. Thus we try again without trueval. */
> if (!ret && targetm.atomic_test_and_set_trueval != 1)
> - ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, const1_rtx,
> model);
> -
> - /* Failing all else, assume a single threaded environment and simply
> - perform the operation. */
> - if (!ret)
> {
> - /* If the result is ignored skip the move to target. */
> - if (subtarget != const0_rtx)
> - emit_move_insn (subtarget, mem);
> + ret = maybe_emit_sync_lock_test_and_set (subtarget, mem,
> const1_rtx, model);
>
> - emit_move_insn (mem, trueval);
> - ret = subtarget;
> + if (ret)
> + {
> + /* Rectify the not-one trueval. */
> + ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode,
> 0, 1);
> + gcc_assert (ret);
> + }
> }
>
> - /* Recall that have to return a boolean value; rectify if trueval
> - is not exactly one. */
> - if (targetm.atomic_test_and_set_trueval != 1)
> - ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, 0, 1);
> -
> return ret;
> }
>
> --
> 2.30.2
>
>
> From: Christophe Lyon <christophe.lyon@linaro.org>
> Date: Tue, 3 Oct 2023 15:20:39 +0200
> The patch passed almost all our CI configurations, except arm-eabi when
> testing with
> -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto
> where is causes these failures:
> FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess
> errors)
> UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 compilation
> failed to produce executable
> FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20 (test for
> excess errors)
> UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20
> compilation failed to produce executable
> FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26 (test for
> excess errors)
> UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26
> compilation failed to produce executable
> FAIL: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17 (test
> for excess errors)
> UNRESOLVED: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17
> compilation failed to produce executable
> FAIL: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17 (test
> for excess errors)
> UNRESOLVED: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17
> compilation failed to produce executable
For which set of multilibs in that set, do you get these
errors? I'm guessing -march=armv6s-m, but I'm checking.
> The linker error is:
> undefined reference to `__atomic_test_and_set'
I read that as you're saying you have a multilib combination
where you currently don't emit __sync_synchronize but also
don't emit anything for __atomic_test_and_set.
> Maybe we need a new variant of dg-require-thread-fence ?
Perhaps. Unless of course, there's a multilib combination
for which you *can* emit the proper atomic spell; missing it
because the need for it, has been hidden!
(At first I thought it was related to caching the
thread-fence property across multilib testing, but I don't
think that was correct.)
>
> Thanks,
>
> Christophe
>
>
> Ok to commit?
ENOPATCH
brgds, H-P
On Wed, 4 Oct 2023 at 02:49, Hans-Peter Nilsson <hp@axis.com> wrote:
> (Just before sending, I noticed you replied off-list; I
> won't add back gcc-patches to cc here myself, but please
> feel free to do it, if you choose to reply.)
>
Sorry, it was a typo of mine, I meant to reply to the list
>
> > From: Christophe Lyon <christophe.lyon@linaro.org>
> > Date: Tue, 3 Oct 2023 18:06:21 +0200
>
> > On Tue, 3 Oct 2023 at 17:16, Hans-Peter Nilsson <hp@axis.com> wrote:
> >
> > > > From: Christophe Lyon <christophe.lyon@linaro.org>
> > > > Date: Tue, 3 Oct 2023 15:20:39 +0200
> > >
> > > > The patch passed almost all our CI configurations, except arm-eabi
> when
> > > > testing with
> > > > -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto
> > > > where is causes these failures:
> > > > FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess
> > > > errors)
> > > > UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17
> compilation
> > > > failed to produce executable
> [...]
> > > For which set of multilibs in that set, do you get these
> > > errors? I'm guessing -march=armv6s-m, but I'm checking.
> > >
> >
> > Not sure to understand the question?
>
> By your "testing with
> -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto"
> I presume you mean "testing with make check
>
> 'RUNTESTFLAGS=--target_board=arm-sim/-mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto'
> (as that's where you usually put that expression)
>
Yes
>
> - but I misremembered what "/" means in RUNTESTFLAGS (it's
> combining options in one single set of options passed to
> gcc, not delimiting separate options used to form
> combinations). Sorry for the confusion!
>
I see.
Actually I always find "multilib" confusing in the context of running the
tests, where in fact we generally mean we add some flags in
RUNTESTFLAGS and/or --target_board.
To me, multilib refers to the set of lib variants we build, but I noticed
"multilib" is often used in the context of running the tests...
>
> > > > Maybe we need a new variant of dg-require-thread-fence ?
> > >
> > > Perhaps.
>
> Or rather: most certainly. IIUC, ARMv6 (or whatever you
> prefer to call it) can load and store atomically, but only
> as separate events; it can't atomically exchange a stored
> value and therefore arm-eabi calls out to a library
> function.
>
In this case, it's armv6s-m (which is different from armv6....)
And yes, it seems so, as your patch showed, assuming there's
no bug in the target description ;-)
> I think I'll help and replace the obvious uses of
> dg-require-thread-fence where actually an atomic exchange is
> required; replacing those with a new directive
> dg-require-atomic-exchange. That will however not be *all*
> places where such a guard should be added.
>
Indeed.
> I also see lots of undefined references to *other* outlined
> atomic builtins, for example __atomic_compare_exchange_4 and
> __atomic_fetch_add_4. Though, those likely aren't
> regressions. I understand you focus on regressions here.
>
Yes, my reply to your patch was meant to look at the regressions.
As a separate action, I plan to look at the remaining existing such
failures.
> By the way, how do you test; what simulator, what baseboard
> file? Please share! Also, please send *some*
> contrib/test_summary reports for arm-eabi to
> gcc-testresults@ every now and then. (But also, please
>
We use qemu, with qemu.exp from:
https://git.linaro.org/toolchain/abe.git/tree/config/boards/qemu.exp
nothing fancy ;-)
> don't post multiple results several times a day for similar
> configurations. Looking at you, powerpc people!)
>
We have plans to restart sending such results, like I was doing several
years ago
(with many results every day, too ;-))
> I can't test *anything* newer than default arm-eabi (armv4t)
> on arm-sim (the one next to gdb), or else execution tests
> get lost and time out while also complaining about "Unknown
> machine type". I noticed there's a qemu.exp in ToT dejagnu,
> but it doesn't work for arm-eabi for at least two reasons.
> (I might get to that yak later, I just take it as a sign
> that qemu-arm isn't what I look for.)
We do use qemu-arm, depending on how you want to test,
maybe you need to add a -cpu flag such that it supports
the required instructions.
> > Unless of course, there's a multilib combination
>
> > for which you *can* emit the proper atomic spell; missing it
> > > because the need for it, has been hidden!
> > >
> > > (At first I thought it was related to caching the
> > > thread-fence property across multilib testing, but I don't
> > > think that was correct.)
> > >
> > Not sure what you mean? We run the tests for a single multilib here
> > (mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto)
> > so the cached value should always be correct?
>
> Yeah, part of my RUNTESTFLAGS confusion per above, please
> ignore that.
>
> brgds, H-P
>
@@ -8387,7 +8387,10 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
break;
case BUILT_IN_ATOMIC_TEST_AND_SET:
- return expand_builtin_atomic_test_and_set (exp, target);
+ target = expand_builtin_atomic_test_and_set (exp, target);
+ if (target)
+ return target;
+ break;
case BUILT_IN_ATOMIC_CLEAR:
return expand_builtin_atomic_clear (exp);
@@ -7080,25 +7080,17 @@ expand_atomic_test_and_set (rtx target, rtx mem, enum memmodel model)
/* Recall that the legacy lock_test_and_set optab was allowed to do magic
things with the value 1. Thus we try again without trueval. */
if (!ret && targetm.atomic_test_and_set_trueval != 1)
- ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, const1_rtx, model);
-
- /* Failing all else, assume a single threaded environment and simply
- perform the operation. */
- if (!ret)
{
- /* If the result is ignored skip the move to target. */
- if (subtarget != const0_rtx)
- emit_move_insn (subtarget, mem);
+ ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, const1_rtx, model);
- emit_move_insn (mem, trueval);
- ret = subtarget;
+ if (ret)
+ {
+ /* Rectify the not-one trueval. */
+ ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, 0, 1);
+ gcc_assert (ret);
+ }
}
- /* Recall that have to return a boolean value; rectify if trueval
- is not exactly one. */
- if (targetm.atomic_test_and_set_trueval != 1)
- ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, 0, 1);
-
return ret;
}