__atomic_test_and_set: Fall back to library, not non-atomic code

Message ID 20230926143439.B589920431@pchp3.se.axis.com
State Accepted
Headers
Series __atomic_test_and_set: Fall back to library, not non-atomic code |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Hans-Peter Nilsson Sept. 26, 2023, 2:34 p.m. UTC
  Tested cris-elf, native x86_64-pc-linux-gnu and arm-eabi.

For arm-eabi, notably lacking any atomic support for the
default multilib, with --target_board=arm-sim it regressed
29_atomics/atomic_flag/cons/value_init.cc with the expected
linker failure due to lack of __atomic_test_and_set - which
is a good thing.  With this one, there are 44 unexpected
FAILs for libstdc+++ at r14-4210-g94982a6b9cf4.  This number
was 206 as late as r14-3470-g721f7e2c4e5e, but mitigated by
r14-3980-g62b29347c38394, deliberately.  To fix the
regression, I'll do the same and follow up with adding
dg-require-thread-fence on
29_atomics/atomic_flag/cons/value_init.cc (and if approved,
commit it before this one).

Incidentally, the fortran test-results for arm-eabi are
riddled with missing-__sync_synchronize linker errors
causing some 18134 unexpected failures, where cris-elf has
121.

Ok to commit?

-- >8 --
Make __atomic_test_and_set consistent with other __atomic_ and __sync_
builtins: call a matching library function instead of emitting
non-atomic code when the target has no direct insn support.

There's special-case code handling targetm.atomic_test_and_set_trueval
!= 1 trying a modified maybe_emit_sync_lock_test_and_set.  Previously,
if that worked but its matching emit_store_flag_force returned NULL,
we'd segfault later on.  Now that the caller handles NULL, gcc_assert
here instead.

While the referenced PR:s are ARM-specific, the issue is general.

	PR target/107567
	PR target/109166
	* builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>:
	Handle failure from expand_builtin_atomic_test_and_set.
	* optabs.cc (expand_atomic_test_and_set): When all attempts fail to
	generate atomic code through target support, return NULL
	instead of emitting non-atomic code.  Also, for code handling
	targetm.atomic_test_and_set_trueval != 1, gcc_assert result
	from calling emit_store_flag_force instead of returning NULL.
---
 gcc/builtins.cc |  5 ++++-
 gcc/optabs.cc   | 22 +++++++---------------
 2 files changed, 11 insertions(+), 16 deletions(-)
  

Comments

Jeff Law Sept. 26, 2023, 3:23 p.m. UTC | #1
On 9/26/23 08:34, Hans-Peter Nilsson wrote:
> Tested cris-elf, native x86_64-pc-linux-gnu and arm-eabi.
> 
> For arm-eabi, notably lacking any atomic support for the
> default multilib, with --target_board=arm-sim it regressed
> 29_atomics/atomic_flag/cons/value_init.cc with the expected
> linker failure due to lack of __atomic_test_and_set - which
> is a good thing.  With this one, there are 44 unexpected
> FAILs for libstdc+++ at r14-4210-g94982a6b9cf4.  This number
> was 206 as late as r14-3470-g721f7e2c4e5e, but mitigated by
> r14-3980-g62b29347c38394, deliberately.  To fix the
> regression, I'll do the same and follow up with adding
> dg-require-thread-fence on
> 29_atomics/atomic_flag/cons/value_init.cc (and if approved,
> commit it before this one).
> 
> Incidentally, the fortran test-results for arm-eabi are
> riddled with missing-__sync_synchronize linker errors
> causing some 18134 unexpected failures, where cris-elf has
> 121.
> 
> Ok to commit?
> 
> -- >8 --
> Make __atomic_test_and_set consistent with other __atomic_ and __sync_
> builtins: call a matching library function instead of emitting
> non-atomic code when the target has no direct insn support.
> 
> There's special-case code handling targetm.atomic_test_and_set_trueval
> != 1 trying a modified maybe_emit_sync_lock_test_and_set.  Previously,
> if that worked but its matching emit_store_flag_force returned NULL,
> we'd segfault later on.  Now that the caller handles NULL, gcc_assert
> here instead.
> 
> While the referenced PR:s are ARM-specific, the issue is general.
> 
> 	PR target/107567
> 	PR target/109166
> 	* builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>:
> 	Handle failure from expand_builtin_atomic_test_and_set.
> 	* optabs.cc (expand_atomic_test_and_set): When all attempts fail to
> 	generate atomic code through target support, return NULL
> 	instead of emitting non-atomic code.  Also, for code handling
> 	targetm.atomic_test_and_set_trueval != 1, gcc_assert result
> 	from calling emit_store_flag_force instead of returning NULL.
OK
jeff
  
Christophe Lyon Oct. 3, 2023, 1:20 p.m. UTC | #2
Hi!

On Tue, 26 Sept 2023 at 16:34, Hans-Peter Nilsson <hp@axis.com> wrote:

> Tested cris-elf, native x86_64-pc-linux-gnu and arm-eabi.
>
> For arm-eabi, notably lacking any atomic support for the
> default multilib, with --target_board=arm-sim it regressed
> 29_atomics/atomic_flag/cons/value_init.cc with the expected
> linker failure due to lack of __atomic_test_and_set - which
> is a good thing.  With this one, there are 44 unexpected
> FAILs for libstdc+++ at r14-4210-g94982a6b9cf4.  This number
> was 206 as late as r14-3470-g721f7e2c4e5e, but mitigated by
> r14-3980-g62b29347c38394, deliberately.  To fix the
> regression, I'll do the same and follow up with adding
> dg-require-thread-fence on
> 29_atomics/atomic_flag/cons/value_init.cc (and if approved,
> commit it before this one).
>
> Incidentally, the fortran test-results for arm-eabi are
> riddled with missing-__sync_synchronize linker errors
> causing some 18134 unexpected failures, where cris-elf has
> 121.
>
>
The patch passed almost all our CI configurations, except arm-eabi when
testing with
 -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto
where is causes these failures:
FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess
errors)
UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 compilation
failed to produce executable
FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20 (test for
excess errors)
UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20
compilation failed to produce executable
FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26 (test for
excess errors)
UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26
compilation failed to produce executable
FAIL: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17 (test
for excess errors)
UNRESOLVED: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17
compilation failed to produce executable
FAIL: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17 (test
for excess errors)
UNRESOLVED: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17
compilation failed to produce executable

The linker error is:
undefined reference to `__atomic_test_and_set'

Maybe we need a new variant of dg-require-thread-fence ?

Thanks,

Christophe


Ok to commit?
>
> -- >8 --
> Make __atomic_test_and_set consistent with other __atomic_ and __sync_
> builtins: call a matching library function instead of emitting
> non-atomic code when the target has no direct insn support.
>
> There's special-case code handling targetm.atomic_test_and_set_trueval
> != 1 trying a modified maybe_emit_sync_lock_test_and_set.  Previously,
> if that worked but its matching emit_store_flag_force returned NULL,
> we'd segfault later on.  Now that the caller handles NULL, gcc_assert
> here instead.
>
> While the referenced PR:s are ARM-specific, the issue is general.
>
>         PR target/107567
>         PR target/109166
>         * builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>:
>         Handle failure from expand_builtin_atomic_test_and_set.
>         * optabs.cc (expand_atomic_test_and_set): When all attempts fail to
>         generate atomic code through target support, return NULL
>         instead of emitting non-atomic code.  Also, for code handling
>         targetm.atomic_test_and_set_trueval != 1, gcc_assert result
>         from calling emit_store_flag_force instead of returning NULL.
> ---
>  gcc/builtins.cc |  5 ++++-
>  gcc/optabs.cc   | 22 +++++++---------------
>  2 files changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 6e4274bb2a4e..40dfd36a3197 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -8387,7 +8387,10 @@ expand_builtin (tree exp, rtx target, rtx
> subtarget, machine_mode mode,
>        break;
>
>      case BUILT_IN_ATOMIC_TEST_AND_SET:
> -      return expand_builtin_atomic_test_and_set (exp, target);
> +      target = expand_builtin_atomic_test_and_set (exp, target);
> +      if (target)
> +       return target;
> +      break;
>
>      case BUILT_IN_ATOMIC_CLEAR:
>        return expand_builtin_atomic_clear (exp);
> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> index 8b96f23aec05..e1898da22808 100644
> --- a/gcc/optabs.cc
> +++ b/gcc/optabs.cc
> @@ -7080,25 +7080,17 @@ expand_atomic_test_and_set (rtx target, rtx mem,
> enum memmodel model)
>    /* Recall that the legacy lock_test_and_set optab was allowed to do
> magic
>       things with the value 1.  Thus we try again without trueval.  */
>    if (!ret && targetm.atomic_test_and_set_trueval != 1)
> -    ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, const1_rtx,
> model);
> -
> -  /* Failing all else, assume a single threaded environment and simply
> -     perform the operation.  */
> -  if (!ret)
>      {
> -      /* If the result is ignored skip the move to target.  */
> -      if (subtarget != const0_rtx)
> -        emit_move_insn (subtarget, mem);
> +      ret = maybe_emit_sync_lock_test_and_set (subtarget, mem,
> const1_rtx, model);
>
> -      emit_move_insn (mem, trueval);
> -      ret = subtarget;
> +      if (ret)
> +       {
> +         /* Rectify the not-one trueval.  */
> +         ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode,
> 0, 1);
> +         gcc_assert (ret);
> +       }
>      }
>
> -  /* Recall that have to return a boolean value; rectify if trueval
> -     is not exactly one.  */
> -  if (targetm.atomic_test_and_set_trueval != 1)
> -    ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, 0, 1);
> -
>    return ret;
>  }
>
> --
> 2.30.2
>
>
  
Hans-Peter Nilsson Oct. 3, 2023, 3:16 p.m. UTC | #3
> From: Christophe Lyon <christophe.lyon@linaro.org>
> Date: Tue, 3 Oct 2023 15:20:39 +0200

> The patch passed almost all our CI configurations, except arm-eabi when
> testing with
>  -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto
> where is causes these failures:
> FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess
> errors)
> UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 compilation
> failed to produce executable
> FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20 (test for
> excess errors)
> UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20
> compilation failed to produce executable
> FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26 (test for
> excess errors)
> UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26
> compilation failed to produce executable
> FAIL: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17 (test
> for excess errors)
> UNRESOLVED: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17
> compilation failed to produce executable
> FAIL: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17 (test
> for excess errors)
> UNRESOLVED: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17
> compilation failed to produce executable

For which set of multilibs in that set, do you get these
errors?  I'm guessing -march=armv6s-m, but I'm checking.

> The linker error is:
> undefined reference to `__atomic_test_and_set'

I read that as you're saying you have a multilib combination
where you currently don't emit __sync_synchronize but also
don't emit anything for __atomic_test_and_set.

> Maybe we need a new variant of dg-require-thread-fence ?

Perhaps.  Unless of course, there's a multilib combination
for which you *can* emit the proper atomic spell; missing it
because the need for it, has been hidden!

(At first I thought it was related to caching the
thread-fence property across multilib testing, but I don't
think that was correct.)

> 
> Thanks,
> 
> Christophe
> 
> 
> Ok to commit?

ENOPATCH

brgds, H-P
  
Christophe Lyon Oct. 4, 2023, 8:53 a.m. UTC | #4
On Wed, 4 Oct 2023 at 02:49, Hans-Peter Nilsson <hp@axis.com> wrote:

> (Just before sending, I noticed you replied off-list; I
> won't add back gcc-patches to cc here myself, but please
> feel free to do it, if you choose to reply.)
>

Sorry, it was a typo of mine, I meant to reply to the list


>
> > From: Christophe Lyon <christophe.lyon@linaro.org>
> > Date: Tue, 3 Oct 2023 18:06:21 +0200
>
> > On Tue, 3 Oct 2023 at 17:16, Hans-Peter Nilsson <hp@axis.com> wrote:
> >
> > > > From: Christophe Lyon <christophe.lyon@linaro.org>
> > > > Date: Tue, 3 Oct 2023 15:20:39 +0200
> > >
> > > > The patch passed almost all our CI configurations, except arm-eabi
> when
> > > > testing with
> > > >  -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto
> > > > where is causes these failures:
> > > > FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess
> > > > errors)
> > > > UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17
> compilation
> > > > failed to produce executable
> [...]
> > > For which set of multilibs in that set, do you get these
> > > errors?  I'm guessing -march=armv6s-m, but I'm checking.
> > >
> >
> > Not sure to understand the question?
>
> By your "testing with
> -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto"
> I presume you mean "testing with make check
>
> 'RUNTESTFLAGS=--target_board=arm-sim/-mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto'
> (as that's where you usually put that expression)
>
Yes


>
> - but I misremembered what "/" means in RUNTESTFLAGS (it's
> combining options in one single set of options passed to
> gcc, not delimiting separate options used to form
> combinations).  Sorry for the confusion!
>
I see.

Actually I always find "multilib" confusing in the context of running the
tests, where in fact we generally mean we add some flags in
RUNTESTFLAGS and/or --target_board.
To me, multilib refers to the set of lib variants we build, but I noticed
"multilib" is often used in the context of running the tests...


>
> > > > Maybe we need a new variant of dg-require-thread-fence ?
> > >
> > > Perhaps.
>
> Or rather: most certainly.  IIUC, ARMv6 (or whatever you
> prefer to call it) can load and store atomically, but only
> as separate events; it can't atomically exchange a stored
> value and therefore arm-eabi calls out to a library
> function.
>
In this case, it's armv6s-m (which is different from armv6....)
And yes, it seems so, as your patch showed, assuming there's
no bug in the target description ;-)


> I think I'll help and replace the obvious uses of
> dg-require-thread-fence where actually an atomic exchange is
> required; replacing those with a new directive
> dg-require-atomic-exchange.  That will however not be *all*
> places where such a guard should be added.
>
Indeed.


> I also see lots of undefined references to *other* outlined
> atomic builtins, for example __atomic_compare_exchange_4 and
> __atomic_fetch_add_4.  Though, those likely aren't
> regressions.  I understand you focus on regressions here.
>
Yes, my reply to your patch was meant to look at the regressions.

As a separate action, I plan to look at the remaining existing such
failures.


> By the way, how do you test; what simulator, what baseboard
> file?  Please share!  Also, please send *some*
> contrib/test_summary reports for arm-eabi to
> gcc-testresults@ every now and then.  (But also, please
>
We use qemu, with qemu.exp from:
https://git.linaro.org/toolchain/abe.git/tree/config/boards/qemu.exp
nothing fancy ;-)



> don't post multiple results several times a day for similar
> configurations.  Looking at you, powerpc people!)
>
We have plans to restart sending such results, like I was doing several
years ago
(with many results every day, too ;-))


> I can't test *anything* newer than default arm-eabi (armv4t)
> on arm-sim (the one next to gdb), or else execution tests
> get lost and time out while also complaining about "Unknown
> machine type".  I noticed there's a qemu.exp in ToT dejagnu,
> but it doesn't work for arm-eabi for at least two reasons.
> (I might get to that yak later, I just take it as a sign
> that qemu-arm isn't what I look for.)

We do use qemu-arm, depending on how you want to test,
maybe you need to add a -cpu flag such that it supports
the required instructions.


> >  Unless of course, there's a multilib combination
>
> > for which you *can* emit the proper atomic spell; missing it
> > > because the need for it, has been hidden!
> > >
> > > (At first I thought it was related to caching the
> > > thread-fence property across multilib testing, but I don't
> > > think that was correct.)
> > >
> > Not sure what you mean? We run the tests for a single multilib here
> > (mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto)
> > so the cached value should always be correct?
>
> Yeah, part of my RUNTESTFLAGS confusion per above, please
> ignore that.
>
> brgds, H-P
>
  

Patch

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 6e4274bb2a4e..40dfd36a3197 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -8387,7 +8387,10 @@  expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
       break;
 
     case BUILT_IN_ATOMIC_TEST_AND_SET:
-      return expand_builtin_atomic_test_and_set (exp, target);
+      target = expand_builtin_atomic_test_and_set (exp, target);
+      if (target)
+	return target;
+      break;
 
     case BUILT_IN_ATOMIC_CLEAR:
       return expand_builtin_atomic_clear (exp);
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 8b96f23aec05..e1898da22808 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -7080,25 +7080,17 @@  expand_atomic_test_and_set (rtx target, rtx mem, enum memmodel model)
   /* Recall that the legacy lock_test_and_set optab was allowed to do magic
      things with the value 1.  Thus we try again without trueval.  */
   if (!ret && targetm.atomic_test_and_set_trueval != 1)
-    ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, const1_rtx, model);
-
-  /* Failing all else, assume a single threaded environment and simply
-     perform the operation.  */
-  if (!ret)
     {
-      /* If the result is ignored skip the move to target.  */
-      if (subtarget != const0_rtx)
-        emit_move_insn (subtarget, mem);
+      ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, const1_rtx, model);
 
-      emit_move_insn (mem, trueval);
-      ret = subtarget;
+      if (ret)
+	{
+	  /* Rectify the not-one trueval.  */
+	  ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, 0, 1);
+	  gcc_assert (ret);
+	}
     }
 
-  /* Recall that have to return a boolean value; rectify if trueval
-     is not exactly one.  */
-  if (targetm.atomic_test_and_set_trueval != 1)
-    ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, 0, 1);
-  
   return ret;
 }