[v5,06/11] RISC-V: Strengthen atomic stores
Checks
Commit Message
This change makes atomic stores strictly stronger than table A.6 of the
ISA manual. This mapping makes the overall patchset compatible with
table A.7 as well.
2023-04-27 Patrick O'Neill <patrick@rivosinc.com>
PR 89835
gcc/ChangeLog:
* config/riscv/sync.md:
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr89835.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
gcc/config/riscv/sync.md | 21 ++++++++++++++++++---
gcc/testsuite/gcc.target/riscv/pr89835.c | 9 +++++++++
2 files changed, 27 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/pr89835.c
Comments
On 4/27/23 10:22, Patrick O'Neill wrote:
> This change makes atomic stores strictly stronger than table A.6 of the
> ISA manual. This mapping makes the overall patchset compatible with
> table A.7 as well.
>
> 2023-04-27 Patrick O'Neill <patrick@rivosinc.com>
>
> PR 89835
Should be "PR target/89835"
>
> gcc/ChangeLog:
>
> * config/riscv/sync.md:
Needs some text here :-)
I'm not objecting to this patch, but I think we've got an option
question about whether or not this approach is too expensive for
existing or soon arriving implementations.
If the decision on that topic is to just pay the cost, then this patch
is fine. If we decide to make compatibility optional to avoid the
additional cost, then this will need suitable adjustments.
Jeff
On Fri, 28 Apr 2023 10:40:15 PDT (-0700), jeffreyalaw@gmail.com wrote:
>
>
> On 4/27/23 10:22, Patrick O'Neill wrote:
>> This change makes atomic stores strictly stronger than table A.6 of the
>> ISA manual. This mapping makes the overall patchset compatible with
>> table A.7 as well.
>>
>> 2023-04-27 Patrick O'Neill <patrick@rivosinc.com>
>>
>> PR 89835
> Should be "PR target/89835"
>
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/sync.md:
> Needs some text here :-)
>
>
> I'm not objecting to this patch, but I think we've got an option
> question about whether or not this approach is too expensive for
> existing or soon arriving implementations.
>
> If the decision on that topic is to just pay the cost, then this patch
> is fine. If we decide to make compatibility optional to avoid the
> additional cost, then this will need suitable adjustments.
IMO the only hardware that's going to be here by gcc-14 and to have
enough concurrency for these to matter is the Ventana stuff. I think
you're the only one who can figure out if these are slow, at least until
that stuff is availiable outside the lab.
So are they too slow for you?
>
> Jeff
The concern with making the new behavior non-default is of course that the
generated code will eventually end up on an A.7-capable platform.
An A.6-classic option for compiling code that will never run on a newer
machine seems OK. But I'm not sure that seq_cst stores are dynamically
frequent enough in C++ code for this to be worth the trouble. Unlike loads,
they are also costly on x86, programmers may also have been somewhat
trained to avoid them where possible. (And probably where not possible, too
:-( )
Hans
On Fri, Apr 28, 2023 at 10:43 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
> On Fri, 28 Apr 2023 10:40:15 PDT (-0700), jeffreyalaw@gmail.com wrote:
> >
> >
> > On 4/27/23 10:22, Patrick O'Neill wrote:
> >> This change makes atomic stores strictly stronger than table A.6 of the
> >> ISA manual. This mapping makes the overall patchset compatible with
> >> table A.7 as well.
> >>
> >> 2023-04-27 Patrick O'Neill <patrick@rivosinc.com>
> >>
> >> PR 89835
> > Should be "PR target/89835"
> >
> >>
> >> gcc/ChangeLog:
> >>
> >> * config/riscv/sync.md:
> > Needs some text here :-)
> >
> >
> > I'm not objecting to this patch, but I think we've got an option
> > question about whether or not this approach is too expensive for
> > existing or soon arriving implementations.
> >
> > If the decision on that topic is to just pay the cost, then this patch
> > is fine. If we decide to make compatibility optional to avoid the
> > additional cost, then this will need suitable adjustments.
>
> IMO the only hardware that's going to be here by gcc-14 and to have
> enough concurrency for these to matter is the Ventana stuff. I think
> you're the only one who can figure out if these are slow, at least until
> that stuff is availiable outside the lab.
>
> So are they too slow for you?
>
> >
> > Jeff
>
The RISC-V psABI pull request is at
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/378 . Pointers to
Paul Kirth's corresponding LLVM patches are also there.
On Fri, Apr 28, 2023 at 2:42 PM Hans Boehm <hboehm@google.com> wrote:
> The concern with making the new behavior non-default is of course that the
> generated code will eventually end up on an A.7-capable platform.
>
> An A.6-classic option for compiling code that will never run on a newer
> machine seems OK. But I'm not sure that seq_cst stores are dynamically
> frequent enough in C++ code for this to be worth the trouble. Unlike loads,
> they are also costly on x86, programmers may also have been somewhat
> trained to avoid them where possible. (And probably where not possible, too
> :-( )
>
> Hans
>
> On Fri, Apr 28, 2023 at 10:43 AM Palmer Dabbelt <palmer@rivosinc.com>
> wrote:
>
>> On Fri, 28 Apr 2023 10:40:15 PDT (-0700), jeffreyalaw@gmail.com wrote:
>> >
>> >
>> > On 4/27/23 10:22, Patrick O'Neill wrote:
>> >> This change makes atomic stores strictly stronger than table A.6 of the
>> >> ISA manual. This mapping makes the overall patchset compatible with
>> >> table A.7 as well.
>> >>
>> >> 2023-04-27 Patrick O'Neill <patrick@rivosinc.com>
>> >>
>> >> PR 89835
>> > Should be "PR target/89835"
>> >
>> >>
>> >> gcc/ChangeLog:
>> >>
>> >> * config/riscv/sync.md:
>> > Needs some text here :-)
>> >
>> >
>> > I'm not objecting to this patch, but I think we've got an option
>> > question about whether or not this approach is too expensive for
>> > existing or soon arriving implementations.
>> >
>> > If the decision on that topic is to just pay the cost, then this patch
>> > is fine. If we decide to make compatibility optional to avoid the
>> > additional cost, then this will need suitable adjustments.
>>
>> IMO the only hardware that's going to be here by gcc-14 and to have
>> enough concurrency for these to matter is the Ventana stuff. I think
>> you're the only one who can figure out if these are slow, at least until
>> that stuff is availiable outside the lab.
>>
>> So are they too slow for you?
>>
>> >
>> > Jeff
>>
>
On 4/28/23 15:42, Hans Boehm wrote:
> The concern with making the new behavior non-default is of course that
> the generated code will eventually end up on an A.7-capable platform.
Yea, certainly a significant concern in general, though probably not for
Ventana. I expect we'll have folks rebuilding as they go from V1 to V2
due to the ventanacondops -> condops change.
>
> An A.6-classic option for compiling code that will never run on a newer
> machine seems OK. But I'm not sure that seq_cst stores are dynamically
> frequent enough in C++ code for this to be worth the trouble.
> Unlike loads, they are also costly on x86, programmers may also have
> been somewhat trained to avoid them where possible. (And probably where
> not possible, too :-( )
I was more worried about the kernel, perhaps because I'm less familiar
with it these days than ever. WRT C++, I can ping Jon@RedHat, he's got
a great sense what's in the standard library obviously, but also a
reasonable sense of what folks are doing in the wild.
Jeff
Discussed in the patchworks meeting with Jeff Law and decided to move
forward with the trailing fence compatibility approach. If the trailing
fence becomes a performance issue and people want to generate A.6 code,
we'll need a PSABI change to identify which mapping a binary uses. We'll
cross that bridge when/if we get to it.
Patrick
On 4/27/23 09:22, Patrick O'Neill wrote:
> This change makes atomic stores strictly stronger than table A.6 of the
> ISA manual. This mapping makes the overall patchset compatible with
> table A.7 as well.
>
> 2023-04-27 Patrick O'Neill <patrick@rivosinc.com>
>
> PR 89835
>
> gcc/ChangeLog:
>
> * config/riscv/sync.md:
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/pr89835.c: New test.
>
> Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
On 4/28/23 10:40, Jeff Law wrote:
>
>
> On 4/27/23 10:22, Patrick O'Neill wrote:
>> This change makes atomic stores strictly stronger than table A.6 of the
>> ISA manual. This mapping makes the overall patchset compatible with
>> table A.7 as well.
>>
>> 2023-04-27 Patrick O'Neill <patrick@rivosinc.com>
>>
>> PR 89835
> Should be "PR target/89835"
>
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/sync.md:
> Needs some text here :-)
>
>
> I'm not objecting to this patch, but I think we've got an option
> question about whether or not this approach is too expensive for
> existing or soon arriving implementations.
>
> If the decision on that topic is to just pay the cost, then this patch
> is fine. If we decide to make compatibility optional to avoid the
> additional cost, then this will need suitable adjustments.
>
> Jeff
Acked in Patchworks meeting:
https://inbox.sourceware.org/gcc-patches/c53ac2b2-4edf-34c6-a935-3b31644c9864@rivosinc.com/
Updated changelog and committed:
PR target/89835
gcc/ChangeLog:
* config/riscv/sync.md (atomic_store<mode>): Use simple store
instruction in combination with fence(s).
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr89835.c: New test.
Patrick
@@ -56,7 +56,9 @@
;; Atomic memory operations.
-;; Implement atomic stores with amoswap. Fall back to fences for atomic loads.
+;; Implement atomic stores with conservative fences. Fall back to fences for
+;; atomic loads.
+;; This allows us to be compatible with the ISA manual Table A.6 and Table A.7.
(define_insn "atomic_store<mode>"
[(set (match_operand:GPR 0 "memory_operand" "=A")
(unspec_volatile:GPR
@@ -64,9 +66,22 @@
(match_operand:SI 2 "const_int_operand")] ;; model
UNSPEC_ATOMIC_STORE))]
"TARGET_ATOMIC"
- "%F2amoswap.<amo>%A2 zero,%z1,%0"
+ {
+ enum memmodel model = (enum memmodel) INTVAL (operands[2]);
+ model = memmodel_base (model);
+
+ if (model == MEMMODEL_SEQ_CST)
+ return "fence\trw,w\;"
+ "s<amo>\t%z1,%0\;"
+ "fence\trw,rw";
+ if (model == MEMMODEL_RELEASE)
+ return "fence\trw,w\;"
+ "s<amo>\t%z1,%0";
+ else
+ return "s<amo>\t%z1,%0";
+ }
[(set_attr "type" "atomic")
- (set (attr "length") (const_int 8))])
+ (set (attr "length") (const_int 12))])
(define_insn "atomic_<atomic_optab><mode>"
[(set (match_operand:GPR 0 "memory_operand" "+A")
new file mode 100644
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* Verify that relaxed atomic stores use simple store instuctions. */
+/* { dg-final { scan-assembler-not "amoswap" } } */
+
+void
+foo(int bar, int baz)
+{
+ __atomic_store_n(&bar, baz, __ATOMIC_RELAXED);
+}