tools: memory-model: Make plain accesses carry dependencies

Message ID 20221201121808.20785-1-jonas.oberhauser@huaweicloud.com
State: New
Series: tools: memory-model: Make plain accesses carry dependencies

Commit Message

Jonas Oberhauser Dec. 1, 2022, 12:18 p.m. UTC
  From: Jonas Oberhauser <jonas.oberhauser@huawei.com>

As reported by Viktor, plain accesses in LKMM are weaker than
accesses to registers: the latter carry dependencies but the former
do not. This is exemplified in the following snippet:

  int r = READ_ONCE(*x);
  WRITE_ONCE(*y, r);

Here a data dependency links the READ_ONCE() to the WRITE_ONCE(),
preserving their order, because the model treats r as a register.
If r is turned into a memory location accessed by plain accesses,
however, the link is broken and the order between READ_ONCE() and
WRITE_ONCE() is no longer preserved.

This is too conservative, since any optimizations on plain
accesses that might break dependencies are also possible on
registers; it also contradicts the intuitive notion of "dependency"
as the data stored by the WRITE_ONCE() does depend on the data read
by the READ_ONCE(), independently of whether r is a register or a
memory location.

This is resolved by redefining all dependencies to include
dependencies carried by memory accesses; a dependency is said to be
carried by memory accesses (in the model: carry-dep) from one load
to another load if the initial load is followed by an arbitrarily
long sequence alternating between stores and loads of the same
thread, where the data of each store depends on the previous load,
and is read by the next load.

Any dependency linking the final load in the sequence to another
access also links the initial load in the sequence to that access.

Reported-by: Viktor Vafeiadis <viktor@mpi-sws.org>
Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huawei.com>
---
 .../Documentation/explanation.txt             |  9 ++++-
 tools/memory-model/linux-kernel.bell          |  7 ++++
 .../litmus-tests/dep+plain.litmus             | 34 +++++++++++++++++++
 3 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 tools/memory-model/litmus-tests/dep+plain.litmus
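As a cross-check of the definition above, the carry-dep construction can be sketched as a small relational computation (illustrative only; the event names and closure helper are mine, not part of the model):

```python
# Sketch of carry-dep = (data ; rfi)* from the commit message:
# relations are sets of (source, destination) event pairs.

def compose(r1, r2):
    """Relational composition r1 ; r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def rt_closure(r, events):
    """Reflexive-transitive closure r* over a finite event set."""
    closure = {(e, e) for e in events} | set(r)
    while True:
        new = compose(closure, closure) - closure
        if not new:
            return closure
        closure |= new

# Events for the snippet with r turned into a memory location:
# R  = READ_ONCE(*x),   W  = plain store to r,
# R' = plain load of r, W' = WRITE_ONCE(*y, ...).
events = {"R", "W", "R'", "W'"}
data = {("R", "W"), ("R'", "W'")}   # data dependencies
rfi  = {("W", "R'")}                # W is read internally by R'

carry_dep = rt_closure(compose(data, rfi), events)
new_data = compose(carry_dep, data)  # the redefined data relation

# The carried dependency links the initial READ_ONCE() to the WRITE_ONCE():
print(("R", "W'") in new_data)
```

Here `new_data` plays the role of the redefined `data = carry-dep ; data` in the .bell file, and the pair `("R", "W'")` is the restored dependency from the snippet in the commit message.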
  

Comments

Alan Stern Dec. 1, 2022, 4:02 p.m. UTC | #1
In general this seems like a good idea.  I haven't been able to think up 
any situations where adding these new dependencies would give a wrong 
answer (except for cases where the dependencies end up being 
non-semantic, which can already happen for the regular dependencies we 
currently have).

I just have a few stylistic comments...

On Thu, Dec 01, 2022 at 01:18:08PM +0100, Jonas Oberhauser wrote:
> From: Jonas Oberhauser <jonas.oberhauser@huawei.com>
> 
> As reported by Viktor, plain accesses in LKMM are weaker than
> accesses to registers: the latter carry dependencies but the former
> do not. This is exemplified in the following snippet:
> 
>   int r = READ_ONCE(*x);
>   WRITE_ONCE(*y, r);
> 
> Here a data dependency links the READ_ONCE() to the WRITE_ONCE(),
> preserving their order, because the model treats r as a register.
> If r is turned into a memory location accessed by plain accesses,
> however, the link is broken and the order between READ_ONCE() and
> WRITE_ONCE() is no longer preserved.
> 
> This is too conservative, since any optimizations on plain
> accesses that might break dependencies are also possible on
> registers; it also contradicts the intuitive notion of "dependency"
> as the data stored by the WRITE_ONCE() does depend on the data read
> by the READ_ONCE(), independently of whether r is a register or a
> memory location.
> 
> This is resolved by redefining all dependencies to include
> dependencies carried by memory accesses; a dependency is said to be
> carried by memory accesses (in the model: carry-dep) from one load
> to another load if the initial load is followed by an arbitrarily
> long sequence alternating between stores and loads of the same
> thread, where the data of each store depends on the previous load,
> and is read by the next load.
> 
> Any dependency linking the final load in the sequence to another
> access also links the initial load in the sequence to that access.
> 
> Reported-by: Viktor Vafeiadis <viktor@mpi-sws.org>
> Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huawei.com>
> ---
>  .../Documentation/explanation.txt             |  9 ++++-
>  tools/memory-model/linux-kernel.bell          |  7 ++++
>  .../litmus-tests/dep+plain.litmus             | 34 +++++++++++++++++++
>  3 files changed, 49 insertions(+), 1 deletion(-)
>  create mode 100644 tools/memory-model/litmus-tests/dep+plain.litmus
> 
> diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt
> index ee819a402b69..41f75dff0791 100644
> --- a/tools/memory-model/Documentation/explanation.txt
> +++ b/tools/memory-model/Documentation/explanation.txt
> @@ -2544,7 +2544,7 @@ smp_store_release() -- which is basically how the Linux kernel treats
>  them.
>  
>  Although we said that plain accesses are not linked by the ppo
> -relation, they do contribute to it indirectly.  Namely, when there is
> +relation, they do contribute to it indirectly.  Firstly, when there is
>  an address dependency from a marked load R to a plain store W,
>  followed by smp_wmb() and then a marked store W', the LKMM creates a
>  ppo link from R to W'.  The reasoning behind this is perhaps a little
> @@ -2553,6 +2553,13 @@ for this source code in which W' could execute before R.  Just as with
>  pre-bounding by address dependencies, it is possible for the compiler
>  to undermine this relation if sufficient care is not taken.
>  
> +Secondly, plain accesses can carry dependencies: if a data dependency

When a colon is followed by a clause (as opposed to a list), it is 
customary to capitalize the first letter of that clause, just like we 
capitalize the first letter at the start of a sentence.

> +links a marked load R to a store W, and the store is read by a load R'
> +from the same thread, then the data loaded by R' depends on the data
> +loaded originally by R; thus if R' is linked to any access X by a

This is a run-on sentence.  Change the semicolon to a period and start a 
new sentence there.

> +dependency, R is also linked to access X by the same dependency,
> +in particular even if any of W' or R' are plain.

IMO the "in particular" isn't needed.  I'd change this to:

	even if W' or R' (or both!) is plain.

which seems a bit punchier.

> +
>  There are a few oddball fences which need special treatment:
>  smp_mb__before_atomic(), smp_mb__after_atomic(), and
>  smp_mb__after_spinlock().  The LKMM uses fence events with special
> diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
> index 5be86b1025e8..f8ec21dd6b7b 100644
> --- a/tools/memory-model/linux-kernel.bell
> +++ b/tools/memory-model/linux-kernel.bell
> @@ -82,3 +82,10 @@ flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
>  let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
>  		LKR | LKW | UL | LF | RL | RU
>  let Plain = M \ Marked
> +
> +(* Redefine dependencies to include dependencies carried
> + * through unmarked accesses *)

The style used in these files for multi-line comments is:

(*
 * blah blah blah
 * blah blah blah
 *)

On the other hand, if you change the second "dependencies" to "ones" and 
"unmarked" to "plain", maybe the whole thing will fit on one line.

> +let carry-dep = (data ; rfi)*
> +let addr = carry-dep ; addr
> +let ctrl = carry-dep ; ctrl
> +let data = carry-dep ; data
> diff --git a/tools/memory-model/litmus-tests/dep+plain.litmus b/tools/memory-model/litmus-tests/dep+plain.litmus
> new file mode 100644
> index 000000000000..c4f974b935c5
> --- /dev/null
> +++ b/tools/memory-model/litmus-tests/dep+plain.litmus
> @@ -0,0 +1,34 @@
> +C dep+plain
> +
> +(*
> + * Result: Never
> + *
> + * This litmus test demonstrates that in LKMM, plain accesses
> + * carry dependencies much like accesses to registers:

See the earlier recommendation about colons followed by clauses.

> + * the data stored to *z1 and *z2 by P0() originates from P0()'s
> + * READ_ONCE(), and therefore using that data to compute the
> + * conditional of P0()'s if-statement creates a control dependency
> + * from that READ_ONCE() to P0()'s WRITE_ONCE() which is inside
> + * the if-statement.

We don't need to tell the reader that P0's WRITE_ONCE is inside the "if" 
statement; it's pretty obvious.

> + *

Unnecessary blank line.

> + *)
> +
> +{}
> +
> +P0(int *x, int *y, int *z1, int *z2)
> +{
> +	int a = READ_ONCE(*x);
> +	*z1 = a;
> +	*z2 = *z1;
> +	if (*z2 == 1){
> +		WRITE_ONCE(*y, 1);
> +	}

Kernel programming style says that these {}'s should be omitted.

However, if you replaced the whole conditional with a simple

	WRITE_ONCE(*y, *z2);

then the litmus test would become an example of OOTA!
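Spelled out in full, that hypothetical OOTA variant of the test (not part of the posted patch) would look like:

```
C dep+plain-oota

(*
 * Hypothetical variant from this review: with the conditional replaced
 * by WRITE_ONCE(*y, *z2), no value is computed anywhere, so x = y = 1
 * in the exists clause could only arise out of thin air.
 *)

{}

P0(int *x, int *y, int *z1, int *z2)
{
	int a = READ_ONCE(*x);
	*z1 = a;
	*z2 = *z1;
	WRITE_ONCE(*y, *z2);
}

P1(int *x, int *y)
{
	int r = smp_load_acquire(y);
	smp_store_release(x, r);
}

exists (x=1 /\ y=1)
```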

> +}
> +
> +P1(int *x, int *y)
> +{
> +	int r = smp_load_acquire(y);
> +	smp_store_release(x, r);
> +}
> +
> +exists (x=1 /\ y=1)
> -- 
> 2.17.1

Alan
  
Alan Stern Dec. 1, 2022, 8:21 p.m. UTC | #2
On Thu, Dec 01, 2022 at 05:21:45PM +0000, Jonas Oberhauser wrote:
> Thanks a lot for the helpful and detailed comments!
> Three minor points before I send a new patch:
> 
> > even if W' or R' (or both!) is plain.	
> 
> The "is" sounds slightly weird to me in the sentence because the last part I read is 
> "(or both!)", so I would slightly prefer "are" here.

People are pretty casual about subject-verb number agreement these days 
(there's a growing tendency in English for people to make the verb agree 
with the last noun occurring in the subject rather than the subject as a 
whole), so that should be okay.

> > On the other hand, if you change the second "dependencies" to "ones" and "unmarked" to "plain", maybe the whole thing will fit on one line.
> 
> It fits even if I changed the second dependencies to "those" instead of "ones", i.e.,
> (* Redefine dependencies to include those carried through plain accesses *)
> 
> which I would prefer.

Fine.

> > if you replaced the whole conditional with a simple
> >	WRITE_ONCE(*y, *z2);
> > then the litmus test would become an example of OOTA!
> 
> In my opinion it is already an example of OOTA, which I would define as an
>    rfi | ctrl | addr | data | fence
> cycle.

That's not an unreasonable point of view (if you put rfe rather than 
rfi), but to me OOTA suggests something more: a value arising as if by 
magic rather than as a result of a computation.  In your version of the 
litmus test there is WRITE_ONCE(*y, 1), so it's a little understandable 
that you could end up with 1 as the final values of x and y.  But in my 
version, no values get computed anywhere, so the final value of x and y 
might just as easily be 1 or 56789 -- it literally arises "out of thin 
air".

> Let me know if you agree with these deviations from your suggestion
> and have a great time,

Yes; with those changes you can add:

Reviewed-by: Alan Stern <stern@rowland.harvard.edu>

> jonas
> 
> PS: 
> > When a colon is followed by a clause (as opposed to a list), it is customary to capitalize the first letter of that clause, just like we capitalize the first letter at the start of a sentence.
> 
> In German, we also capitalize after a colon; but my English teachers used to deduct many points throughout my adolescent life whenever I capitalized like that. I still remember some of that red ink with near perfect clarity. So I eventually really took it to heart and started pedantically not-capitalizing after every colon.
> Now the only time it ever mattered in my adult life, I find that I should do it German Style (or, as I just learned, APA & AP Style).
> I suppose life is that way sometimes.

Indeed.

Alan
  
Alan Stern Dec. 2, 2022, 8:22 p.m. UTC | #3
On Fri, Dec 02, 2022 at 05:22:57PM +0000, Jonas Oberhauser wrote:
> > but to me OOTA suggests something more: a value arising as if by 
> > magic rather than as a result of a computation.  In your version of 
> > the litmus test there is WRITE_ONCE(*y, 1), so it's a little 
> > understandable that you could end up with 1 as the final values of x 
> > and y.  But in my version, no values get computed anywhere, so the 
> > final value of x and y might just as easily be 1 or 56789 -- it 
> > literally arises "out of thin air".
> 
> Maybe one can distinguish further between OOTA values (which are 
> arbitrary, not-computed values) and more generally OOTA behaviors.
> 
> How do you feel about examples like the one below:

There's something wrong with this example.

> void *y[2];
> void *x[2] = { (void*)&y[1], (void*)&y[0] };
> 
> P0() {
>     void **t = (void**)(x[0]);

Now t holds a pointer to y[1].

>     *t = (void*)(t-1);

And now y[1] holds a pointer to y[0].

> }
> P1() {
>     void **u = (void**)(x[1]);	

Now u holds a pointer to y[0].

>     *u = (void*)(u+1);

And now y[0] holds a pointer to y[1].

> }
> 
> In this test case the locations x[0] and x[1] exist in the program and 
> are accessed, but their addresses are never (explicitly) taken or 
> stored anywhere.

Although they are dereferenced.

> Nevertheless t=&x[1] and u=&x[0] could happen in an appropriately weak 
> memory model (if the data races make you unhappy, you can add relaxed 
> atomic/marked accesses); but not arbitrary values --- if t is not 
> &x[1], it must be &y[1].

I don't see how.  The comments I added above show what values t and u 
must hold, regardless of how the program executes.  The contents of x[] 
never get changed, so there's no question about the values of t and u.

> To me, OOTA suggests simply that the computation can not happen 
> "organically" in a step-by-step way, but can only pop into existence 
> as a whole, "out of thin air" (unless one allows for very aggressive 
> speculation and rollback).

All right, this is more a matter of personal taste and interpretation.  
Is it the computation or the values that pop into existence?  You can 
think of these OOTA computations as arising in a (sort of) ordinary 
step-by-step way, provided you allow loads to read from stores that 
haven't happened yet (a very aggressive form of speculation indeed!).

> In this context I always picture the famous Baron Münchhausen, who 
pulled himself out of the mire by his own hair. (Which is an obviously 
> story because gentlemen at that time were wearing wigs, and a wig 
> could not possibly carry his weight...)

There is a comparable American expression, "pull oneself up by one's 
bootstraps", from which is derived the term "boot" for starting up a 
computer.  :-)

Alan
  
Alan Stern Dec. 3, 2022, 3:20 p.m. UTC | #4
On Sat, Dec 03, 2022 at 11:47:06AM +0000, Jonas Oberhauser wrote:
> 
> 
> -----Original Message-----
> From: Alan Stern [mailto:stern@rowland.harvard.edu] 
> Sent: Friday, December 2, 2022 9:22 PM
> 
> > > void *y[2];
> > > void *x[2] = { (void*)&y[1], (void*)&y[0] };
> > > 
> > > P0() {
> > >     void **t = (void**)(x[0]);
> 
> > Now t holds a pointer to y[1].
> 
> Unfortunately, this kind of inductive reasoning (arguing about what happens based on what happened "before") is not possible with memory models that allow OOTA; as you put it, one must allow for loads reading from stores that haven't happened yet.
> One such store (I promise!(*)) is a store to x[0] which writes &x[1]. Let's consider the alternative universe where we read from this future store, so now t holds a pointer to x[1].
> 
> > >     *t = (void*)(t-1);
> 
> > And now y[1] holds a pointer to y[0].
> 
> In our alternative universe, x[1] now holds a pointer to x[0].
> 
> 
> > > }
> > > P1() {
> > >     void **u = (void**)(x[1]);	
> 
> > Now u holds a pointer to y[0].
> 
> In our alternative universe, u holds the pointer to x[0] stored by P0().
> 
> > >     *u = (void*)(u+1);
> 
> > And now y[0] holds a pointer to y[1].
> 
> In our alternative universe, now x[0] holds a pointer to x[1]. Behold, the store I promised would happen!
> 
> > > }
> 
> > The contents of x[] never get changed, so there's no question about the values of t and u.
> 
> They might get changed, by the stores *t=... and *u=...

Okay, now I understand.  Yes, this counts as an example of values 
appearing out of thin air, even though the values are constrained.  The 
same sort of thing can happen in a less exotic form, such as:

	P0		P1
	------		-----------
	x = y;		y = x & ~1;

Then in an OOTA execution, x and y can end up holding any _even_ value.  
Yes, constraining a value to be even is less drastic than constraining 
it to be &x[0] as in your example, but it's still a constraint.  One 
can easily extend this example in various directions.
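The even-value constraint can be checked mechanically; this quick sketch (mine, for illustration) confirms that a final state x = y = v is self-consistent exactly when v is even:

```python
# OOTA execution of  P0: x = y;   P1: y = x & ~1;
# each load reads the value the other thread ends up storing,
# so a final state (x, y) = (v, v) must satisfy v == v & ~1.

def self_consistent(v):
    x = v           # P0 reads y == v and stores it to x
    y = x & ~1      # P1 reads x == v, clears the low bit, stores to y
    return x == v and y == v

print(self_consistent(56788))  # arbitrary even value: consistent
print(self_consistent(56789))  # odd value: low bit cleared, inconsistent
```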

Alan
  
Paul E. McKenney Dec. 3, 2022, 7:04 p.m. UTC | #5
On Sat, Dec 03, 2022 at 11:47:06AM +0000, Jonas Oberhauser wrote:
> 
> 
> -----Original Message-----
> From: Alan Stern [mailto:stern@rowland.harvard.edu] 
> Sent: Friday, December 2, 2022 9:22 PM
> 
> > > void *y[2];
> > > void *x[2] = { (void*)&y[1], (void*)&y[0] };
> > > 
> > > P0() {
> > >     void **t = (void**)(x[0]);
> 
> > Now t holds a pointer to y[1].
> 
> Unfortunately, this kind of inductive reasoning (arguing about what happens based on what happened "before") is not possible with memory models that allow OOTA; as you put it, one must allow for loads reading from stores that haven't happened yet.
> One such store (I promise!(*)) is a store to x[0] which writes &x[1]. Let's consider the alternative universe where we read from this future store, so now t holds a pointer to x[1].
> 
> > >     *t = (void*)(t-1);
> 
> > And now y[1] holds a pointer to y[0].
> 
> In our alternative universe, x[1] now holds a pointer to x[0].
> 
> 
> > > }
> > > P1() {
> > >     void **u = (void**)(x[1]);	
> 
> > Now u holds a pointer to y[0].
> 
> In our alternative universe, u holds the pointer to x[0] stored by P0().
> 
> > >     *u = (void*)(u+1);
> 
> > And now y[0] holds a pointer to y[1].
> 
> In our alternative universe, now x[0] holds a pointer to x[1]. Behold, the store I promised would happen!
> 
> > > }
> 
> > The contents of x[] never get changed, so there's no question about the values of t and u.
> 
> They might get changed, by the stores *t=... and *u=...
> 
> Have fun,
> Jonas
> 
> (*= because this example is provided free of charge, there is no actual promise, to the extent permitted by applicable law)

And this is another reason why I tend to err on the side of marking more
accesses rather than fewer.  You never know when some "clever" compiler
writer might add a really strange optimization...

							Thanx, Paul
  
