[RFC] Should writes to /dev/urandom immediately affect reads?

Message ID 20230920060615.GA2739@sol.localdomain
State New
Headers
Series [RFC] Should writes to /dev/urandom immediately affect reads? |

Commit Message

Eric Biggers Sept. 20, 2023, 6:06 a.m. UTC
  Hi Jason!

I'd like to revive the topic of whether writes to /dev/urandom and /dev/random
should immediately reseed the CRNG, as I'm not sure that the past discussions on
this really resolved the issue.  Currently, writes to /dev/{u,}random do *not*
reseed the CRNG, and thus the RNG's output isn't affected by preceding writes
until a reseed happens for an unrelated reason such as the periodic timer.
(Exception: if the state is CRNG_EMPTY, writes do immediately take effect.)

(Note, I'm *not* talking about whether entropy should be credited for writes to
/dev/{u,}random.  It shouldn't be credited, and it isn't.  That's a separate
topic, though it got somewhat conflated with this one in past discussions.)

As you know, the current behavior means that the "obvious" way of managing a
seed file, i.e. 'cat seed_file > /dev/urandom; cat /dev/urandom > seed_file',
may reduce the entropy of the seed file.  Your suggested approach, SeedRNG
(https://git.zx2c4.com/seedrng) works around this by using hashing.  Still, the
current behavior makes things difficult and surprising for users.  The random(4)
man page still recommends the "wrong" way.  Also, IIUC, even if you run SeedRNG,
if you then do something that uses the RNG like generating a cryptographic key,
it's possible that the seed file contributed nothing to it.  The faster
reseeding schedule in v5.18+ does make this less likely, but it can happen.

This topic was discussed in early 2022 in the thread
"random: allow writes to /dev/urandom to influence fast init"
(https://lore.kernel.org/lkml/20220322191436.110963-1-Jason@zx2c4.com/T/#u).
Ultimately, it was decided not to change the current behavior.

I'm wondering if this should be revisited, for a couple reasons.

First, two months after that discussion, commit e85c0fc1d94c ("random: do not
pretend to handle premature next security model") was merged.  I *think* that
eliminated the security reason to delay reseeding on writes to /dev/{u,}random,
as now "premature next" attacks are no longer under consideration.

Second, AFAICT, the current behavior actually hasn't always been the behavior,
contrary to previous statements/discussions which claimed that it's been the
behavior "forever".  I did a bit of research, and whether writes to
/dev/{u,}random take effect immediately actually has a bit of history:

Starting in 1995 (v1.3.30, random.c created): writes take effect immediately
Starting in 1999 (v2.3.16, secondary pool added): writes don't take effect immediately
Starting in 2007 (v2.6.22, commit 7f397dcdb78d): writes take effect immediately
Starting in 2016 (v4.8, ChaCha20 CRNG added): writes don't take effect immediately

So the current "era" of behavior started in 2016, with 1999-2007 matching
current behavior too.  However, the 1999-2007 era ended with commit 7f397dcdb78d
("random: fix seeding with zero entropy") that intentionally changed this
behavior.  Interestingly, discussion of this patch does not appear in any
mailing list archive, though a bug report "Seeding /dev/random not working"
(https://lore.kernel.org/lkml/de6d2b4f0705290453kec2b050pb4d0bebc256b87a5@mail.gmail.com/T/#u)
can be found earlier in the same day the commit was made.  Also, CVE-2007-2453
was allocated for this fix (though the CVE also covers a different RNG bug too).
This suggests that at the time, writes not taking effect immediately was
considered a security vulnerability that needed to be fixed immediately.

commit e192be9d9a30 ("random: replace non-blocking pool with a Chacha20-based
CRNG") then started the current "era" in 2016, returning to the behavior that
had been considered a vulnerability and fixed in 2007.  In contrast to the 2007
change, it's not clear that the 2016 change in behavior was intentional.  It
happened in the big commit that introduced ChaCha20 to the RNG.  (And yes, even
2016 is still a long time ago in kernel terms, but it's not *that* long ago.)

Anyway, I'm wondering if the current behavior is really what we want, especially
considering that the "premature next" security model is no longer targetted.
Should we make writes to /dev/{u,}random immediately reseed the CRNG so that
users don't need to do the dance with hashing the old and new seeds together?

Anything I'm just completely missing?  Performance concerns about reseeding?
Some attack that we're still trying to prevent by delaying the reseed?

This would be the potential change, BTW:
  

Comments

Linus Torvalds Sept. 20, 2023, 6:48 p.m. UTC | #1
On Tue, 19 Sept 2023 at 23:06, Eric Biggers <ebiggers@kernel.org> wrote:
>
> This would be the potential change, BTW:

Entirely regardless of your fundamental question, no, that's not the
potential change.

That causes a crng_reseed() even if the write fails completely and
returns -EFAULT.

So at a *minimum*, I'd expect the patch to be be something like

        memzero_explicit(block, sizeof(block));
-       return ret ? ret : -EFAULT;
+       if (!ret)
+               return -EFAULT;
+       crng_reseed(NULL);
+       return ret;

but even then I'd ask

 - wouldn't we want some kind of minimum check?

 - do we really trust writes to add any actual entropy at all and at what point?

which are admittedly likely the same question just in different guises.

Also, are there any relevant architectures where
"try_to_generate_entropy()" doesn't work? IOW, why do you even care?

                Linus
  
Eric Biggers Sept. 20, 2023, 7:32 p.m. UTC | #2
Hi Linus,

On Wed, Sep 20, 2023 at 11:48:26AM -0700, Linus Torvalds wrote:
> On Tue, 19 Sept 2023 at 23:06, Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > This would be the potential change, BTW:
> 
> Entirely regardless of your fundamental question, no, that's not the
> potential change.
> 
> That causes a crng_reseed() even if the write fails completely and
> returns -EFAULT.
> 
> So at a *minimum*, I'd expect the patch to be be something like
> 
>         memzero_explicit(block, sizeof(block));
> -       return ret ? ret : -EFAULT;
> +       if (!ret)
> +               return -EFAULT;
> +       crng_reseed(NULL);
> +       return ret;
> 
> but even then I'd ask
> 
>  - wouldn't we want some kind of minimum check?
> 
>  - do we really trust writes to add any actual entropy at all and at what point?
> 
> which are admittedly likely the same question just in different guises.

Whether to credit entropy for writes to /dev/{u,}random is an unrelated topic,
and the answer is clearly "no, we must not, and we never have" (as I mentioned
in the second paragraph of my email).  I understand the last discussion
https://lore.kernel.org/lkml/20220322191436.110963-1-Jason@zx2c4.com/T/#u
diverged into both topics, but they're not directly related.  Reseeding the CRNG
just makes it up to date with the entropy pool; nothing more than that.

Yes, obviously there's no point in reseeding if nothing actually got added, so
we could skip the reseed in that case if we want to.

> 
> Also, are there any relevant architectures where
> "try_to_generate_entropy()" doesn't work? IOW, why do you even care?
> 

There are, as shown by the fact that the full unification of /dev/urandom and
/dev/random failed yet again.  But similarly, that's unrelated.  The actual
question, which I'm attempting to start a discussion about without getting
sidetracked into questions that may seem related but actually aren't, is simply
whether writes to /dev/{u,}random should immediately affect reads.

- Eric
  
Linus Torvalds Sept. 20, 2023, 7:42 p.m. UTC | #3
On Wed, 20 Sept 2023 at 12:32, Eric Biggers <ebiggers@kernel.org> wrote:
>
> >
> > Also, are there any relevant architectures where
> > "try_to_generate_entropy()" doesn't work? IOW, why do you even care?
> >
>
> There are, as shown by the fact that the full unification of /dev/urandom and
> /dev/random failed yet again.

No, no. That only showed that such architectures exist. It didn't show
that any *relevant* architectures exist.

The ones reported on were 32-bit arm, m68k, microblaze, sparc32,
xtensa.. Maybe others.

> But similarly, that's unrelated.

They are related in the sense fo "why do you actually *care*?"

Because I don't see why any of this actually matters.

               Linus
  
Eric Biggers Sept. 20, 2023, 8:21 p.m. UTC | #4
On Wed, Sep 20, 2023 at 12:42:54PM -0700, Linus Torvalds wrote:
> On Wed, 20 Sept 2023 at 12:32, Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > >
> > > Also, are there any relevant architectures where
> > > "try_to_generate_entropy()" doesn't work? IOW, why do you even care?
> > >
> >
> > There are, as shown by the fact that the full unification of /dev/urandom and
> > /dev/random failed yet again.
> 
> No, no. That only showed that such architectures exist. It didn't show
> that any *relevant* architectures exist.
> 
> The ones reported on were 32-bit arm, m68k, microblaze, sparc32,
> xtensa.. Maybe others.
> 
> > But similarly, that's unrelated.
> 
> They are related in the sense fo "why do you actually *care*?"
> 
> Because I don't see why any of this actually matters.
> 

It seems that what you're claiming (in addition to the RNG always being
initialized quickly on platforms that are "relevant", whatever that means) is
that once the RNG is "initialized", there's no need to reseed it anymore.  This
was covered extensively in previous discussions.  It's true in a theoretical
sense, but in practice the "initialized" state is just an approximation.  In
practice, the RNG might have collected either more *or less* entropy than the
needed 256 bits.  *Probably* more, since many of the entropy sources are
estimated conservatively, but it never knows for sure.  That's why the RNG still
reseeds itself periodically.  The question is, given that, shouldn't the RNG
also reseed right away when userspace explicitly adds something to it.  After
all, regardless of which of the mythical Two Big States (initialized or
uninitialized) the kernel considers the RNG to be in, userspace almost certianly
wants the data it wrote to /dev/{u,}random to actually be used.

It's true that the earlier in the boot it is, the more important reseeding is in
general, and the RNG's reseeding schedule already reflects that by doing faster
reseeds for the first 5 minutes of uptime.  We *could* do something similar and
only do the immediate reseed for the first 5 minutes of uptime.  I expect it's
not worth the complexity vs. unconditionally doing it, but it's an option.

BTW, previously you were supportive of using new entropy immediately:
https://lore.kernel.org/linux-crypto/CAHk-=wiq3bKDdt7noWOaMnDL-yYfFHb1CEsNkk8huq4O7ByetA@mail.gmail.com
So it seems that you've changed your opinion?

- Eric
  
Linus Torvalds Sept. 20, 2023, 8:32 p.m. UTC | #5
On Wed, 20 Sept 2023 at 13:21, Eric Biggers <ebiggers@kernel.org> wrote:
>
> It seems that what you're claiming (in addition to the RNG always being
> initialized quickly on platforms that are "relevant", whatever that means) is
> that once the RNG is "initialized", there's no need to reseed it anymore.

No. You are literally putting words in my mouth that I at no point
even implied. You're making up an argument.

I *LITERALLY* am asking a very simple question: WHO DO YOU EVEN CARE
ABOUT THIS "IMMEDIATE" EFFECT.

Give me a real reason. Give me *any* reason.

Don't try to turn this into some other discussion. I'm asking WHY DOES
ANY OF THIS MATTER?

The immediacy has changed several times, as you yourself lined up. And
as far as I can tell, none of this matter in the least.

> The question is, given that, shouldn't the RNG also reseed right
> away when userspace explicitly adds something to it

I don't see that there is any "given" at all.

We do re-seed regularly. I'm not arguing against that.

I'm literally arguing against applying random changes without giving
any actual reason for them.

Which is why I'm asking "why do you care"? Give em a *reason*. Why
would a user space write matter at all?

It was why I also asked about entropy. Because *if* you argue that the
user-space write contains entropy, then that would be a reason.

You didn't.

You argue that the current behavior hasn't been the universal behavior. I agree.

But considering that we've switched behaviors apparently at least
three times, and at no point did it make any difference, my argument
is really that without a *REASON*, why would we switch behavior *four*
times?

Is it just "four changes is better than three"?

             Linus
  
Linus Torvalds Sept. 20, 2023, 8:41 p.m. UTC | #6
On Wed, 20 Sept 2023 at 13:32, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> It was why I also asked about entropy. Because *if* you argue that the
> user-space write contains entropy, then that would be a reason.

To clarify - the jitter entropy question was related to that same
basic issue: if this was meant to be a way to mitigate the lack of
jitter entropy on some platform you care about, that would again
possibly be a reason to care.

Considering that we apparently haven't cared for the last 7 years, I'm
still a bit surprised, but whatever.

What I *don't* want is just more voodoo discussions about /dev/*random
behavior that doesn't have a technical reason for it.

                Linus
  
Eric Biggers Sept. 20, 2023, 8:45 p.m. UTC | #7
On Wed, Sep 20, 2023 at 01:32:29PM -0700, Linus Torvalds wrote:
> On Wed, 20 Sept 2023 at 13:21, Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > It seems that what you're claiming (in addition to the RNG always being
> > initialized quickly on platforms that are "relevant", whatever that means) is
> > that once the RNG is "initialized", there's no need to reseed it anymore.
> 
> No. You are literally putting words in my mouth that I at no point
> even implied. You're making up an argument.
> 
> I *LITERALLY* am asking a very simple question: WHO DO YOU EVEN CARE
> ABOUT THIS "IMMEDIATE" EFFECT.
> 
> Give me a real reason. Give me *any* reason.
> 
> Don't try to turn this into some other discussion. I'm asking WHY DOES
> ANY OF THIS MATTER?
> 
> The immediacy has changed several times, as you yourself lined up. And
> as far as I can tell, none of this matter in the least.
> 
> > The question is, given that, shouldn't the RNG also reseed right
> > away when userspace explicitly adds something to it
> 
> I don't see that there is any "given" at all.
> 
> We do re-seed regularly. I'm not arguing against that.
> 
> I'm literally arguing against applying random changes without giving
> any actual reason for them.
> 
> Which is why I'm asking "why do you care"? Give em a *reason*. Why
> would a user space write matter at all?
> 
> It was why I also asked about entropy. Because *if* you argue that the
> user-space write contains entropy, then that would be a reason.
> 
> You didn't.
> 
> You argue that the current behavior hasn't been the universal behavior. I agree.
> 
> But considering that we've switched behaviors apparently at least
> three times, and at no point did it make any difference, my argument
> is really that without a *REASON*, why would we switch behavior *four*
> times?
> 
> Is it just "four changes is better than three"?

See my first email where I explained the problems with the current behavior.
Especially the third paragraph.

I'll just wait until Jason has a chance to reply.  This discussion clearly isn't
achieving anything with just us two.

- Eric
  
Linus Torvalds Sept. 20, 2023, 8:48 p.m. UTC | #8
On Wed, 20 Sept 2023 at 13:45, Eric Biggers <ebiggers@kernel.org> wrote:
>
> See my first email where I explained the problems with the current behavior.
> Especially the third paragraph.

I really don't think that's the obvious way at all. Anybody who treats
a seed file that way just doesn't care, and whipped up a (bad) shell
script randomly.

Whatever.

             Linus
  
Theodore Ts'o Sept. 21, 2023, 1:30 p.m. UTC | #9
On Wed, Sep 20, 2023 at 01:48:55PM -0700, Linus Torvalds wrote:
> On Wed, 20 Sept 2023 at 13:45, Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > See my first email where I explained the problems with the current behavior.
> > Especially the third paragraph.
> 
> I really don't think that's the obvious way at all. Anybody who treats
> a seed file that way just doesn't care, and whipped up a (bad) shell
> script randomly.

The shell script (and documentation in the kernel man pages suggesting
the shell script) is basically historical, and obsolete.  It was
needed back when we weren't as aggressively seeding the RNG at boot
time, before we unified /dev/urandom and /dev/random.  These days, I
really don't think it matters all that much.

The main threat we've historically been concerned is badly designed
IOT devices (remember, the 'S' in IOT stands for security) which
generates a long-term cryptographic key within milliseconds of the
initial power-on, which led to such hillarious results as all HP
Printers publically on the Internet having one of 256 possible private
keys.  In those sorts of situations, there *was* no seed file, and
even if there were, it would be identical across all of the IOT's
initially imaged root file system.

I do have one slight concern about unconditionally reseeding whenever
there is a write to /dev/[u]random, whih is in the purely hypothetical
scenario mostly of interest to academics writing crypto papers, where
we assume the attacker has stolen the full internal state of the RNG,
if the attacker is constantly writing a small amount of known data to
/dev/random, and monitoring its output, it would be disabling the
"catastrophic reseed" part of the design, and so it might make it
easier for the attacker to maintain accurate knowledge of the internal
state of the RNG over long period of time.  So a perfectionist would
probably put a timeout where writing to /dev/urandom would result in a
reseed every N minutes or some such.

But honestly?  I'm not convinced it's worth it; devices/systems where
this matter are probably not getting security updates *anyway*, so the
much simpler way the NSA/KGB/MSS would attack the device is paying a
few thousand dollars for a zero-day, and breaking kernel security
that way.

Cheers,

						- Ted
  
David Laight Sept. 21, 2023, 9:57 p.m. UTC | #10
From: Linus Torvalds
> Sent: 20 September 2023 21:41
> 
> On Wed, 20 Sept 2023 at 13:32, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > It was why I also asked about entropy. Because *if* you argue that the
> > user-space write contains entropy, then that would be a reason.
> 
> To clarify - the jitter entropy question was related to that same
> basic issue: if this was meant to be a way to mitigate the lack of
> jitter entropy on some platform you care about, that would again
> possibly be a reason to care.
> 
> Considering that we apparently haven't cared for the last 7 years, I'm
> still a bit surprised, but whatever.
> 
> What I *don't* want is just more voodoo discussions about /dev/*random
> behavior that doesn't have a technical reason for it.

This might all be related to an ongoing repeat of the 'how to initialise
/dev/urandom' on the busybox list.

Such systems are much more likely to be running completely jitter-free
cpu that boot from embedded serial flash with absolutely constant
access times.
The only way you are going to get any entropy early on is to have
saved it from the previous boot.
I don't think it makes any real sense so save it too early - you just
end up with an encoded 'count of the number of boots'.

At the moment it is pretty hard to add the saved entropy.
And you do want it to be used immediately - because what the
kernel has it likely to be pretty limited.

Now, once the startup scripts have run you might decide that an
immediate rehash isn't needed.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
  

Patch

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 3cb37760dfec2..97ea2938c715c 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1420,6 +1420,7 @@  static ssize_t write_pool_user(struct iov_iter *iter)
 			cond_resched();
 		}
 	}
+	crng_reseed(NULL);
 
 	memzero_explicit(block, sizeof(block));
 	return ret ? ret : -EFAULT;