[RFC,0/4] random: a simple vDSO mechanism for reseeding userspace CSPRNGs

Message ID cover.1673539719.git.ydroneaud@opteya.com

Message

Yann Droneaud Jan. 12, 2023, 5:02 p.m. UTC
  Hi,

Here's my humble hack at improving the kernel to allow a faster, secure
userspace arc4random() implementation, by letting userspace buffer
getrandom()-generated entropy and discard it when the kernel's own
CSPRNG is reseeded.

It's largely built upon the vDSO work of Jason A. Donenfeld, from his
latest patchset "[PATCH v14 0/7] implement getrandom() in vDSO" [1],
but it's made simpler by providing only one of the tools userspace is
missing to properly buffer the output of getrandom().

Using MADV_WIPEONFORK and mlock(), userspace can reasonably offer forward
secrecy*, until the kernel provides something like VM_DROPPABLE[2],
which would allow the buffer memory to never, ever be written to disk
before it's used, to be inherited across fork(), and to not be limited
by RLIMIT_MEMLOCK.

 * provided userspace can mlock() the memory and calls mlock() on the
   buffer again after fork(), as memory locks are not inherited across
   fork().
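
As a rough illustration of the buffer setup described above, here is a
minimal C sketch. The helper names pool_alloc() and
pool_relock_after_fork() are hypothetical and are not part of the
patches; only mmap(), madvise(MADV_WIPEONFORK) and mlock() are real
interfaces. It assumes Linux 4.14 or later for MADV_WIPEONFORK.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18 /* Linux value, for older libc headers */
#endif

/*
 * Hypothetical helper: map an anonymous buffer for buffered getrandom()
 * output. The mapping is zeroed in the child after fork(). *locked
 * reports whether mlock() succeeded; it can fail under RLIMIT_MEMLOCK,
 * in which case the pages may still reach swap.
 */
void *pool_alloc(size_t len, int *locked)
{
    void *p;

    assert(locked != NULL);
    *locked = 0;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Without wipe-on-fork the child would reuse the parent's bytes. */
    if (madvise(p, len, MADV_WIPEONFORK) != 0) {
        munmap(p, len);
        return NULL;
    }

    *locked = (mlock(p, len) == 0);
    return p;
}

/* Memory locks are not inherited across fork(): call this in the child. */
int pool_relock_after_fork(void *p, size_t len)
{
    return mlock(p, len);
}
```

Whether to refuse buffering when mlock() fails is a policy decision the
sketch leaves to the caller.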

As it's a hack, it's far from perfect. The main drawback I see is the
case where fresh entropy has to be discarded because the kernel's CSPRNG
generation was updated as a result of the very getrandom() call that
generated that entropy. The workaround is to limit the amount of fresh
entropy fetched when a change of the kernel's CSPRNG generation is
detected, and to increase the amount of data retrieved with getrandom()
when the generation doesn't change between calls.
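
One possible shape for that workaround, as a C sketch: shrink the refill
size to a minimum when a generation change is seen, and double it (up to
one page) while the generation stays stable. The bounds POOL_MIN and
POOL_MAX and the function next_refill_size() are assumptions made for
illustration; the test program in the series may use a different policy.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_MIN 16u   /* hypothetical smallest refill, in bytes */
#define POOL_MAX 4096u /* hypothetical largest refill: one page */

/*
 * Pick the size of the next getrandom() refill.
 * generation_changed is non-zero when the kernel's CSPRNG generation
 * changed since the previous refill.
 */
size_t next_refill_size(size_t current, int generation_changed)
{
    assert(POOL_MIN > 0 && POOL_MIN <= POOL_MAX);

    /* A reseed just happened: fetch little, it may be discarded soon. */
    if (generation_changed || current < POOL_MIN)
        return POOL_MIN;

    /* Stable generation: grow geometrically, capped at POOL_MAX. */
    if (current >= POOL_MAX / 2)
        return POOL_MAX;

    return current * 2;
}
```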

Performance-wise, the improvements are there, as one can check with the
test program provided:

    getrandom(,,GRND_TIMESTAMP) test
    getrandom() support GRND_TIMESTAMP
    found getrandom() in vDSO at 0x7ffc3efccc60
    == direct syscall getrandom(), 16777216 u32, 2.866324020 s,   5.853 M u32/s, 170.846 ns/u32
    == direct vDSO getrandom(),    16777216 u32, 2.883473280 s,   5.818 M u32/s, 171.868 ns/u32
    == pooled syscall getrandom(), 16777216 u32, 1.152421219 s,  14.558 M u32/s,  68.690 ns/u32, (0 bytes discarded)
    == pooled vDSO getrandom(),    16777216 u32, 0.162477863 s, 103.258 M u32/s,   9.684 ns/u32, (0 bytes discarded)

With the requirement to mlock() the memory page(s) used to buffer
getrandom() output, I'm not sure userspace could afford to allocate
4 KiB per thread before hitting RLIMIT_MEMLOCK (or worse, the OOM
killer). Thus, some form of sharing between threads would be needed,
which would require locking, reducing the performance shown above.

Also I haven't studied the security impact of making the kernel base
CSPRNG seed generation available to userspace. It can be made more
opaque if needed.

Regards.

[1] https://lore.kernel.org/all/20230101162910.710293-1-Jason@zx2c4.com/
[2] https://lore.kernel.org/all/20230101162910.710293-3-Jason@zx2c4.com/

Jason A. Donenfeld (2):
  random: introduce generic vDSO getrandom(,, GRND_TIMESTAMP) fast path
  x86: vdso: Wire up getrandom() vDSO implementation.

Yann Droneaud (2):
  random: introduce getrandom() GRND_TIMESTAMP
  testing: add a getrandom() GRND_TIMESTAMP vDSO demonstration/benchmark

 MAINTAINERS                                   |   1 +
 arch/x86/Kconfig                              |   1 +
 arch/x86/entry/vdso/Makefile                  |   3 +-
 arch/x86/entry/vdso/vdso.lds.S                |   2 +
 arch/x86/entry/vdso/vgetrandom.c              |  17 +
 arch/x86/include/asm/vdso/getrandom.h         |  42 +++
 arch/x86/include/asm/vdso/vsyscall.h          |   2 +
 arch/x86/include/asm/vvar.h                   |  16 +
 drivers/char/random.c                         |  52 ++-
 include/linux/random.h                        |  31 ++
 include/uapi/linux/random.h                   |   2 +
 include/vdso/datapage.h                       |   9 +
 lib/vdso/Kconfig                              |   5 +
 lib/vdso/getrandom.c                          |  51 +++
 tools/testing/crypto/getrandom/Makefile       |   4 +
 .../testing/crypto/getrandom/test-getrandom.c | 307 ++++++++++++++++++
 16 files changed, 543 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/entry/vdso/vgetrandom.c
 create mode 100644 arch/x86/include/asm/vdso/getrandom.h
 create mode 100644 lib/vdso/getrandom.c
 create mode 100644 tools/testing/crypto/getrandom/Makefile
 create mode 100644 tools/testing/crypto/getrandom/test-getrandom.c
  

Comments

Jason A. Donenfeld Jan. 12, 2023, 5:07 p.m. UTC | #1
Sorry Yann, but I'm not interested in this approach, and I don't think
reviewing the details of it are a good allocation of time. I don't
want to lock the kernel into having specific reseeding semantics that
are a contract with userspace, which is what this approach does.
Please just let me iterate on my original patchset for a little bit,
without adding more junk to the already overly large conversation.
  
Yann Droneaud Jan. 12, 2023, 7:55 p.m. UTC | #2
Hi

On January 12, 2023 at 18:07, "Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
 
> Sorry Yann, but I'm not interested in this approach, and I don't think
> reviewing the details of it are a good allocation of time. I don't
> want to lock the kernel into having specific reseeding semantics that
> are a contract with userspace, which is what this approach does.

This patch adds a means for the kernel to tell userspace: between the
last time you called us with getrandom(timestamp,, GRND_TIMESTAMP) and
now, something happened that triggered an update to the opaque cookie
given to getrandom(timestamp,, GRND_TIMESTAMP). When such an update
happens, userspace is advised to discard buffered random data and retry.

The meaning of the timestamp cookie is up to the kernel, and can be
changed anytime. Userspace is not expected to read the content of this
blob. Userspace only acts on the length returned by getrandom(,, GRND_TIMESTAMP):
 -1 : not supported
  0 : cookie not updated, no need to discard buffered data
 >0 : cookie updated, userspace should discard buffered data
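
The three cases above can be sketched in C as follows. GRND_TIMESTAMP
exists only in this RFC, so the sketch takes the return value as a
parameter instead of making the call. The enum, the function
pool_apply() and the choice to drop the buffer when the flag is
unsupported are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

enum pool_action {
    POOL_UNSUPPORTED, /* ret < 0: kernel lacks GRND_TIMESTAMP */
    POOL_KEEP,        /* ret == 0: cookie unchanged */
    POOL_DISCARD      /* ret > 0: cookie updated */
};

/*
 * Act on the length returned by getrandom(,, GRND_TIMESTAMP).
 * *buffered is the number of random bytes currently buffered; it is
 * reset to 0 whenever the buffer must not be used any longer.
 */
enum pool_action pool_apply(ssize_t ret, size_t *buffered)
{
    assert(buffered != NULL);

    if (ret < 0) {
        /* Assumption: without the flag, buffering cannot be trusted. */
        *buffered = 0;
        return POOL_UNSUPPORTED;
    }
    if (ret == 0)
        return POOL_KEEP;

    *buffered = 0;
    return POOL_DISCARD;
}
```

A real implementation would also wipe the discarded bytes, not just
reset the counter.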

For the cookie, I've used a single u64, but two u64 could be a better start,
providing room for implementing improved behavior in future kernel versions.

> Please just let me iterate on my original patchset for a little bit,
> without adding more junk to the already overly large conversation.

I like the simplicity of my so-called "junk". It's streamlined, doesn't
require a new syscall, and doesn't require a new copy of the ChaCha20 code.

I'm sorry it doesn't fit your expectations.

Regards.
  
H. Peter Anvin Jan. 14, 2023, 2:22 a.m. UTC | #3
On 1/12/23 11:55, Yann Droneaud wrote:
> Hi
> 
> On January 12, 2023 at 18:07, "Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
>   
>> Sorry Yann, but I'm not interested in this approach, and I don't think
>> reviewing the details of it are a good allocation of time. I don't
>> want to lock the kernel into having specific reseeding semantics that
>> are a contract with userspace, which is what this approach does.
> 
> This patch adds a mean for the kernel to tell userspace: between the
> last time you call us with getrandom(timestamp,, GRND_TIMESTAMP),
> something happened that trigger an update to the opaque cookie given
> to getrandom(timestamp, GRND_TIMESTAMP). When such update happen,
> userspace is advised to discard buffered random data and retry.
> 
> The meaning of the timestamp cookie is up to the kernel, and can be
> changed anytime. Userspace is not expected to read the content of this
> blob. Userspace only acts on the length returned by getrandom(,, GRND_TIMESTAMP):
>   -1 : not supported
>    0 : cookie not updated, no need to discard buffered data
>   >0 : cookie updated, userspace should discard buffered data
> 
> For the cookie, I've used a single u64, but two u64 could be a better start,
> providing room for implementing improved behavior in future kernel versions.
> 
>> Please just let me iterate on my original patchset for a little bit,
>> without adding more junk to the already overly large conversation.
> 
> I like the simplicity of my so called "junk". It's streamlined, doesn't
> require a new syscall, doesn't require a new copy of ChaCha20 code.
> 
> I'm sorry it doesn't fit your expectations.
> 

Why would anything more than a 64-bit counter ever be necessary? It only 
needs to be incremented.

Let user space manage keeping track of the cookie matching its own 
buffers. You do NOT want this to be stateful, because that's just 
begging for multiple libraries to step on each other.

Export the cookie from the vdso and voilà, a very cheap check around any 
user space randomness buffer will work:

	static clone_cookie_t last_cookie;
	clone_cookie_t this_cookie;

	this_cookie = get_clone_cookie();
	do {
		while (this_cookie != last_cookie) {
			last_cookie = this_cookie;
			reinit_randomness();
			this_cookie = get_clone_cookie();
		}

		extract_randomness_from_buffer();
		this_cookie = get_clone_cookie();
	} while (this_cookie != last_cookie);

	last_cookie = this_cookie;

	-hpa
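
For readers who want to experiment with the loop above, here is a
self-contained C sketch of the same stateless check. The kernel does not
export such a cookie, so fake_cookie stands in for the vDSO value; the
64-byte pool, refill_count and random_u32() are likewise hypothetical.
Only getrandom() is a real interface. Unlike the original, the sketch
records last_cookie only after a successful refill, so a failed
getrandom() cannot leave stale bytes marked as valid.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/random.h>
#include <sys/types.h>

typedef uint64_t clone_cookie_t;

clone_cookie_t fake_cookie = 1; /* stand-in for the vDSO-exported cookie */
unsigned refill_count;          /* number of refills, for observation */

static uint8_t pool[64];
static size_t pool_used = sizeof(pool); /* starts empty: first call refills */

clone_cookie_t get_clone_cookie(void)
{
    return fake_cookie;
}

int reinit_randomness(void)
{
    if (getrandom(pool, sizeof(pool), 0) != (ssize_t)sizeof(pool))
        return -1;
    pool_used = 0;
    refill_count++;
    return 0;
}

/* Store 4 buffered random bytes in *out; returns 0, or -1 on failure. */
int random_u32(uint32_t *out)
{
    static clone_cookie_t last_cookie;
    clone_cookie_t this_cookie = get_clone_cookie();

    assert(out != NULL);
    do {
        /* Refill when the cookie moved or the pool ran dry. */
        while (this_cookie != last_cookie ||
               pool_used + sizeof(*out) > sizeof(pool)) {
            if (reinit_randomness() != 0)
                return -1;
            last_cookie = this_cookie; /* cookie sampled before the refill */
            this_cookie = get_clone_cookie();
        }

        memcpy(out, pool + pool_used, sizeof(*out));
        memset(pool + pool_used, 0, sizeof(*out)); /* wipe consumed bytes */
        pool_used += sizeof(*out);

        /* Re-check: a change during extraction invalidates *out. */
        this_cookie = get_clone_cookie();
    } while (this_cookie != last_cookie);

    return 0;
}
```

Bumping fake_cookie between two calls forces exactly one extra refill,
which is the behaviour the cookie check is meant to provide.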
  
Andy Lutomirski Jan. 16, 2023, 7:49 p.m. UTC | #4
> On Jan 13, 2023, at 7:16 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> 
> On 1/12/23 11:55, Yann Droneaud wrote:
>> Hi
>> On January 12, 2023 at 18:07, "Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
>>  
>>> Sorry Yann, but I'm not interested in this approach, and I don't think
>>> reviewing the details of it are a good allocation of time. I don't
>>> want to lock the kernel into having specific reseeding semantics that
>>> are a contract with userspace, which is what this approach does.
>> This patch adds a mean for the kernel to tell userspace: between the
>> last time you call us with getrandom(timestamp,, GRND_TIMESTAMP),
>> something happened that trigger an update to the opaque cookie given
>> to getrandom(timestamp, GRND_TIMESTAMP). When such update happen,
>> userspace is advised to discard buffered random data and retry.
>> The meaning of the timestamp cookie is up to the kernel, and can be
>> changed anytime. Userspace is not expected to read the content of this
>> blob. Userspace only acts on the length returned by getrandom(,, GRND_TIMESTAMP):
>>  -1 : not supported
>>   0 : cookie not updated, no need to discard buffered data
>>  >0 : cookie updated, userspace should discard buffered data
>> For the cookie, I've used a single u64, but two u64 could be a better start,
>> providing room for implementing improved behavior in future kernel versions.
>>> Please just let me iterate on my original patchset for a little bit,
>>> without adding more junk to the already overly large conversation.
>> I like the simplicity of my so called "junk". It's streamlined, doesn't
>> require a new syscall, doesn't require a new copy of ChaCha20 code.
>> I'm sorry it doesn't fit your expectations.
> 
> Why would anything more than a 64-bit counter be ever necessary? It only needs to be incremented.

This is completely broken with CRIU or, for that matter, with VM forking.

> 
> Let user space manage keeping track of the cookie matching its own buffers. You do NOT want this to be stateful, because that's just begging for multiple libraries to step on each other.
> 
> Export the cookie from the vdso and voilà, a very cheap check around any user space randomness buffer will work:
> 
>    static clone_cookie_t last_cookie;
>    clone_cookie_t this_cookie;
> 
>    this_cookie = get_clone_cookie();
>    do {
>        while (this_cookie != last_cookie) {
>            last_cookie = this_cookie;
>            reinit_randomness();
>            this_cookie = get_clone_cookie();
>        }
> 
>        extract_randomness_from_buffer();
>        this_cookie = get_clone_cookie();
>    } while (this_cookie != last_cookie);
> 
>    last_cookie = this_cookie;
> 
>    -hpa
  
H. Peter Anvin Jan. 17, 2023, 8:35 a.m. UTC | #5
On January 16, 2023 11:49:42 AM PST, Andy Lutomirski <luto@amacapital.net> wrote:
>
>
>> On Jan 13, 2023, at 7:16 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> 
>> On 1/12/23 11:55, Yann Droneaud wrote:
>>> Hi
>>> On January 12, 2023 at 18:07, "Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
>>>  
>>>> Sorry Yann, but I'm not interested in this approach, and I don't think
>>>> reviewing the details of it are a good allocation of time. I don't
>>>> want to lock the kernel into having specific reseeding semantics that
>>>> are a contract with userspace, which is what this approach does.
>>> This patch adds a mean for the kernel to tell userspace: between the
>>> last time you call us with getrandom(timestamp,, GRND_TIMESTAMP),
>>> something happened that trigger an update to the opaque cookie given
>>> to getrandom(timestamp, GRND_TIMESTAMP). When such update happen,
>>> userspace is advised to discard buffered random data and retry.
>>> The meaning of the timestamp cookie is up to the kernel, and can be
>>> changed anytime. Userspace is not expected to read the content of this
>>> blob. Userspace only acts on the length returned by getrandom(,, GRND_TIMESTAMP):
>>>  -1 : not supported
>>>   0 : cookie not updated, no need to discard buffered data
>>>  >0 : cookie updated, userspace should discard buffered data
>>> For the cookie, I've used a single u64, but two u64 could be a better start,
>>> providing room for implementing improved behavior in future kernel versions.
>>>> Please just let me iterate on my original patchset for a little bit,
>>>> without adding more junk to the already overly large conversation.
>>> I like the simplicity of my so called "junk". It's streamlined, doesn't
>>> require a new syscall, doesn't require a new copy of ChaCha20 code.
>>> I'm sorry it doesn't fit your expectations.
>> 
>> Why would anything more than a 64-bit counter be ever necessary? It only needs to be incremented.
>
>This is completely broken with CRIU or, for that matter, with VM forking.
>
>> 
>> Let user space manage keeping track of the cookie matching its own buffers. You do NOT want this to be stateful, because that's just begging for multiple libraries to step on each other.
>> 
>> Export the cookie from the vdso and voilà, a very cheap check around any user space randomness buffer will work:
>> 
>>    static clone_cookie_t last_cookie;
>>    clone_cookie_t this_cookie;
>> 
>>    this_cookie = get_clone_cookie();
>>    do {
>>        while (this_cookie != last_cookie) {
>>            last_cookie = this_cookie;
>>            reinit_randomness();
>>            this_cookie = get_clone_cookie();
>>        }
>> 
>>        extract_randomness_from_buffer();
>>        this_cookie = get_clone_cookie();
>>    } while (this_cookie != last_cookie);
>> 
>>    last_cookie = this_cookie;
>> 
>>    -hpa
>

For those you would randomize the counter.
  
Yann Droneaud Jan. 19, 2023, 11:19 a.m. UTC | #6
Hi,

On January 16, 2023 at 20:50, "Andy Lutomirski" <luto@amacapital.net> wrote:
> > On Jan 13, 2023, at 7:16 PM, H. Peter Anvin <hpa@zytor.com> wrote:  
> >  On 1/12/23 11:55, Yann Droneaud wrote:
> > >  On January 12, 2023 at 18:07, "Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
> > > 
> > 
> >  Sorry Yann, but I'm not interested in this approach, and I don't think
> >  reviewing the details of it are a good allocation of time. I don't
> >  want to lock the kernel into having specific reseeding semantics that
> >  are a contract with userspace, which is what this approach does.
> > 
> > > 
> > > This patch adds a mean for the kernel to tell userspace: between the
> > >  last time you call us with getrandom(timestamp,, GRND_TIMESTAMP),
> > >  something happened that trigger an update to the opaque cookie given
> > >  to getrandom(timestamp, GRND_TIMESTAMP). When such update happen,
> > >  userspace is advised to discard buffered random data and retry.
> > >  The meaning of the timestamp cookie is up to the kernel, and can be
> > >  changed anytime. Userspace is not expected to read the content of this
> > >  blob. Userspace only acts on the length returned by getrandom(,, GRND_TIMESTAMP):
> > >  -1 : not supported
> > >  0 : cookie not updated, no need to discard buffered data
> > >  >0 : cookie updated, userspace should discard buffered data
> > >  For the cookie, I've used a single u64, but two u64 could be a better start,
> > >  providing room for implementing improved behavior in future kernel versions.
> > > 
> > 
> >  Please just let me iterate on my original patchset for a little bit,
> >  without adding more junk to the already overly large conversation.
> > 
> > > 
> > > I like the simplicity of my so called "junk". It's streamlined, doesn't
> > >  require a new syscall, doesn't require a new copy of ChaCha20 code.
> > >  I'm sorry it doesn't fit your expectations.
> > > 
> > 
> >  
> >  Why would anything more than a 64-bit counter be ever necessary? It only needs to be incremented.
> > 
> 
> This is completely broken with CRIU or, for that matter, with VM forking.
>

Which raises the question of CRIU support in Jason's vDSO proposal.

AFAIK CRIU handles the vDSO[1] by interposing symbols so that, on restore, the
process calls the interposed functions, which resolve to the new vDSO's functions.

vgetrandom_alloc() would have been called before the checkpoint, allocating one
opaque state of size x. After the restore, the vDSO's getrandom() would be given
this opaque state while expecting it to have size y. As the content of the opaque
state should have been cleared per MADV_WIPEONFORK, there's nothing in the state
that could help the vDSO's getrandom() achieve backward compatibility.

I think backward compatibility can be achieved by adding an opaque state size
argument to the vDSO's getrandom().

What do you think, Jason?

[1] https://criu.org/Vdso

Regards.