From patchwork Mon Nov 21 15:29:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Jason A. Donenfeld"
X-Patchwork-Id: 23919
From: "Jason A. Donenfeld"
To: linux-kernel@vger.kernel.org, patches@lists.linux.dev, tglx@linutronix.de
Cc: "Jason A. Donenfeld", linux-crypto@vger.kernel.org, x86@kernel.org,
 Greg Kroah-Hartman, Adhemerval Zanella Netto, Carlos O'Donell
Subject: [PATCH v6 1/3] random: add vgetrandom_alloc() syscall
Date: Mon, 21 Nov 2022 16:29:07 +0100
Message-Id: <20221121152909.3414096-2-Jason@zx2c4.com>
In-Reply-To: <20221121152909.3414096-1-Jason@zx2c4.com>
References: <20221121152909.3414096-1-Jason@zx2c4.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The vDSO getrandom() works over an opaque per-thread state of an
unexported size, which must be marked as MADV_WIPEONFORK and be
mlock()'d for proper operation. Over time, the nuances of these
allocations may change or grow or even differ based on architectural
features.

The syscall has the signature:

  void *vgetrandom_alloc([inout] size_t *num, [out] size_t *size_per_each, unsigned int flags);

This takes the desired number of opaque states in `num`, and returns a
pointer to an array of opaque states, the number actually allocated back
in `num`, and the size in bytes of each one in `size_per_each`, enabling
a libc to slice up the returned array into a state per each thread. (The
`flags` argument is always zero for now.) Libc is expected to allocate a
chunk of these on first use, and then dole them out to threads as
they're created, allocating more when needed. The following commit shows
an example of this, being used in conjunction with the getrandom() vDSO
function.

We very intentionally do *not* leave state allocation for vDSO
getrandom() up to userspace itself, but rather provide this new syscall
for such allocations. vDSO getrandom() must not store its state in just
any old memory address, but rather just ones that the kernel specially
allocates for it, leaving the particularities of those allocations up to
the kernel.

Signed-off-by: Jason A. Donenfeld
---
 MAINTAINERS                             |  1 +
 arch/x86/Kconfig                        |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl  |  1 +
 arch/x86/include/asm/unistd.h           |  1 +
 drivers/char/random.c                   | 59 +++++++++++++++++++++++++
 include/uapi/asm-generic/unistd.h       |  7 ++-
 kernel/sys_ni.c                         |  3 ++
 lib/vdso/getrandom.h                    | 23 ++++++++++
 scripts/checksyscalls.sh                |  4 ++
 tools/include/uapi/asm-generic/unistd.h |  7 ++-
 10 files changed, 105 insertions(+), 2 deletions(-)
 create mode 100644 lib/vdso/getrandom.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 256f03904987..843dd6a49538 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17287,6 +17287,7 @@ T:	git https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git
 S:	Maintained
 F:	drivers/char/random.c
 F:	drivers/virt/vmgenid.c
+F:	lib/vdso/getrandom.h
 
 RAPIDIO SUBSYSTEM
 M:	Matt Porter
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 67745ceab0db..331e21ba961a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -59,6 +59,7 @@ config X86
 	#
 	select ACPI_LEGACY_TABLES_LOOKUP	if ACPI
 	select ACPI_SYSTEM_POWER_STATES_SUPPORT	if ACPI
+	select ADVISE_SYSCALLS			if X86_64
 	select ARCH_32BIT_OFF_T			if X86_32
 	select ARCH_CLOCKSOURCE_INIT
 	select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index c84d12608cd2..0186f173f0e8 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -372,6 +372,7 @@
 448	common	process_mrelease	sys_process_mrelease
 449	common	futex_waitv		sys_futex_waitv
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
+451	common	vgetrandom_alloc	sys_vgetrandom_alloc
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/x86/include/asm/unistd.h b/arch/x86/include/asm/unistd.h
index 761173ccc33c..1bf509eaeff1 100644
--- a/arch/x86/include/asm/unistd.h
+++ b/arch/x86/include/asm/unistd.h
@@ -27,6 +27,7 @@
 # define __ARCH_WANT_COMPAT_SYS_PWRITEV64
 # define __ARCH_WANT_COMPAT_SYS_PREADV64V2
 # define __ARCH_WANT_COMPAT_SYS_PWRITEV64V2
+# define __ARCH_WANT_VGETRANDOM_ALLOC
 # define X32_NR_syscalls (__NR_x32_syscalls)
 # define IA32_NR_syscalls (__NR_ia32_syscalls)
 
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 65ee69896967..9b64db52849f 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -8,6 +8,7 @@
  * into roughly six sections, each with a section header:
  *
  *   - Initialization and readiness waiting.
+ *   - vDSO support helpers.
  *   - Fast key erasure RNG, the "crng".
  *   - Entropy accumulation and extraction routines.
  *   - Entropy collection routines.
@@ -39,6 +40,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -59,6 +61,7 @@
 #include
 #include
 #include
+#include "../../lib/vdso/getrandom.h"
 
 /*********************************************************************
  *
@@ -146,6 +149,62 @@ EXPORT_SYMBOL(wait_for_random_bytes);
 				__func__, (void *)_RET_IP_, crng_init)
 
 
+
+/********************************************************************
+ *
+ * vDSO support helpers.
+ *
+ * The actual vDSO function is defined over in lib/vdso/getrandom.c,
+ * but this section contains the kernel-mode helpers to support that.
+ *
+ ********************************************************************/
+
+#ifdef __ARCH_WANT_VGETRANDOM_ALLOC
+/*
+ * The vgetrandom() function in userspace requires an opaque state, which this
+ * function provides to userspace, by mapping a certain number of special pages
+ * into the calling process. It takes a hint as to the number of opaque states
+ * desired, and returns the number of opaque states actually allocated, the
+ * size of each one in bytes, and the address of the first state.
+ */
+SYSCALL_DEFINE3(vgetrandom_alloc, unsigned long __user *, num,
+		unsigned long __user *, size_per_each, unsigned int, flags)
+{
+	unsigned long alloc_size;
+	unsigned long num_states;
+	unsigned long pages_addr;
+	int ret;
+
+	if (flags)
+		return -EINVAL;
+
+	if (get_user(num_states, num))
+		return -EFAULT;
+
+	num_states = clamp(num_states, 1UL, (SIZE_MAX & PAGE_MASK) / sizeof(struct vgetrandom_state));
+	alloc_size = PAGE_ALIGN(num_states * sizeof(struct vgetrandom_state));
+
+	if (put_user(alloc_size / sizeof(struct vgetrandom_state), num) ||
+	    put_user(sizeof(struct vgetrandom_state), size_per_each))
+		return -EFAULT;
+
+	pages_addr = vm_mmap(NULL, 0, alloc_size, PROT_READ | PROT_WRITE,
+			     MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, 0);
+	if (IS_ERR_VALUE(pages_addr))
+		return pages_addr;
+
+	ret = do_madvise(current->mm, pages_addr, alloc_size, MADV_WIPEONFORK);
+	if (ret < 0)
+		goto err_unmap;
+
+	return pages_addr;
+
+err_unmap:
+	vm_munmap(pages_addr, alloc_size);
+	return ret;
+}
+#endif
+
 /*********************************************************************
  *
  * Fast key erasure RNG, the "crng".
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 45fa180cc56a..77b6debe7e18 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -886,8 +886,13 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
 #define __NR_set_mempolicy_home_node 450
 __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
 
+#ifdef __ARCH_WANT_VGETRANDOM_ALLOC
+#define __NR_vgetrandom_alloc 451
+__SYSCALL(__NR_vgetrandom_alloc, sys_vgetrandom_alloc)
+#endif
+
 #undef __NR_syscalls
-#define __NR_syscalls 451
+#define __NR_syscalls 452
 
 /*
  * 32 bit systems traditionally used different
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 860b2dcf3ac4..f28196cb919b 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -360,6 +360,9 @@ COND_SYSCALL(pkey_free);
 /* memfd_secret */
 COND_SYSCALL(memfd_secret);
 
+/* random */
+COND_SYSCALL(vgetrandom_alloc);
+
 /*
  * Architecture specific weak syscall entries.
  */
diff --git a/lib/vdso/getrandom.h b/lib/vdso/getrandom.h
new file mode 100644
index 000000000000..c7f727db2aaa
--- /dev/null
+++ b/lib/vdso/getrandom.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Jason A. Donenfeld . All Rights Reserved.
+ */
+
+#ifndef _VDSO_LIB_GETRANDOM_H
+#define _VDSO_LIB_GETRANDOM_H
+
+#include
+
+struct vgetrandom_state {
+	union {
+		struct {
+			u8 batch[CHACHA_BLOCK_SIZE * 3 / 2];
+			u32 key[CHACHA_KEY_SIZE / sizeof(u32)];
+		};
+		u8 batch_key[CHACHA_BLOCK_SIZE * 2];
+	};
+	unsigned long generation;
+	u8 pos;
+};
+
+#endif /* _VDSO_LIB_GETRANDOM_H */
diff --git a/scripts/checksyscalls.sh b/scripts/checksyscalls.sh
index f33e61aca93d..7f7928c6487f 100755
--- a/scripts/checksyscalls.sh
+++ b/scripts/checksyscalls.sh
@@ -44,6 +44,10 @@ cat << EOF
 #define __IGNORE_memfd_secret
 #endif
 
+#ifndef __ARCH_WANT_VGETRANDOM_ALLOC
+#define __IGNORE_vgetrandom_alloc
+#endif
+
 /* Missing flags argument */
 #define __IGNORE_renameat	/* renameat2 */
 
diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 45fa180cc56a..77b6debe7e18 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -886,8 +886,13 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
 #define __NR_set_mempolicy_home_node 450
 __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
 
+#ifdef __ARCH_WANT_VGETRANDOM_ALLOC
+#define __NR_vgetrandom_alloc 451
+__SYSCALL(__NR_vgetrandom_alloc, sys_vgetrandom_alloc)
+#endif
+
 #undef __NR_syscalls
-#define __NR_syscalls 451
+#define __NR_syscalls 452
 
 /*
  * 32 bit systems traditionally used different

From patchwork Mon Nov 21 15:29:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Jason A. Donenfeld"
X-Patchwork-Id: 23918
From: "Jason A. Donenfeld"
To: linux-kernel@vger.kernel.org, patches@lists.linux.dev, tglx@linutronix.de
Cc: "Jason A. Donenfeld", linux-crypto@vger.kernel.org, x86@kernel.org,
 Greg Kroah-Hartman, Adhemerval Zanella Netto, Carlos O'Donell
Subject: [PATCH v6 2/3] random: introduce generic vDSO getrandom() implementation
Date: Mon, 21 Nov 2022 16:29:08 +0100
Message-Id: <20221121152909.3414096-3-Jason@zx2c4.com>
In-Reply-To: <20221121152909.3414096-1-Jason@zx2c4.com>
References: <20221121152909.3414096-1-Jason@zx2c4.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Provide a generic C vDSO getrandom() implementation, which operates on
an opaque state returned by vgetrandom_alloc() and produces random bytes
the same way as getrandom(). This has the API signature:

  ssize_t vgetrandom(void *buffer, size_t len, unsigned int flags, void *opaque_state);

The return value and the first 3 arguments are the same as ordinary
getrandom(), while the last argument is a pointer to the opaque
allocated state.
Were all four arguments passed to the getrandom() syscall, nothing
different would happen, and the functions would have the exact same
behavior.

The actual vDSO RNG algorithm implemented is the same one implemented by
drivers/char/random.c, using the same fast-erasure techniques as that.
Should the in-kernel implementation change, so too will the vDSO one. It
requires an implementation of ChaCha20 that does not use any stack, in
order to maintain forward secrecy, so this is left as an
architecture-specific fill-in. Stack-less ChaCha20 is an easy algorithm
to implement on a variety of architectures, so this shouldn't be too
onerous.

Initially, the state is keyless, and so the first call makes a
getrandom() syscall to generate that key, and then uses it for
subsequent calls. By keeping track of a generation counter, it knows
when its key is invalidated and it should fetch a new one using the
syscall. Later, more than just a generation counter might be used.

Since MADV_WIPEONFORK is set on the opaque state, the key and related
state are wiped during a fork(), so secrets don't roll over into new
processes, and the same state doesn't accidentally generate the same
random stream. The generation counter, as well, is always >0, so that
the 0 counter is a useful indication of a fork() or otherwise
uninitialized state.

If the kernel RNG is not yet initialized, then the vDSO always calls the
syscall, because that behavior cannot be emulated in userspace, but
fortunately that state is short lived and only during early boot. If it
has been initialized, then there is no need to inspect the `flags`
argument, because the behavior does not change post-initialization
regardless of the `flags` value.

Since the opaque state passed to it is mutated, vDSO getrandom() is not
reentrant when used with the same opaque state, which libc should be
mindful of.
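
The decision flow described above (syscall fallback while the RNG is not
ready, rekeying when the generation differs, reuse of the cached key
otherwise) can be modelled in a few lines. This is only a toy sketch
under simplifying assumptions: the toy_* names are hypothetical, no
ChaCha20 or real syscall is involved, and only the generation and
readiness fields are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the data the kernel publishes to the vDSO. */
struct toy_rng_data {
	unsigned long generation;	/* never 0 once published */
	bool is_ready;
};

/* Hypothetical stand-in for the per-thread opaque state. */
struct toy_state {
	unsigned long generation;	/* 0: fresh, or wiped by fork() */
};

enum toy_action { TOY_SYSCALL_FALLBACK, TOY_REKEY, TOY_USE_CACHED_KEY };

/* Decide what a call would do, updating the state's generation if needed. */
static enum toy_action toy_next_action(const struct toy_rng_data *rng,
				       struct toy_state *st)
{
	if (!rng->is_ready)
		return TOY_SYSCALL_FALLBACK;
	if (st->generation != rng->generation) {
		st->generation = rng->generation;
		return TOY_REKEY;
	}
	return TOY_USE_CACHED_KEY;
}

/* Walks through boot, first use, fork and reseed; returns 0 when done. */
static int toy_walkthrough(void)
{
	struct toy_rng_data rng = { .generation = 0, .is_ready = false };
	struct toy_state st = { 0 };

	/* Early boot: the RNG is not ready, so always defer to the syscall. */
	assert(toy_next_action(&rng, &st) == TOY_SYSCALL_FALLBACK);

	/* RNG ready: the keyless state fetches a key, then reuses it. */
	rng.is_ready = true;
	rng.generation = 2;
	assert(toy_next_action(&rng, &st) == TOY_REKEY);
	assert(toy_next_action(&rng, &st) == TOY_USE_CACHED_KEY);

	/* A fork() child sees the state zeroed, as with MADV_WIPEONFORK. */
	memset(&st, 0, sizeof(st));
	assert(toy_next_action(&rng, &st) == TOY_REKEY);

	/* A kernel reseed bumps the generation and forces a new key. */
	rng.generation = 3;
	assert(toy_next_action(&rng, &st) == TOY_REKEY);
	assert(toy_next_action(&rng, &st) == TOY_USE_CACHED_KEY);
	return 0;
}
```

Because the published generation is never 0, a zeroed state can never be
mistaken for one that already holds a valid key.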

Together with the previous commit that introduces vgetrandom_alloc(),
this functionality is intended to be integrated into libc's thread
management. As an illustrative example, the following code might be used
to do the same outside of libc. All of the static functions are to be
considered implementation private, including the vgetrandom_alloc()
syscall wrapper, which generally shouldn't be exposed outside of libc,
with the non-static vgetrandom() function at the end being the exported
interface. The various pthread-isms are expected to be elided into libc
internals. This per-thread allocation scheme is very naive and does not
shrink; other implementations may choose to be more complex.

  static void *vgetrandom_alloc(size_t *num, size_t *size_per_each, unsigned int flags)
  {
  	unsigned long ret = syscall(__NR_vgetrandom_alloc, num, size_per_each, flags);
  	return ret > -4096UL ? NULL : (void *)ret;
  }

  static struct {
  	pthread_mutex_t lock;
  	void **states;
  	size_t len, cap;
  } grnd_allocator = {
  	.lock = PTHREAD_MUTEX_INITIALIZER
  };

  static void *vgetrandom_get_state(void)
  {
  	void *state = NULL;

  	pthread_mutex_lock(&grnd_allocator.lock);
  	if (!grnd_allocator.len) {
  		size_t new_cap, size_per_each, num = 16; /* Just a hint. */
  		void *new_block = vgetrandom_alloc(&num, &size_per_each, 0), *new_states;

  		if (!new_block)
  			goto out;
  		new_cap = grnd_allocator.cap + num;
  		new_states = reallocarray(grnd_allocator.states, new_cap, sizeof(*grnd_allocator.states));
  		if (!new_states) {
  			munmap(new_block, num * size_per_each);
  			goto out;
  		}
  		grnd_allocator.cap = new_cap;
  		grnd_allocator.states = new_states;

  		for (size_t i = 0; i < num; ++i) {
  			grnd_allocator.states[i] = new_block;
  			new_block += size_per_each;
  		}
  		grnd_allocator.len = num;
  	}
  	state = grnd_allocator.states[--grnd_allocator.len];

  out:
  	pthread_mutex_unlock(&grnd_allocator.lock);
  	return state;
  }

  static void vgetrandom_put_state(void *state)
  {
  	if (!state)
  		return;
  	pthread_mutex_lock(&grnd_allocator.lock);
  	grnd_allocator.states[grnd_allocator.len++] = state;
  	pthread_mutex_unlock(&grnd_allocator.lock);
  }

  static struct {
  	ssize_t(*fn)(void *buf, size_t len, unsigned long flags, void *state);
  	pthread_key_t key;
  	pthread_once_t initialized;
  } grnd_ctx = {
  	.initialized = PTHREAD_ONCE_INIT
  };

  static void vgetrandom_init(void)
  {
  	if (pthread_key_create(&grnd_ctx.key, vgetrandom_put_state) != 0)
  		return;
  	grnd_ctx.fn = __vdsosym("LINUX_2.6", "__vdso_getrandom");
  }

  ssize_t vgetrandom(void *buf, size_t len, unsigned long flags)
  {
  	void *state;

  	pthread_once(&grnd_ctx.initialized, vgetrandom_init);
  	if (!grnd_ctx.fn)
  		return getrandom(buf, len, flags);
  	state = pthread_getspecific(grnd_ctx.key);
  	if (!state) {
  		state = vgetrandom_get_state();
  		if (pthread_setspecific(grnd_ctx.key, state) != 0) {
  			vgetrandom_put_state(state);
  			state = NULL;
  		}
  		if (!state)
  			return getrandom(buf, len, flags);
  	}
  	return grnd_ctx.fn(buf, len, flags, state);
  }

Signed-off-by: Jason A. Donenfeld
---
 MAINTAINERS             |   1 +
 drivers/char/random.c   |   9 ++++
 include/vdso/datapage.h |   6 +++
 lib/vdso/Kconfig        |   5 ++
 lib/vdso/getrandom.c    | 113 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 134 insertions(+)
 create mode 100644 lib/vdso/getrandom.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 843dd6a49538..e0aa33f54c57 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17287,6 +17287,7 @@ T:	git https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git
 S:	Maintained
 F:	drivers/char/random.c
 F:	drivers/virt/vmgenid.c
+F:	lib/vdso/getrandom.c
 F:	lib/vdso/getrandom.h
 
 RAPIDIO SUBSYSTEM
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 9b64db52849f..5b51e1cb0fcf 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -61,6 +61,9 @@
 #include
 #include
 #include
+#ifdef CONFIG_HAVE_VDSO_GETRANDOM
+#include
+#endif
 #include "../../lib/vdso/getrandom.h"
 
 /*********************************************************************
@@ -307,6 +310,9 @@ static void crng_reseed(struct work_struct *work)
 	if (next_gen == ULONG_MAX)
 		++next_gen;
 	WRITE_ONCE(base_crng.generation, next_gen);
+#ifdef CONFIG_HAVE_VDSO_GETRANDOM
+	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
+#endif
 	if (!static_branch_likely(&crng_is_ready))
 		crng_init = CRNG_READY;
 	spin_unlock_irqrestore(&base_crng.lock, flags);
@@ -756,6 +762,9 @@ static void __cold _credit_init_bits(size_t bits)
 		crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
 		if (static_key_initialized)
 			execute_in_process_context(crng_set_ready, &set_ready);
+#ifdef CONFIG_HAVE_VDSO_GETRANDOM
+		smp_store_release(&_vdso_rng_data.is_ready, true);
+#endif
 		wake_up_interruptible(&crng_init_wait);
 		kill_fasync(&fasync, SIGIO, POLL_IN);
 		pr_notice("crng init done\n");
diff --git a/include/vdso/datapage.h b/include/vdso/datapage.h
index 73eb622e7663..cbacfd923a5c 100644
--- a/include/vdso/datapage.h
+++ b/include/vdso/datapage.h
@@ -109,6 +109,11 @@ struct vdso_data {
 	struct arch_vdso_data arch_data;
 };
 
+struct vdso_rng_data {
+	unsigned long generation;
+	bool is_ready;
+};
+
 /*
  * We use the hidden visibility to prevent the compiler from generating a GOT
  * relocation. Not only is going through a GOT useless (the entry couldn't and
@@ -120,6 +125,7 @@ struct vdso_data {
  */
 extern struct vdso_data _vdso_data[CS_BASES] __attribute__((visibility("hidden")));
 extern struct vdso_data _timens_data[CS_BASES] __attribute__((visibility("hidden")));
+extern struct vdso_rng_data _vdso_rng_data __attribute__((visibility("hidden")));
 
 /*
  * The generic vDSO implementation requires that gettimeofday.h
diff --git a/lib/vdso/Kconfig b/lib/vdso/Kconfig
index d883ac299508..c35fac664574 100644
--- a/lib/vdso/Kconfig
+++ b/lib/vdso/Kconfig
@@ -30,4 +30,9 @@ config GENERIC_VDSO_TIME_NS
 	  Selected by architectures which support time namespaces in the
 	  VDSO
 
+config HAVE_VDSO_GETRANDOM
+	bool
+	help
+	  Selected by architectures that support vDSO getrandom().
+
 endif
diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
new file mode 100644
index 000000000000..da5ad9b193b2
--- /dev/null
+++ b/lib/vdso/getrandom.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022 Jason A. Donenfeld . All Rights Reserved.
+ */ + +#include +#include +#include +#include +#include +#include +#include "getrandom.h" + +static void memcpy_and_zero(void *dst, void *src, size_t len) +{ +#define CASCADE(type) \ + while (len >= sizeof(type)) { \ + __put_unaligned_t(type, __get_unaligned_t(type, src), dst); \ + __put_unaligned_t(type, 0, src); \ + dst += sizeof(type); \ + src += sizeof(type); \ + len -= sizeof(type); \ + } +#if IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) +#if BITS_PER_LONG == 64 + CASCADE(u64); +#endif + CASCADE(u32); + CASCADE(u16); +#endif + CASCADE(u8); +#undef CASCADE +} + +static __always_inline ssize_t +__cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_t len, + unsigned int flags, void *opaque_state) +{ + ssize_t ret = min_t(size_t, MAX_RW_COUNT, len); + struct vgetrandom_state *state = opaque_state; + unsigned long current_generation; + void *orig_buffer = buffer; + size_t orig_len = len; + u32 counter[2] = { 0 }; + size_t batch_len, nblocks; + + /* + * If the kernel isn't yet initialized, then the various flags might have some effect + * that we can't emulate in userspace, so use the syscall. Otherwise, the flags have + * no effect, and can continue. + */ + if (unlikely(!rng_info->is_ready)) + return getrandom_syscall(orig_buffer, orig_len, flags); + + if (unlikely(!len)) + return 0; + +retry_generation: + current_generation = READ_ONCE(rng_info->generation); + if (unlikely(state->generation != current_generation)) { + /* Write the generation before filling the key, in case there's a fork before. */ + WRITE_ONCE(state->generation, current_generation); + /* If the generation is wrong, the kernel has reseeded, so we should too. */ + if (getrandom_syscall(state->key, sizeof(state->key), 0) != sizeof(state->key)) + return getrandom_syscall(orig_buffer, orig_len, flags); + /* Set state->pos so that the batch is considered emptied. 
*/ + state->pos = sizeof(state->batch); + } + + len = ret; +more_batch: + /* First use whatever is left from the last call. */ + batch_len = min_t(size_t, sizeof(state->batch) - state->pos, len); + if (batch_len) { + /* Zero out bytes as they're copied out, to preserve forward secrecy. */ + memcpy_and_zero(buffer, state->batch + state->pos, batch_len); + state->pos += batch_len; + buffer += batch_len; + len -= batch_len; + } + if (!len) { + /* + * Since rng_info->generation will never be 0, we re-read state->generation, + * rather than using the local current_generation variable, to learn whether + * we forked. Primarily, though, this indicates whether the rng itself has + * reseeded, in which case we should generate a new key and start over. + */ + if (unlikely(READ_ONCE(state->generation) != READ_ONCE(rng_info->generation))) { + buffer = orig_buffer; + goto retry_generation; + } + return ret; + } + + /* Generate blocks of rng output directly into the buffer while there's enough left. */ + nblocks = len / CHACHA_BLOCK_SIZE; + if (nblocks) { + __arch_chacha20_blocks_nostack(buffer, state->key, counter, nblocks); + buffer += nblocks * CHACHA_BLOCK_SIZE; + len -= nblocks * CHACHA_BLOCK_SIZE; + } + + /* Refill the batch and then overwrite the key, in order to preserve forward secrecy. */ + __arch_chacha20_blocks_nostack(state->batch_key, state->key, counter, 2); + state->pos = 0; + goto more_batch; +} + +static __always_inline ssize_t +__cvdso_getrandom(void *buffer, size_t len, unsigned int flags, void *opaque_state) +{ + return __cvdso_getrandom_data(__arch_get_vdso_rng_data(), buffer, len, flags, opaque_state); +} From patchwork Mon Nov 21 15:29:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. 
Donenfeld" X-Patchwork-Id: 23920 From: "Jason A. Donenfeld" To: linux-kernel@vger.kernel.org, patches@lists.linux.dev, tglx@linutronix.de Cc: "Jason A. Donenfeld" , linux-crypto@vger.kernel.org, x86@kernel.org, Greg Kroah-Hartman , Adhemerval Zanella Netto , Carlos O'Donell Subject: [PATCH v6 3/3] x86: vdso: Wire up getrandom() vDSO implementation Date: Mon, 21 Nov 2022 16:29:09 +0100 Message-Id: <20221121152909.3414096-4-Jason@zx2c4.com> In-Reply-To: <20221121152909.3414096-1-Jason@zx2c4.com> References: <20221121152909.3414096-1-Jason@zx2c4.com> X-Mailing-List: linux-kernel@vger.kernel.org Hook up the generic vDSO implementation to the x86 vDSO data page. Since the existing vDSO infrastructure is heavily based on the timekeeping functionality, which works over arrays of bases, a new macro is introduced for vvars that are not arrays. The vDSO function requires a ChaCha20 implementation that does not write to the stack, yet can still do an entire ChaCha20 permutation, so provide this using SSE2, since this is userland code that must work on all x86-64 processors. Signed-off-by: Jason A.
Donenfeld --- arch/x86/Kconfig | 1 + arch/x86/entry/vdso/Makefile | 3 +- arch/x86/entry/vdso/vdso.lds.S | 2 + arch/x86/entry/vdso/vgetrandom-chacha.S | 181 ++++++++++++++++++++++++ arch/x86/entry/vdso/vgetrandom.c | 18 +++ arch/x86/include/asm/vdso/getrandom.h | 49 +++++++ arch/x86/include/asm/vdso/vsyscall.h | 2 + arch/x86/include/asm/vvar.h | 16 +++ 8 files changed, 271 insertions(+), 1 deletion(-) create mode 100644 arch/x86/entry/vdso/vgetrandom-chacha.S create mode 100644 arch/x86/entry/vdso/vgetrandom.c create mode 100644 arch/x86/include/asm/vdso/getrandom.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 331e21ba961a..b64b1b1274ae 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -270,6 +270,7 @@ config X86 select HAVE_UNSTABLE_SCHED_CLOCK select HAVE_USER_RETURN_NOTIFIER select HAVE_GENERIC_VDSO + select HAVE_VDSO_GETRANDOM if X86_64 select HOTPLUG_SMT if SMP select IRQ_FORCED_THREADING select NEED_PER_CPU_EMBED_FIRST_CHUNK diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 3e88b9df8c8f..2de64e52236a 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -27,7 +27,7 @@ VDSO32-$(CONFIG_X86_32) := y VDSO32-$(CONFIG_IA32_EMULATION) := y # files to link into the vdso -vobjs-y := vdso-note.o vclock_gettime.o vgetcpu.o +vobjs-y := vdso-note.o vclock_gettime.o vgetcpu.o vgetrandom.o vgetrandom-chacha.o vobjs32-y := vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o vobjs32-y += vdso32/vclock_gettime.o vobjs-$(CONFIG_X86_SGX) += vsgx.o @@ -104,6 +104,7 @@ CFLAGS_REMOVE_vclock_gettime.o = -pg CFLAGS_REMOVE_vdso32/vclock_gettime.o = -pg CFLAGS_REMOVE_vgetcpu.o = -pg CFLAGS_REMOVE_vsgx.o = -pg +CFLAGS_REMOVE_vgetrandom.o = -pg # # X32 processes use x32 vDSO to access 64bit kernel data. 
diff --git a/arch/x86/entry/vdso/vdso.lds.S b/arch/x86/entry/vdso/vdso.lds.S index 4bf48462fca7..1919cc39277e 100644 --- a/arch/x86/entry/vdso/vdso.lds.S +++ b/arch/x86/entry/vdso/vdso.lds.S @@ -28,6 +28,8 @@ VERSION { clock_getres; __vdso_clock_getres; __vdso_sgx_enter_enclave; + getrandom; + __vdso_getrandom; local: *; }; } diff --git a/arch/x86/entry/vdso/vgetrandom-chacha.S b/arch/x86/entry/vdso/vgetrandom-chacha.S new file mode 100644 index 000000000000..bc563d95b976 --- /dev/null +++ b/arch/x86/entry/vdso/vgetrandom-chacha.S @@ -0,0 +1,181 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022 Jason A. Donenfeld . All Rights Reserved. + */ + +#include +#include + +.section .rodata.cst16.CONSTANTS, "aM", @progbits, 16 +.align 16 +CONSTANTS: .octa 0x6b20657479622d323320646e61707865 +.text + +/* + * Very basic SSE2 implementation of ChaCha20. Produces a given positive number + * of blocks of output with a nonce of 0, taking an input key and 8-byte + * counter. Importantly does not spill to the stack. 
Its arguments are: + * + * rdi: output bytes + * rsi: 32-byte key input + * rdx: 8-byte counter input/output + * rcx: number of 64-byte blocks to write to output + */ +SYM_FUNC_START(chacha20_blocks_nostack) + FRAME_BEGIN + +#define output %rdi +#define key %rsi +#define counter %rdx +#define nblocks %rcx +#define i %al +#define state0 %xmm0 +#define state1 %xmm1 +#define state2 %xmm2 +#define state3 %xmm3 +#define copy0 %xmm4 +#define copy1 %xmm5 +#define copy2 %xmm6 +#define copy3 %xmm7 +#define temp %xmm8 +#define one %xmm9 + + /* copy0 = "expand 32-byte k" */ + movaps CONSTANTS(%rip),copy0 + /* copy1,copy2 = key */ + movdqu 0x00(key),copy1 + movdqu 0x10(key),copy2 + /* copy3 = counter || zero nonce */ + movq 0x00(counter),copy3 + /* one = 1 || 0 */ + movq $1,%rax + movq %rax,one + +.Lblock: + /* state0,state1,state2,state3 = copy0,copy1,copy2,copy3 */ + movdqa copy0,state0 + movdqa copy1,state1 + movdqa copy2,state2 + movdqa copy3,state3 + + movb $10,i +.Lpermute: + /* state0 += state1, state3 = rotl32(state3 ^ state0, 16) */ + paddd state1,state0 + pxor state0,state3 + movdqa state3,temp + pslld $16,temp + psrld $16,state3 + por temp,state3 + + /* state2 += state3, state1 = rotl32(state1 ^ state2, 12) */ + paddd state3,state2 + pxor state2,state1 + movdqa state1,temp + pslld $12,temp + psrld $20,state1 + por temp,state1 + + /* state0 += state1, state3 = rotl32(state3 ^ state0, 8) */ + paddd state1,state0 + pxor state0,state3 + movdqa state3,temp + pslld $8,temp + psrld $24,state3 + por temp,state3 + + /* state2 += state3, state1 = rotl32(state1 ^ state2, 7) */ + paddd state3,state2 + pxor state2,state1 + movdqa state1,temp + pslld $7,temp + psrld $25,state1 + por temp,state1 + + /* state1 = shuffle32(state1, MASK(0, 3, 2, 1)) */ + pshufd $0x39,state1,state1 + /* state2 = shuffle32(state2, MASK(1, 0, 3, 2)) */ + pshufd $0x4e,state2,state2 + /* state3 = shuffle32(state3, MASK(2, 1, 0, 3)) */ + pshufd $0x93,state3,state3 + + /* state0 += state1, state3 = 
rotl32(state3 ^ state0, 16) */ + paddd state1,state0 + pxor state0,state3 + movdqa state3,temp + pslld $16,temp + psrld $16,state3 + por temp,state3 + + /* state2 += state3, state1 = rotl32(state1 ^ state2, 12) */ + paddd state3,state2 + pxor state2,state1 + movdqa state1,temp + pslld $12,temp + psrld $20,state1 + por temp,state1 + + /* state0 += state1, state3 = rotl32(state3 ^ state0, 8) */ + paddd state1,state0 + pxor state0,state3 + movdqa state3,temp + pslld $8,temp + psrld $24,state3 + por temp,state3 + + /* state2 += state3, state1 = rotl32(state1 ^ state2, 7) */ + paddd state3,state2 + pxor state2,state1 + movdqa state1,temp + pslld $7,temp + psrld $25,state1 + por temp,state1 + + /* state1 = shuffle32(state1, MASK(2, 1, 0, 3)) */ + pshufd $0x93,state1,state1 + /* state2 = shuffle32(state2, MASK(1, 0, 3, 2)) */ + pshufd $0x4e,state2,state2 + /* state3 = shuffle32(state3, MASK(0, 3, 2, 1)) */ + pshufd $0x39,state3,state3 + + decb i + jnz .Lpermute + + /* output0 = state0 + copy0 */ + paddd copy0,state0 + movdqu state0,0x00(output) + /* output1 = state1 + copy1 */ + paddd copy1,state1 + movdqu state1,0x10(output) + /* output2 = state2 + copy2 */ + paddd copy2,state2 + movdqu state2,0x20(output) + /* output3 = state3 + copy3 */ + paddd copy3,state3 + movdqu state3,0x30(output) + + /* ++copy3.counter */ + paddq one,copy3 + + /* output += 64, --nblocks */ + addq $64,output + decq nblocks + jnz .Lblock + + /* counter = copy3.counter */ + movq copy3,0x00(counter) + + /* Zero out all the regs, in case nothing uses these again. 
*/ + pxor state0,state0 + pxor state1,state1 + pxor state2,state2 + pxor state3,state3 + pxor copy0,copy0 + pxor copy1,copy1 + pxor copy2,copy2 + pxor copy3,copy3 + pxor temp,temp + + FRAME_END + RET +SYM_FUNC_END(chacha20_blocks_nostack) diff --git a/arch/x86/entry/vdso/vgetrandom.c b/arch/x86/entry/vdso/vgetrandom.c new file mode 100644 index 000000000000..c7a2476d5d8a --- /dev/null +++ b/arch/x86/entry/vdso/vgetrandom.c @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2022 Jason A. Donenfeld . All Rights Reserved. + */ +#include +#include + +#include "../../../../lib/vdso/getrandom.c" + +ssize_t __vdso_getrandom(void *buffer, size_t len, unsigned int flags, void *state); + +ssize_t __vdso_getrandom(void *buffer, size_t len, unsigned int flags, void *state) +{ + return __cvdso_getrandom(buffer, len, flags, state); +} + +ssize_t getrandom(void *, size_t, unsigned int, void *) + __attribute__((weak, alias("__vdso_getrandom"))); diff --git a/arch/x86/include/asm/vdso/getrandom.h b/arch/x86/include/asm/vdso/getrandom.h new file mode 100644 index 000000000000..099aca58ef20 --- /dev/null +++ b/arch/x86/include/asm/vdso/getrandom.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2022 Jason A. Donenfeld . All Rights Reserved. 
+ */ +#ifndef __ASM_VDSO_GETRANDOM_H +#define __ASM_VDSO_GETRANDOM_H + +#ifndef __ASSEMBLY__ + +#include +#include + +static __always_inline ssize_t +getrandom_syscall(void *buffer, size_t len, unsigned int flags) +{ + long ret; + + asm ("syscall" : "=a" (ret) : + "0" (__NR_getrandom), "D" (buffer), "S" (len), "d" (flags) : + "rcx", "r11", "memory"); + + return ret; +} + +#define __vdso_rng_data (VVAR(_vdso_rng_data)) + +static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void) +{ + if (__vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS) + return (void *)&__vdso_rng_data + + ((void *)&__timens_vdso_data - (void *)&__vdso_data); + return &__vdso_rng_data; +} + +/* + * Generates a given positive number of blocks of ChaCha20 output with nonce=0, + * and does not write to any stack or memory outside of the parameters passed + * to it. This way, we don't need to worry about stack data leaking into forked + * child processes. + */ +static __always_inline void __arch_chacha20_blocks_nostack(u8 *dst_bytes, const u32 *key, u32 *counter, size_t nblocks) +{ + extern void chacha20_blocks_nostack(u8 *dst_bytes, const u32 *key, u32 *counter, size_t nblocks); + return chacha20_blocks_nostack(dst_bytes, key, counter, nblocks); +} + +#endif /* !__ASSEMBLY__ */ + +#endif /* __ASM_VDSO_GETRANDOM_H */ diff --git a/arch/x86/include/asm/vdso/vsyscall.h b/arch/x86/include/asm/vdso/vsyscall.h index be199a9b2676..71c56586a22f 100644 --- a/arch/x86/include/asm/vdso/vsyscall.h +++ b/arch/x86/include/asm/vdso/vsyscall.h @@ -11,6 +11,8 @@ #include DEFINE_VVAR(struct vdso_data, _vdso_data); +DEFINE_VVAR_SINGLE(struct vdso_rng_data, _vdso_rng_data); + /* * Update the vDSO data page to keep in sync with kernel timekeeping.
*/ diff --git a/arch/x86/include/asm/vvar.h b/arch/x86/include/asm/vvar.h index 183e98e49ab9..9d9af37f7cab 100644 --- a/arch/x86/include/asm/vvar.h +++ b/arch/x86/include/asm/vvar.h @@ -26,6 +26,8 @@ */ #define DECLARE_VVAR(offset, type, name) \ EMIT_VVAR(name, offset) +#define DECLARE_VVAR_SINGLE(offset, type, name) \ + EMIT_VVAR(name, offset) #else @@ -37,6 +39,10 @@ extern char __vvar_page; extern type timens_ ## name[CS_BASES] \ __attribute__((visibility("hidden"))); \ +#define DECLARE_VVAR_SINGLE(offset, type, name) \ + extern type vvar_ ## name \ + __attribute__((visibility("hidden"))); \ + #define VVAR(name) (vvar_ ## name) #define TIMENS(name) (timens_ ## name) @@ -44,12 +50,22 @@ extern char __vvar_page; type name[CS_BASES] \ __attribute__((section(".vvar_" #name), aligned(16))) __visible +#define DEFINE_VVAR_SINGLE(type, name) \ + type name \ + __attribute__((section(".vvar_" #name), aligned(16))) __visible + #endif /* DECLARE_VVAR(offset, type, name) */ DECLARE_VVAR(128, struct vdso_data, _vdso_data) +#if !defined(_SINGLE_DATA) +#define _SINGLE_DATA +DECLARE_VVAR_SINGLE(640, struct vdso_rng_data, _vdso_rng_data) +#endif + #undef DECLARE_VVAR +#undef DECLARE_VVAR_SINGLE #endif
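Note for readers of vgetrandom-chacha.S: each commented step in the .Lpermute loop ("state0 += state1, state3 = rotl32(state3 ^ state0, 16)" and so on) is one quarter of a ChaCha quarter-round, applied to four 32-bit lanes at once. The scalar sketch below shows the same four steps on a single column. It is only an illustration and not part of the patch; the names rotl32() and chacha_quarter_round() are made up for this note, and a plain C version like this gives none of the no-stack guarantees the assembly is written to provide.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/*
 * One ChaCha quarter-round on four 32-bit words, updated in place.
 * Hypothetical helper for illustration: the SSE2 code in the patch does
 * these same add/xor/rotate steps with a, b, c, d being whole registers
 * (state0..state3), i.e. four columns per instruction.
 */
static void chacha_quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
	*a += *b; *d = rotl32(*d ^ *a, 16);
	*c += *d; *b = rotl32(*b ^ *c, 12);
	*a += *b; *d = rotl32(*d ^ *a, 8);
	*c += *d; *b = rotl32(*b ^ *c, 7);
}
```

Since SSE2 has no packed rotate instruction, the assembly builds each rotation from pslld, psrld and por through the temp register, which is the same shift-and-or that rotl32() spells out here. The pshufd shuffles between the two halves of the loop body realign the lanes so that the second group of four steps operates on diagonals instead of columns.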