Message ID | 20240125062739.1339782-16-debug@rivosinc.com |
---|---|
State | New |
Headers |
From: debug@rivosinc.com
To: rick.p.edgecombe@intel.com, broonie@kernel.org, Szabolcs.Nagy@arm.com, kito.cheng@sifive.com, keescook@chromium.org, ajones@ventanamicro.com, paul.walmsley@sifive.com, palmer@dabbelt.com, conor.dooley@microchip.com, cleger@rivosinc.com, atishp@atishpatra.org, alex@ghiti.fr, bjorn@rivosinc.com, alexghiti@rivosinc.com
Cc: corbet@lwn.net, aou@eecs.berkeley.edu, oleg@redhat.com, akpm@linux-foundation.org, arnd@arndb.de, ebiederm@xmission.com, shuah@kernel.org, brauner@kernel.org, debug@rivosinc.com, guoren@kernel.org, samitolvanen@google.com, evan@rivosinc.com, xiao.w.wang@intel.com, apatel@ventanamicro.com, mchitale@ventanamicro.com, waylingii@gmail.com, greentime.hu@sifive.com, heiko@sntech.de, jszhang@kernel.org, shikemeng@huaweicloud.com, david@redhat.com, charlie@rivosinc.com, panqinglin2020@iscas.ac.cn, willy@infradead.org, vincent.chen@sifive.com, andy.chiu@sifive.com, gerg@kernel.org, jeeheng.sia@starfivetech.com, mason.huo@starfivetech.com, ancientmodern4@gmail.com, mathis.salmen@matsal.de, cuiyunhui@bytedance.com, bhe@redhat.com, chenjiahao16@huawei.com, ruscur@russell.cc, bgray@linux.ibm.com, alx@kernel.org, baruch@tkos.co.il, zhangqing@loongson.cn, catalin.marinas@arm.com, revest@chromium.org, josh@joshtriplett.org, joey.gouly@arm.com, shr@devkernel.io, omosnace@redhat.com, ojeda@kernel.org, jhubbard@nvidia.com, linux-doc@vger.kernel.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: [RFC PATCH v1 15/28] riscv/mm: Implement map_shadow_stack() syscall
Date: Wed, 24 Jan 2024 22:21:40 -0800
Message-ID: <20240125062739.1339782-16-debug@rivosinc.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240125062739.1339782-1-debug@rivosinc.com>
References: <20240125062739.1339782-1-debug@rivosinc.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit |
Series |
riscv control-flow integrity for usermode
|
|
Commit Message
Deepak Gupta
Jan. 25, 2024, 6:21 a.m. UTC
From: Deepak Gupta <debug@rivosinc.com>

As discussed extensively in the changelog for the addition of this
syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the
existing mmap() and madvise() syscalls do not map entirely well onto the
security requirements for guarded control stacks since they lead to
windows where memory is allocated but not yet protected or stacks which
are not properly and safely initialised. Instead a new syscall
map_shadow_stack() has been defined which allocates and initialises a
shadow stack page.

This patch implements this syscall for riscv. riscv doesn't require token
to be setup by kernel because user mode can do that by itself. However to
provide compatibility and portability with other architectures, user mode can
specify token set flag.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/kernel/Makefile      |   2 +
 arch/riscv/kernel/usercfi.c     | 150 ++++++++++++++++++++++++++++++++
 include/uapi/asm-generic/mman.h |   1 +
 3 files changed, 153 insertions(+)
 create mode 100644 arch/riscv/kernel/usercfi.c
Comments
On Wed, Jan 24, 2024 at 10:21:40PM -0800, debug@rivosinc.com wrote:
> From: Deepak Gupta <debug@rivosinc.com>
>
> [...]
>
> diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
> new file mode 100644
> index 000000000000..35ede2cbc05b
> --- /dev/null
> +++ b/arch/riscv/kernel/usercfi.c
> @@ -0,0 +1,150 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2023 Rivos, Inc.

Nit: Should be updated to 2024

> + * Deepak Gupta <debug@rivosinc.com>
> + */

[...]

> +static noinline unsigned long amo_user_shstk(unsigned long *addr, unsigned long val)
> +{
> +	/*
> +	 * In case ssamoswap faults, return -1.
> +	 * Never expect -1 on shadow stack. Expect return addresses and zero
> +	 */
> +	unsigned long swap = -1;
> +
> +	__enable_user_access();
> +	asm_volatile_goto(
> +		".option push\n"
> +		".option arch, +zicfiss\n"
> +#ifdef CONFIG_64BIT
> +		"1: ssamoswap.d %0, %2, %1\n"
> +#else
> +		"1: ssamoswap.w %0, %2, %1\n"

A SSAMOSWAP macro that conditionally defines this would be cleaner

> +#endif
> +		_ASM_EXTABLE(1b, %l[fault])
> +		RISCV_ACQUIRE_BARRIER
> +		".option pop\n"
> +		: "=r" (swap), "+A" (*addr)

I just ran into this on one of my patches that not every compiler
supports output args in asm goto blocks. You need to guard this with the
kconfig option CC_HAS_ASM_GOTO_TIED_OUTPUT. Unfortunately, that means
that this code needs two versions, or you can choose to gate CFI behind
this option, it's supported by recent versions of GCC/CLANG.

For readability it is also nice to use labels for the asm variables such
as `"=r" (swap)` can be `[swap] "=r" (swap)` and then replace %0 with
%[swap].

- Charlie

> +		: "r" (val)
> +		: "memory"
> +		: fault
> +		);
> +	__disable_user_access();
> +	return swap;
> +fault:
> +	__disable_user_access();
> +	return -1;
> +}

[...]
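The two readability suggestions in this review (a width-selecting SSAMOSWAP macro and named asm operands) could look roughly like the sketch below. It is only an illustration, not code from the series: `SSAMOSWAP`, `SSAMOSWAP_TEMPLATE` and `ssamoswap_template()` are hypothetical names, and `__SIZEOF_POINTER__` (a GCC/Clang predefined macro) stands in for the kernel's `CONFIG_64BIT` so the snippet builds without a RISC-V toolchain.

```c
/*
 * Hypothetical helper, not part of the posted patch: pick the ssamoswap
 * width once instead of using #ifdef inside the asm block. In the kernel
 * the selector would more likely be CONFIG_64BIT.
 */
#if __SIZEOF_POINTER__ == 8
#define SSAMOSWAP "ssamoswap.d"
#else
#define SSAMOSWAP "ssamoswap.w"
#endif

/* Asm template with named operands replacing %0, %1 and %2. */
#define SSAMOSWAP_TEMPLATE \
	"1: " SSAMOSWAP " %[swap], %[val], %[addr]\n"

/* Expose the template so it can be inspected without assembling it. */
static const char *ssamoswap_template(void)
{
	return SSAMOSWAP_TEMPLATE;
}
```

With a template like this, the operand lists would presumably become `[swap] "=r" (swap), [addr] "+A" (*addr)` for the outputs and `[val] "r" (val)` for the input, following the naming pattern suggested above.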
On Thu, Jan 25, 2024 at 01:24:16PM -0800, Charlie Jenkins wrote:
>On Wed, Jan 24, 2024 at 10:21:40PM -0800, debug@rivosinc.com wrote:
>> From: Deepak Gupta <debug@rivosinc.com>
>>
>> [...]
>>
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (C) 2023 Rivos, Inc.
>
>Nit: Should be updated to 2024

Noted.

>> [...]
>> +	__enable_user_access();
>> +	asm_volatile_goto(
>> +		".option push\n"
>> +		".option arch, +zicfiss\n"
>> +#ifdef CONFIG_64BIT
>> +		"1: ssamoswap.d %0, %2, %1\n"
>> +#else
>> +		"1: ssamoswap.w %0, %2, %1\n"
>
>A SSAMOSWAP macro that conditionally defines this would be cleaner

Yes, I need to do that. In fact, I need to gate CONFIG_RISCV_USER_CFI behind
some riscv-gnu toolchain version as well, because not all toolchain versions
will recognize this.

>
>> +#endif
>> +		_ASM_EXTABLE(1b, %l[fault])
>> +		RISCV_ACQUIRE_BARRIER
>> +		".option pop\n"
>> +		: "=r" (swap), "+A" (*addr)
>
>I just ran into this on one of my patches that not every compiler
>supports output args in asm goto blocks. You need to guard this with the
>kconfig option CC_HAS_ASM_GOTO_TIED_OUTPUT. Unfortunately, that means
>that this code needs two versions, or you can choose to gate CFI behind
>this option, it's supported by recent versions of GCC/CLANG.

Thanks. I'll gate behind CC_HAS_ASM_GOTO_TIED_OUTPUT. Earlier versions of
GCC/CLANG won't have CFI support in them anyway.

>
>For readability it is also nice to use labels for the asm variables such
>as `"=r" (swap)` can be `[swap] "=r" (swap)` and then replace %0 with
>%[swap].

Noted, will do that. I copied it from a gcc asm snippet for `amoswap` somewhere
in the kernel. Goes without saying, I am terrible with gcc asm syntax.

>
>- Charlie
>
>> +		: "r" (val)
>> +		: "memory"
>> +		: fault
On Wed, Jan 24, 2024 at 10:21:40PM -0800, debug@rivosinc.com wrote:

> As discussed extensively in the changelog for the addition of this
> syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the
> existing mmap() and madvise() syscalls do not map entirely well onto the
> security requirements for guarded control stacks since they lead to
> windows where memory is allocated but not yet protected or stacks which
> are not properly and safely initialised. Instead a new syscall
> map_shadow_stack() has been defined which allocates and initialises a
> shadow stack page.

While I agree that this is very well written you probably want to update
the references to guarded control stacks to whatever the RISC-V term is :P

> --- a/include/uapi/asm-generic/mman.h
> +++ b/include/uapi/asm-generic/mman.h
> @@ -19,4 +19,5 @@
>  #define MCL_FUTURE 2 /* lock all future mappings */
>  #define MCL_ONFAULT 4 /* lock all pages that are faulted in */
>
> +#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */
>  #endif /* __ASM_GENERIC_MMAN_H */

For arm64 I also added a SHADOW_STACK_SET_MARKER for adding a top of
stack marker, did you have any thoughts on that for RISC-V? I think x86
were considering adding it too, it'd be good if we could get things
consistent.
On Wed, 2024-01-24 at 22:21 -0800, debug@rivosinc.com wrote:
> From: Deepak Gupta <debug@rivosinc.com>
>
> As discussed extensively in the changelog for the addition of this
> syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the
> existing mmap() and madvise() syscalls do not map entirely well onto the
> security requirements for guarded control stacks since they lead to
> windows where memory is allocated but not yet protected or stacks which
> are not properly and safely initialised. Instead a new syscall
> map_shadow_stack() has been defined which allocates and initialises a
> shadow stack page.
>
> This patch implements this syscall for riscv. riscv doesn't require token
> to be setup by kernel because user mode can do that by itself. However to
> provide compatibility and portability with other architectures, user mode can
> specify token set flag.

A lot of this code looks very familiar. We'll have to think about at
what point we could pull some of it into the core kernel.

I think if we had an arch write_user_shstk(), most of the code could be
shared here.
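One way to picture the sharing idea above is a generic token-creation routine that only depends on a per-architecture "write one shadow stack word" hook. The sketch below is a user-space model under stated assumptions, not kernel code: `write_user_shstk_fn`, `demo_write_user_shstk()` and `generic_create_rstor_token()` are hypothetical names, a plain store stands in for `ssamoswap`, and the token value follows the RISC-V convention from this patch (the token holds the address just above its own slot).

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical arch hook: store one word on a shadow stack, 0 on success. */
typedef int (*write_user_shstk_fn)(uintptr_t *addr, uintptr_t val);

/* Stand-in for an arch implementation; a plain store replaces ssamoswap. */
static int demo_write_user_shstk(uintptr_t *addr, uintptr_t val)
{
	if (!addr)
		return -EFAULT;
	*addr = val;
	return 0;
}

/*
 * Generic token creation modelled on create_rstor_token() from the patch:
 * the token is written one entry below ssp, and its value is ssp itself.
 */
static int generic_create_rstor_token(write_user_shstk_fn write_shstk,
				      uintptr_t ssp, uintptr_t *token_addr)
{
	uintptr_t addr;

	/* Token must be aligned to the entry size. */
	if (ssp % sizeof(void *))
		return -EINVAL;

	addr = ssp - sizeof(void *);

	if (write_shstk((uintptr_t *)addr, ssp))
		return -EFAULT;

	if (token_addr)
		*token_addr = addr;

	return 0;
}
```

In this model only the hook would differ between architectures. Real token formats differ per architecture, so a shared version would likely also need the token value to be supplied by the arch code.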
On Tue, Feb 06, 2024 at 04:01:28PM +0000, Mark Brown wrote:
>On Wed, Jan 24, 2024 at 10:21:40PM -0800, debug@rivosinc.com wrote:
>
>> As discussed extensively in the changelog for the addition of this
>> syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the
>> existing mmap() and madvise() syscalls do not map entirely well onto the
>> security requirements for guarded control stacks since they lead to
>> windows where memory is allocated but not yet protected or stacks which
>> are not properly and safely initialised. Instead a new syscall
>> map_shadow_stack() has been defined which allocates and initialises a
>> shadow stack page.
>
>While I agree that this is very well written you probably want to update
>the references to guarded control stacks to whatever the RISC-V term is :P

Noted. I'll do that in the next patchset.

>
>> --- a/include/uapi/asm-generic/mman.h
>> +++ b/include/uapi/asm-generic/mman.h
>> @@ -19,4 +19,5 @@
>>  #define MCL_FUTURE 2 /* lock all future mappings */
>>  #define MCL_ONFAULT 4 /* lock all pages that are faulted in */
>>
>> +#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */
>>  #endif /* __ASM_GENERIC_MMAN_H */
>
>For arm64 I also added a SHADOW_STACK_SET_MARKER for adding a top of
>stack marker, did you have any thoughts on that for RISC-V? I think x86
>were considering adding it too, it'd be good if we could get things
>consistent.

Please correct me on this. A token at the top which can't be consumed to restore,
but serves *just* purely as a marker, right?

It's a good basic design without a lot of cost.
I think risc-v should be able to converge on that.
On Fri, Feb 09, 2024 at 08:44:53PM +0000, Edgecombe, Rick P wrote:
>On Wed, 2024-01-24 at 22:21 -0800, debug@rivosinc.com wrote:
>> From: Deepak Gupta <debug@rivosinc.com>
>>
>> As discussed extensively in the changelog for the addition of this
>> syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the
>> existing mmap() and madvise() syscalls do not map entirely well onto the
>> security requirements for guarded control stacks since they lead to
>> windows where memory is allocated but not yet protected or stacks which
>> are not properly and safely initialised. Instead a new syscall
>> map_shadow_stack() has been defined which allocates and initialises a
>> shadow stack page.
>>
>> This patch implements this syscall for riscv. riscv doesn't require token
>> to be setup by kernel because user mode can do that by itself. However to
>> provide compatibility and portability with other architectures, user mode can
>> specify token set flag.
>
>A lot of this code looks very familiar. We'll have to think about at
>what point we could pull some of it into the core kernel.
>
>I think if we had an arch write_user_shstk(), most of the code could be
>shared here.

Yes, it is. I'll think a little more about this when I send the next version
of the patchset.
On Wed, Feb 21, 2024 at 04:47:11PM -0800, Deepak Gupta wrote:
> On Tue, Feb 06, 2024 at 04:01:28PM +0000, Mark Brown wrote:

> > > +#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */

> > For arm64 I also added a SHADOW_STACK_SET_MARKER for adding a top of
> > stack marker, did you have any thoughts on that for RISC-V? I think x86
> > were considering adding it too, it'd be good if we could get things
> > consistent.

> Please correct me on this. A token at the top which can't be consumed to restore,
> but serves *just* purely as a marker, right?

Yes, for arm64 we just leave a zero word (which can't be a valid token)
above the stack switch token. That does mean you can't exactly tell that
the top of stack marker is there unless there's also a stack switch
token below it.

> It's a good basic design without a lot of cost.
> I think risc-v should be able to converge on that.

Great.
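The layout described above (a zero word as top-of-stack marker, with the token in the slot directly below it) can be modelled on an ordinary array. This is a user-space sketch only: `place_marker_and_token()` is a hypothetical name, and the token value used here (the address just above the token's own slot, as in the RISC-V patch) is an assumption that does not reflect the arm64 token encoding.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Lay out the top of a modelled shadow stack: a zero marker in the highest
 * slot and a token in the slot below it. Returns the index of the token
 * slot, or -1 if there is no room for both words.
 */
static long place_marker_and_token(uintptr_t *stack, size_t nslots)
{
	if (nslots < 2)
		return -1;

	/* Top-of-stack marker: zero can never be a valid token. */
	stack[nslots - 1] = 0;

	/* Token: points at the address just above its own slot. */
	stack[nslots - 2] = (uintptr_t)&stack[nslots - 1];

	return (long)(nslots - 2);
}
```

Because the marker is just a zero word, a reader of the stack can only be confident it is a marker when a valid token sits immediately below it, which matches the caveat in the reply above.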
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index fee22a3d1b53..8c668269e886 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -102,3 +102,5 @@ obj-$(CONFIG_COMPAT) += compat_vdso/
 
 obj-$(CONFIG_64BIT) += pi/
 obj-$(CONFIG_ACPI) += acpi.o
+
+obj-$(CONFIG_RISCV_USER_CFI) += usercfi.o
diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
new file mode 100644
index 000000000000..35ede2cbc05b
--- /dev/null
+++ b/arch/riscv/kernel/usercfi.c
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023 Rivos, Inc.
+ * Deepak Gupta <debug@rivosinc.com>
+ */
+
+#include <linux/sched.h>
+#include <linux/bitops.h>
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/uaccess.h>
+#include <linux/sizes.h>
+#include <linux/user.h>
+#include <linux/syscalls.h>
+#include <linux/prctl.h>
+#include <asm/csr.h>
+#include <asm/usercfi.h>
+
+#define SHSTK_ENTRY_SIZE sizeof(void *)
+
+/*
+ * Writes on shadow stack can either be `sspush` or `ssamoswap`. `sspush` can happen
+ * implicitly on current shadow stack pointed to by CSR_SSP. `ssamoswap` takes pointer to
+ * shadow stack. To keep it simple, we plan to use `ssamoswap` to perform writes on shadow
+ * stack.
+ */
+static noinline unsigned long amo_user_shstk(unsigned long *addr, unsigned long val)
+{
+	/*
+	 * In case ssamoswap faults, return -1.
+	 * Never expect -1 on shadow stack. Expect return addresses and zero
+	 */
+	unsigned long swap = -1;
+
+	__enable_user_access();
+	asm_volatile_goto(
+		".option push\n"
+		".option arch, +zicfiss\n"
+#ifdef CONFIG_64BIT
+		"1: ssamoswap.d %0, %2, %1\n"
+#else
+		"1: ssamoswap.w %0, %2, %1\n"
+#endif
+		_ASM_EXTABLE(1b, %l[fault])
+		RISCV_ACQUIRE_BARRIER
+		".option pop\n"
+		: "=r" (swap), "+A" (*addr)
+		: "r" (val)
+		: "memory"
+		: fault
+		);
+	__disable_user_access();
+	return swap;
+fault:
+	__disable_user_access();
+	return -1;
+}
+
+/*
+ * Create a restore token on the shadow stack. A token is always XLEN wide
+ * and aligned to XLEN.
+ */
+static int create_rstor_token(unsigned long ssp, unsigned long *token_addr)
+{
+	unsigned long addr;
+
+	/* Token must be aligned */
+	if (!IS_ALIGNED(ssp, SHSTK_ENTRY_SIZE))
+		return -EINVAL;
+
+	/* On RISC-V we're constructing token to be function of address itself */
+	addr = ssp - SHSTK_ENTRY_SIZE;
+
+	if (amo_user_shstk((unsigned long __user *)addr, (unsigned long) ssp) == -1)
+		return -EFAULT;
+
+	if (token_addr)
+		*token_addr = addr;
+
+	return 0;
+}
+
+static unsigned long allocate_shadow_stack(unsigned long addr, unsigned long size,
+				unsigned long token_offset,
+				bool set_tok)
+{
+	int flags = MAP_ANONYMOUS | MAP_PRIVATE;
+	struct mm_struct *mm = current->mm;
+	unsigned long populate, tok_loc = 0;
+
+	if (addr)
+		flags |= MAP_FIXED_NOREPLACE;
+
+	mmap_write_lock(mm);
+	addr = do_mmap(NULL, addr, size, PROT_SHADOWSTACK, flags,
+			VM_SHADOW_STACK, 0, &populate, NULL);
+	mmap_write_unlock(mm);
+
+	if (!set_tok || IS_ERR_VALUE(addr))
+		goto out;
+
+	if (create_rstor_token(addr + token_offset, &tok_loc)) {
+		vm_munmap(addr, size);
+		return -EINVAL;
+	}
+
+	addr = tok_loc;
+
+out:
+	return addr;
+}
+
+SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
+{
+	bool set_tok = flags & SHADOW_STACK_SET_TOKEN;
+	unsigned long aligned_size = 0;
+
+	if (!cpu_supports_shadow_stack())
+		return -EOPNOTSUPP;
+
+	/* Anything other than set token should result in invalid param */
+	if (flags & ~SHADOW_STACK_SET_TOKEN)
+		return -EINVAL;
+
+	/*
+	 * Unlike other architectures, on RISC-V, SSP pointer is held in CSR_SSP and is available
+	 * CSR in all modes. CSR accesses are performed using 12bit index programmed in instruction
+	 * itself. This provides static property on register programming and writes to CSR can't
+	 * be unintentional from programmer's perspective. As long as programmer has guarded areas
+	 * which perform writes to CSR_SSP properly, shadow stack pivoting is not possible. Since
+	 * CSR_SSP is writeable by user mode, it itself can setup a shadow stack token subsequent
+	 * to allocation. Although in order to provide portablity with other architecture (because
+	 * `map_shadow_stack` is arch agnostic syscall), RISC-V will follow expectation of a token
+	 * flag in flags and if provided in flags, setup a token at the base.
+	 */
+
+	/* If there isn't space for a token */
+	if (set_tok && size < SHSTK_ENTRY_SIZE)
+		return -ENOSPC;
+
+	if (addr && (addr % PAGE_SIZE))
+		return -EINVAL;
+
+	aligned_size = PAGE_ALIGN(size);
+	if (aligned_size < size)
+		return -EOVERFLOW;
+
+	return allocate_shadow_stack(addr, aligned_size, size, set_tok);
+}
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 57e8195d0b53..0c0ac6214de6 100644
--- a/include/uapi/asm-generic/mman.h
+++ b/include/uapi/asm-generic/mman.h
@@ -19,4 +19,5 @@
 #define MCL_FUTURE 2 /* lock all future mappings */
 #define MCL_ONFAULT 4 /* lock all pages that are faulted in */
 
+#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */
 #endif /* __ASM_GENERIC_MMAN_H */
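The argument checks at the top of the syscall in the patch above can be exercised in user space with a small model. The sketch below is illustrative only: `check_map_shadow_stack_args()`, `DEMO_PAGE_SIZE` and `DEMO_SET_TOKEN` are hypothetical names, a 4 KiB page size is assumed, and the `cpu_supports_shadow_stack()` check and the actual mapping are left out.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_PAGE_SIZE 4096UL      /* assumed page size, not from the patch */
#define DEMO_SET_TOKEN (1UL << 0)  /* mirrors SHADOW_STACK_SET_TOKEN */

/*
 * Model of the checks in the map_shadow_stack() syscall body. Returns 0 and
 * stores the page-aligned size in *aligned, or returns a negative errno.
 */
static int check_map_shadow_stack_args(unsigned long addr, unsigned long size,
				       unsigned int flags, unsigned long *aligned)
{
	bool set_tok = flags & DEMO_SET_TOKEN;
	unsigned long aligned_size;

	/* Anything other than the set-token flag is rejected. */
	if (flags & ~DEMO_SET_TOKEN)
		return -EINVAL;

	/* A token needs at least one shadow stack entry. */
	if (set_tok && size < sizeof(void *))
		return -ENOSPC;

	/* A caller-supplied address must be page aligned. */
	if (addr && (addr % DEMO_PAGE_SIZE))
		return -EINVAL;

	/* Round up to a page; a wrapped result means size was too large. */
	aligned_size = (size + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
	if (aligned_size < size)
		return -EOVERFLOW;

	*aligned = aligned_size;
	return 0;
}
```

Keeping the checks in one pure function like this makes the error ordering easy to see: unknown flags are rejected before the size and alignment checks run.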