From patchwork Sat Feb 3 12:25:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 196295 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7301:9bc1:b0:106:209c:c626 with SMTP id op1csp1001751dyc; Sat, 3 Feb 2024 04:58:06 -0800 (PST) X-Google-Smtp-Source: AGHT+IGups3D9KevHf+pXI/x4JbH0xe05LV/+kzx1kQ6dBymP7L1fPu/u7LVXbUFnYxT+OzGuQ5y X-Received: by 2002:a17:907:6d0b:b0:a36:7327:410a with SMTP id sa11-20020a1709076d0b00b00a367327410amr4207581ejc.58.1706965086650; Sat, 03 Feb 2024 04:58:06 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706965086; cv=pass; d=google.com; s=arc-20160816; b=aZ0WKkC+zlRflKWWjchcF4C0mZc9S8QqLXxrPbFN181N/TsXoSyPKMjlD0N7Kmv5Ms XNNrR781l0jwIS/QjbTxPGG5RM6zO6bLVcfpsNkPaPVRPOt0hHeIfAb9wqEEYqHUjbU0 0j+WdqJb/MOEIPwxI9RvV2NrcpreryMcD3/3KlHdYB8U6UkeHdmfkOlei1skvC+4fytw r4AZRiUECDhgDaXRS054d/2rf7f7nML+I3R8CBUyZTPV7t/Z3ZmQZHRg1lQAkMV1YLe+ PnYpzDuCvDRF0JGMzlszXXc7O6M7HZriA1PNaEtO1rnEGB7vVJEfTtR1J4mN9wRFEZIC Ynow== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:list-unsubscribe:list-subscribe:list-id:precedence :subject:date:from:dkim-signature; bh=yfROQPcXDVIB71bS35faVxk7xn+z5Xv6pFXSd0VATng=; fh=rphXIu6oWGRyZ/za4ANRrY72GIOMWeIGKQXl6g+v+9Q=; b=WqzFiqI6TFbfAQ1T1ZHM9YxHU55a6kF2CEj4qbfbU9ZiA7LLOXHrHG+stIqfdtwKk5 fx2hVlnOMQxZynfnVO0zLPx4Jy55tb2mUHXwPOgrhdpxnZ97WDS314nV7f6f49KKbtD1 fiUv3PoYLFSiqbLCKPH68oysCdMRLn2GGILLQFfKksNsXQdLctQnKs7IAzSjF+fM1ANp gh65TSov8D/w7v85tGGsh/e8oJZzqUNWCZG+2QddlttL0gLi8Dr+QtwuXCNPPRmaJjq3 6F7k4GNko+OCabUoJ+t4nB/u7ZcIu3EaOcXIPHCbHhXLoYsMBTiqGQvOvCgTvbSkUoj1 is1Q==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=cyNcuS0y; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-51062-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-51062-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org X-Forwarded-Encrypted: i=1; AJvYcCW32BnF1DgzcZNyQhPqln/8QTNcZps9Yn1ljWdKq3Qx/xdKepV/yvVNxlQ5fcwJa+q36bxSALaJM7JJEKQxhj28XPtl7g== Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id pk7-20020a170906d7a700b00a36c41d19ddsi1773437ejb.916.2024.02.03.04.58.06 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 03 Feb 2024 04:58:06 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-51062-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=cyNcuS0y; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-51062-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-51062-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 04FB21F263FF for ; Sat, 3 Feb 2024 12:38:18 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 936E373165; Sat, 3 Feb 2024 12:32:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="cyNcuS0y" Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BB8E45F48F; Sat, 3 Feb 2024 12:31:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706963516; cv=none; b=JpVkatNMDU7TDo309anhVpQuM6GN1GuCcUVUMxGFxJImV59WDSSzoOUcev7fc7AKx1CCdAVE0c/rwDSD54hZTZmtatUGvNmdzfwGw3jfJRa1UOQqT3bLCKfhn1eJV2Zo59c+Ya7bQWA3IOrsAyK/hiNku5S1I7P54ySCw/tlw8o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706963516; c=relaxed/simple; bh=0Pd0/0g5ocUKEQPs+CZnjpGh8a2yrzXoWJs/48QxMEE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=LusQrer36KAoEedHk8WFq/6S77xsMvULmAomJIzg7hly0/cThLGaXZArm/JBIqZ69ZoxXxmFfEKrfQOIM0mNw4lE5bHwgWY2R0ru1HBHkNmBGrFKUjMiOh38fbydSXsM5RcLnKYdRyeHlH8NmEgD0L1yGOp5qVtrx8x/IJnFhQc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=cyNcuS0y; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2BDE6C43609; Sat, 3 Feb 2024 12:31:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1706963516; bh=0Pd0/0g5ocUKEQPs+CZnjpGh8a2yrzXoWJs/48QxMEE=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=cyNcuS0yepL5PtUjvbc/4GFBi2f/FzV2Rq68YWIi6IyGPqro5gwLsktAkGRmwPo1a DPZhVCUVajR9H8/X/Mfrr5g/19Z9FqBbjSBiqN49LicQByhoDK78Kzey/i8uaKehHs LTRp/whj0BtqnzkxIl1rHWzXdbCik8Y0arm5N5cnVYOH4p9zdydt9tsy7G5jSIPUlE ghVEwmHGsW1EjfuxRTurzDKjJr6RXkxUft3/jf/K1OSaiK4+nwoWVbcIiXmp8OuW1J s8rV2N0mpTA5rWd/gCd4jovg5NSF9MQOzC4oohJwqa/2pI0gR/8eV2YU/LnDYnWV6Q JBz+FjYiklaJA== From: Mark Brown Date: Sat, 03 Feb 2024 12:25:46 +0000 Subject: [PATCH v8 20/38] arm64/gcs: Ensure that new threads have a GCS Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240203-arm64-gcs-v8-20-c9fec77673ef@kernel.org> References: <20240203-arm64-gcs-v8-0-c9fec77673ef@kernel.org> In-Reply-To: <20240203-arm64-gcs-v8-0-c9fec77673ef@kernel.org> To: Catalin Marinas , Will Deacon , Jonathan Corbet , Andrew Morton , Marc Zyngier , Oliver Upton , James Morse , Suzuki K Poulose , Arnd Bergmann , Oleg Nesterov , Eric Biederman , Kees Cook , Shuah Khan , "Rick P. Edgecombe" , Deepak Gupta , Ard Biesheuvel , Szabolcs Nagy Cc: "H.J. Lu" , Paul Walmsley , Palmer Dabbelt , Albert Ou , Florian Weimer , Christian Brauner , Thiago Jung Bauermann , linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Mark Brown X-Mailer: b4 0.13-dev-a684c X-Developer-Signature: v=1; a=openpgp-sha256; l=8134; i=broonie@kernel.org; h=from:subject:message-id; bh=0Pd0/0g5ocUKEQPs+CZnjpGh8a2yrzXoWJs/48QxMEE=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBlvjDgHmbpnJ6Kt5FNSJQp2DtC/IJQdbRg85cvhx1K Miz2+d6JATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCZb4w4AAKCRAk1otyXVSH0Nt4B/ wL1XPKLA8oon7Ntls/p18FPxhxgITjHFhOhCLATvZLdx2/CGCM3XeV/oZsdqUKAMfvNSiqdk2Hge2L fXSpLOz8lwnFEMhBV3cPt2SQCcib5865B65V4ujM7VbyryF0RdOZBJ41g7KrO1xusFXtASE/2LgTRg cLOGH05SSH2Iikwkk6HQbyru9QOvtzsuefI+bj077rEszYry0kVn8SgYKDtyKel1hhgX47wO6gRvy/ LH1qXI2PclfTBlVo+H8Uqr5qA2Vnn4yZpnxjUdpiKQnpuBPZKk3k3h+YZur6WqivBxp086HvEi+kZZ zA7yjuAcOJULaSipntqexmuqh00q3A X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1789882622596110655 X-GMAIL-MSGID: 1789882622596110655 When a new thread is created by a thread with GCS enabled the GCS needs to be specified along with the regular stack. clone3() has been extended to support this case, allowing userspace to explicitly specify the size and location of the GCS. The specified GCS must have a valid GCS token at the top of the stack, as though userspace were pivoting to the new GCS. This will be consumed on use. At present we do not atomically consume the token, this will be addressed in a future revision. Unfortunately plain clone() is not extensible and existing clone3() users will not specify a stack so all existing code would be broken if we mandated specifying the stack explicitly. For compatibility with these cases and also x86 (which did not initially implement clone3() support for shadow stacks) if no GCS is specified we will allocate one thread so when a thread is created which has GCS enabled allocate one for it. We follow the extensively discussed x86 implementation and allocate min(RLIMIT_STACK, 4G). Since the GCS only stores the call stack and not any variables this should be more than sufficient for most applications. GCSs allocated via this mechanism then it will be freed when the thread exits, those explicitly configured by the user will not. Signed-off-by: Mark Brown --- arch/arm64/include/asm/gcs.h | 9 ++++ arch/arm64/kernel/process.c | 29 +++++++++++ arch/arm64/mm/gcs.c | 117 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 155 insertions(+) diff --git a/arch/arm64/include/asm/gcs.h b/arch/arm64/include/asm/gcs.h index 04594ef59dad..c1f274fdb9c0 100644 --- a/arch/arm64/include/asm/gcs.h +++ b/arch/arm64/include/asm/gcs.h @@ -8,6 +8,8 @@ #include #include +struct kernel_clone_args; + static inline void gcsb_dsync(void) { asm volatile(".inst 0xd503227f" : : : "memory"); @@ -58,6 +60,8 @@ static inline bool task_gcs_el0_enabled(struct task_struct *task) void gcs_set_el0_mode(struct task_struct *task); void gcs_free(struct task_struct *task); void gcs_preserve_current_state(void); +unsigned long gcs_alloc_thread_stack(struct task_struct *tsk, + const struct kernel_clone_args *args); #else @@ -69,6 +73,11 @@ static inline bool task_gcs_el0_enabled(struct task_struct *task) static inline void gcs_set_el0_mode(struct task_struct *task) { } static inline void gcs_free(struct task_struct *task) { } static inline void gcs_preserve_current_state(void) { } +static inline unsigned long gcs_alloc_thread_stack(struct task_struct *tsk, + const struct kernel_clone_args *args) +{ + return -ENOTSUPP; +} #endif diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index fd80b43c2969..8bd66cde0a86 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -285,9 +285,32 @@ static void flush_gcs(void) write_sysreg_s(0, SYS_GCSPR_EL0); } +static int copy_thread_gcs(struct task_struct *p, + const struct kernel_clone_args *args) +{ + unsigned long gcs; + + gcs = gcs_alloc_thread_stack(p, args); + if (IS_ERR_VALUE(gcs)) + return PTR_ERR((void *)gcs); + + p->thread.gcs_el0_mode = current->thread.gcs_el0_mode; + p->thread.gcs_el0_locked = current->thread.gcs_el0_locked; + + /* Ensure the current state of the GCS is seen by CoW */ + gcsb_dsync(); + + return 0; +} + #else static void flush_gcs(void) { } +static int copy_thread_gcs(struct task_struct *p, + const struct kernel_clone_args *args) +{ + return 0; +} #endif @@ -303,6 +326,7 @@ void flush_thread(void) void arch_release_task_struct(struct task_struct *tsk) { fpsimd_release_task(tsk); + gcs_free(tsk); } int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) @@ -369,6 +393,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) unsigned long stack_start = args->stack; unsigned long tls = args->tls; struct pt_regs *childregs = task_pt_regs(p); + int ret; memset(&p->thread.cpu_context, 0, sizeof(struct cpu_context)); @@ -410,6 +435,10 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) p->thread.uw.tp_value = tls; p->thread.tpidr2_el0 = 0; } + + ret = copy_thread_gcs(p, args); + if (ret != 0) + return ret; } else { /* * A kthread has no context to ERET to, so ensure any buggy diff --git a/arch/arm64/mm/gcs.c b/arch/arm64/mm/gcs.c index b0a67efc522b..3cbc3a3d4bc7 100644 --- a/arch/arm64/mm/gcs.c +++ b/arch/arm64/mm/gcs.c @@ -8,6 +8,113 @@ #include #include +static unsigned long alloc_gcs(unsigned long addr, unsigned long size, + unsigned long token_offset, bool set_res_tok) +{ + int flags = MAP_ANONYMOUS | MAP_PRIVATE; + struct mm_struct *mm = current->mm; + unsigned long mapped_addr, unused; + + if (addr) + flags |= MAP_FIXED_NOREPLACE; + + mmap_write_lock(mm); + mapped_addr = do_mmap(NULL, addr, size, PROT_READ | PROT_WRITE, flags, + VM_SHADOW_STACK, 0, &unused, NULL); + mmap_write_unlock(mm); + + return mapped_addr; +} + +static unsigned long gcs_size(unsigned long size) +{ + if (size) + return PAGE_ALIGN(size); + + /* Allocate RLIMIT_STACK/2 with limits of PAGE_SIZE..2G */ + size = PAGE_ALIGN(min_t(unsigned long long, + rlimit(RLIMIT_STACK) / 2, SZ_2G)); + return max(PAGE_SIZE, size); +} + +static bool gcs_consume_token(struct task_struct *tsk, unsigned long user_addr) +{ + unsigned long expected = GCS_CAP(user_addr); + unsigned long val; + int ret = 0; + + /* This should really be an atomic cpmxchg. It is not. */ + __get_user_error(val, (__user unsigned long *)user_addr, ret); + if (ret != 0) + return false; + + if (val != expected) + return false; + + put_user_gcs(0, (__user unsigned long*)user_addr, &ret); + + return ret == 0; +} + +unsigned long gcs_alloc_thread_stack(struct task_struct *tsk, + const struct kernel_clone_args *args) +{ + unsigned long addr, size, gcspr_el0; + + /* If the user specified a GCS use it. */ + if (args->shadow_stack_size) { + if (!system_supports_gcs()) + return (unsigned long)ERR_PTR(-EINVAL); + + addr = args->shadow_stack; + size = args->shadow_stack_size; + + /* + * There should be a token, there might be an end of + * stack marker. + */ + gcspr_el0 = addr + size - (2 * sizeof(u64)); + if (!gcs_consume_token(tsk, gcspr_el0)) { + gcspr_el0 += sizeof(u64); + if (!gcs_consume_token(tsk, gcspr_el0)) + return (unsigned long)ERR_PTR(-EINVAL); + } + + /* Userspace is responsible for unmapping */ + tsk->thread.gcspr_el0 = gcspr_el0 + sizeof(u64); + } else { + + /* + * Otherwise fall back to legacy clone() support and + * implicitly allocate a GCS if we need a new one. + */ + + if (!system_supports_gcs()) + return 0; + + if (!task_gcs_el0_enabled(tsk)) + return 0; + + if ((args->flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM) { + tsk->thread.gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0); + return 0; + } + + size = args->stack_size; + + size = gcs_size(size); + addr = alloc_gcs(0, size, 0, 0); + if (IS_ERR_VALUE(addr)) + return addr; + + tsk->thread.gcs_base = addr; + tsk->thread.gcs_size = size; + tsk->thread.gcspr_el0 = addr + size - sizeof(u64); + } + + return addr; +} + /* * Apply the GCS mode configured for the specified task to the * hardware. @@ -30,6 +137,16 @@ void gcs_set_el0_mode(struct task_struct *task) void gcs_free(struct task_struct *task) { + + /* + * When fork() with CLONE_VM fails, the child (tsk) already + * has a GCS allocated, and exit_thread() calls this function + * to free it. In this case the parent (current) and the + * child share the same mm struct. + */ + if (!task->mm || task->mm != current->mm) + return; + if (task->thread.gcs_base) vm_munmap(task->thread.gcs_base, task->thread.gcs_size);