Message ID | 20230227222957.24501-29-rick.p.edgecombe@intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp2682825wrd; Mon, 27 Feb 2023 14:34:50 -0800 (PST) X-Google-Smtp-Source: AK7set+7Q3Ctsoh5AknQDACCXIhxigAzfy9aaesYAfGD6zwq5AKmLDZ9cqM6UMVLrszSuHctS3ZH X-Received: by 2002:aa7:cb83:0:b0:4af:64b8:262e with SMTP id r3-20020aa7cb83000000b004af64b8262emr1356834edt.6.1677537290232; Mon, 27 Feb 2023 14:34:50 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677537290; cv=none; d=google.com; s=arc-20160816; b=t6fY0wYUDlTYvXf0YE/kySTYsM6BhWJ2Ns/x25gRyjjxfVF4eX44OaV0MAxV55r1BP zBuUTQr8btka9CtOP+y3nGdjhRwSCynfXRwMqwFt+Go8NZTLc4YxYNhkpIhOxXtJ5icy gpZJQsKn8kACx4zP8kjvGTbSP+z70mIBIowkV5rnEs24ThuWUw0bb4okjvva692PiZ2F ag4h3RdWmX8NdwizO+UQWDRfdNmNsk0TTwxfFjzp96pI2SKh+3JfKAwcktbVnaUkJIWt qQB76f3heff6ZP330R6EuWLYEHv2wXpthl4iIcIwGkYc1madvFZyCev9dl8b2Vhfx6jY iFYg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:references:in-reply-to:message-id:date:subject :cc:to:from:dkim-signature; bh=1i/FzPbqzxTXG8dJ1u/9sE/uM7j0WUeqnnBeybnxIC8=; b=y5CI1d6fMxxV4akC+o0Wb0baVIGxYX0pYZYRdJ3N/OItEjPBLVMxQus4NikLtk0Hqv 78AIjpqQv3VFMbG+XT/+/kwveNtyf8EilvXmJ8+SMzhS5+7U3hn/3TjqxR9fNiFiXwUV +woc0LNOWrYK1Ui8oEWZhNAlaQ0XTZr8t76PO52+nVNJgxOHE1iKMKstcIQW3s4t3ck2 h7I1dHCJBq6uo9c+lvmYHiaZdLCVaUx+RbsWKUE+hvKnItwXqZxjmBsc4/QLdK03dWZT FQI+q2Tirt0a79rAQdKlzTH8DINPO5NXxPOKR6w55t8aDD+x4ddsuV5JJdZb66gFMdwl XyqA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=XRa40xyR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id g15-20020a056402180f00b004aeeb0e1654si9067820edy.431.2023.02.27.14.34.23; Mon, 27 Feb 2023 14:34:50 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=XRa40xyR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229898AbjB0Wde (ORCPT <rfc822;wenzhi022@gmail.com> + 99 others); Mon, 27 Feb 2023 17:33:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38734 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230199AbjB0WcU (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 27 Feb 2023 17:32:20 -0500 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C5CF829E3C; Mon, 27 Feb 2023 14:31:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1677537118; x=1709073118; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=u1FQlCSURy2KLxFeOQDLZBSKCZJ7Cv/holb6byWhLUc=; b=XRa40xyR/0EdBFUkSO1MOuFkUaX/oTYk8Jpk/fA6JFP+pU+rTXENRTBs UjzTG5YLUyyxfAtil787bG43SicBJmh8Tzyw/SAwILNbAgOwWdZXWy/JY OlvFIDwoX/GsKPXNQ+BQiN6FBgLWVicLoiqAe7xT7li9haJJ1wYWBKKD9 3jRuqiQzBgL5NYxHLO/VcbVLIq9h4Y9LsLdaSM83eKybwu+axXypQado9 c6yzdTXzn0CHkiG65MRhTt+kOxu2Lu4/mu5w0rrX0riQKLeERILxzyVLJ C8mMSg1I+SHjPENpqy62CYVUaX4oJJNIP0aNLxNIgle6YUGDw7Up8xYhS w==; X-IronPort-AV: E=McAfee;i="6500,9779,10634"; a="313657669" X-IronPort-AV: E=Sophos;i="5.98,220,1673942400"; d="scan'208";a="313657669" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Feb 2023 14:31:28 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10634"; a="848024703" X-IronPort-AV: E=Sophos;i="5.98,220,1673942400"; d="scan'208";a="848024703" Received: from leonqu-mobl1.amr.corp.intel.com (HELO rpedgeco-desk.amr.corp.intel.com) ([10.209.72.19]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Feb 2023 14:31:27 -0800 From: Rick Edgecombe <rick.p.edgecombe@intel.com> To: x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann <arnd@arndb.de>, Andy Lutomirski <luto@kernel.org>, Balbir Singh <bsingharora@gmail.com>, Borislav Petkov <bp@alien8.de>, Cyrill Gorcunov <gorcunov@gmail.com>, Dave Hansen <dave.hansen@linux.intel.com>, Eugene Syromiatnikov <esyr@redhat.com>, Florian Weimer <fweimer@redhat.com>, "H . J . Lu" <hjl.tools@gmail.com>, Jann Horn <jannh@google.com>, Jonathan Corbet <corbet@lwn.net>, Kees Cook <keescook@chromium.org>, Mike Kravetz <mike.kravetz@oracle.com>, Nadav Amit <nadav.amit@gmail.com>, Oleg Nesterov <oleg@redhat.com>, Pavel Machek <pavel@ucw.cz>, Peter Zijlstra <peterz@infradead.org>, Randy Dunlap <rdunlap@infradead.org>, Weijiang Yang <weijiang.yang@intel.com>, "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>, John Allen <john.allen@amd.com>, kcc@google.com, eranian@google.com, rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com, akpm@linux-foundation.org, Andrew.Cooper3@citrix.com, christina.schimpe@intel.com, david@redhat.com, debug@rivosinc.com Cc: rick.p.edgecombe@intel.com Subject: [PATCH v7 28/41] x86: Introduce userspace API for shadow stack Date: Mon, 27 Feb 2023 14:29:44 -0800 Message-Id: <20230227222957.24501-29-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230227222957.24501-1-rick.p.edgecombe@intel.com> References: <20230227222957.24501-1-rick.p.edgecombe@intel.com> X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_PASS,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759025341211678844?= X-GMAIL-MSGID: =?utf-8?q?1759025341211678844?= |
Series |
Shadow stacks for userspace
|
|
Commit Message
Edgecombe, Rick P
Feb. 27, 2023, 10:29 p.m. UTC
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Add three new arch_prctl() handles: - ARCH_SHSTK_ENABLE/DISABLE enables or disables the specified feature. Returns 0 on success or an error. - ARCH_SHSTK_LOCK prevents future disabling or enabling of the specified feature. Returns 0 on success or an error The features are handled per-thread and inherited over fork(2)/clone(2), but reset on exec(). This is preparation patch. It does not implement any features. Tested-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: John Allen <john.allen@amd.com> Tested-by: Kees Cook <keescook@chromium.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> [tweaked with feedback from tglx] Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> --- v4: - Remove references to CET and replace with shadow stack (Peterz) v3: - Move shstk.c Makefile changes earlier (Kees) - Add #ifdef around features_locked and features (Kees) - Encapsulate features reset earlier in reset_thread_features() so features and features_locked are not referenced in code that would be compiled !CONFIG_X86_USER_SHADOW_STACK. (Kees) - Fix typo in commit log (Kees) - Switch arch_prctl() numbers to avoid conflict with LAM v2: - Only allow one enable/disable per call (tglx) - Return error code like a normal arch_prctl() (Alexander Potapenko) - Make CET only (tglx) --- arch/x86/include/asm/processor.h | 6 +++++ arch/x86/include/asm/shstk.h | 21 +++++++++++++++ arch/x86/include/uapi/asm/prctl.h | 6 +++++ arch/x86/kernel/Makefile | 2 ++ arch/x86/kernel/process_64.c | 7 ++++- arch/x86/kernel/shstk.c | 44 +++++++++++++++++++++++++++++++ 6 files changed, 85 insertions(+), 1 deletion(-) create mode 100644 arch/x86/include/asm/shstk.h create mode 100644 arch/x86/kernel/shstk.c
Comments
On Mon, Feb 27, 2023 at 02:29:44PM -0800, Rick Edgecombe wrote: > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > Add three new arch_prctl() handles: > > - ARCH_SHSTK_ENABLE/DISABLE enables or disables the specified > feature. Returns 0 on success or an error. "... or a negative value on error." > - ARCH_SHSTK_LOCK prevents future disabling or enabling of the > specified feature. Returns 0 on success or an error ditto. What is the use case of the feature locking? I'm under the simple assumption that once shstk is enabled for an app, it remains so. I guess my question is rather, what's the use case for enabling shadow stack and then disabling it later for an app...? > The features are handled per-thread and inherited over fork(2)/clone(2), > but reset on exec(). > > This is preparation patch. It does not implement any features. That belongs under the "---" line I guess. > Tested-by: Pengfei Xu <pengfei.xu@intel.com> > Tested-by: John Allen <john.allen@amd.com> > Tested-by: Kees Cook <keescook@chromium.org> > Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> > Reviewed-by: Kees Cook <keescook@chromium.org> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > [tweaked with feedback from tglx] > Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> > > --- > v4: > - Remove references to CET and replace with shadow stack (Peterz) > > v3: > - Move shstk.c Makefile changes earlier (Kees) > - Add #ifdef around features_locked and features (Kees) > - Encapsulate features reset earlier in reset_thread_features() so > features and features_locked are not referenced in code that would be > compiled !CONFIG_X86_USER_SHADOW_STACK. (Kees) > - Fix typo in commit log (Kees) > - Switch arch_prctl() numbers to avoid conflict with LAM > > v2: > - Only allow one enable/disable per call (tglx) > - Return error code like a normal arch_prctl() (Alexander Potapenko) > - Make CET only (tglx) > --- > arch/x86/include/asm/processor.h | 6 +++++ > arch/x86/include/asm/shstk.h | 21 +++++++++++++++ > arch/x86/include/uapi/asm/prctl.h | 6 +++++ > arch/x86/kernel/Makefile | 2 ++ > arch/x86/kernel/process_64.c | 7 ++++- > arch/x86/kernel/shstk.c | 44 +++++++++++++++++++++++++++++++ > 6 files changed, 85 insertions(+), 1 deletion(-) > create mode 100644 arch/x86/include/asm/shstk.h > create mode 100644 arch/x86/kernel/shstk.c ... > +long shstk_prctl(struct task_struct *task, int option, unsigned long features) > +{ > + if (option == ARCH_SHSTK_LOCK) { > + task->thread.features_locked |= features; > + return 0; > + } > + > + /* Don't allow via ptrace */ > + if (task != current) > + return -EINVAL; > + > + /* Do not allow to change locked features */ > + if (features & task->thread.features_locked) > + return -EPERM; > + > + /* Only support enabling/disabling one feature at a time. */ > + if (hweight_long(features) > 1) > + return -EINVAL; > + > + if (option == ARCH_SHSTK_DISABLE) { > + return -EINVAL; > + } {} braces left over from some previous version. Can go now.
On Wed, 2023-03-08 at 11:27 +0100, Borislav Petkov wrote: > On Mon, Feb 27, 2023 at 02:29:44PM -0800, Rick Edgecombe wrote: > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > > > Add three new arch_prctl() handles: > > > > - ARCH_SHSTK_ENABLE/DISABLE enables or disables the specified > > feature. Returns 0 on success or an error. > > "... or a negative value on error." Sure. > > > - ARCH_SHSTK_LOCK prevents future disabling or enabling of the > > specified feature. Returns 0 on success or an error > > ditto. > > What is the use case of the feature locking? > > I'm under the simple assumption that once shstk is enabled for an > app, > it remains so. I guess my question is rather, what's the use case for > enabling shadow stack and then disabling it later for an app...? This would be for things like the "permissive mode", where glibc determines that it has to do something like dlopen() an unsupporting DSO much later. But being able to late lock the features is required for the working behavior of glibc as well. Glibc enables shadow stack very early, then disables it later if it finds that any of the normal dynamic libraries don't support it. It only locks shadow stack after this point even in non-permissive mode. The selftest also does a lot of enabling and disabling. > > > The features are handled per-thread and inherited over > > fork(2)/clone(2), > > but reset on exec(). > > > > This is preparation patch. It does not implement any features. > > That belongs under the "---" line I guess. Oh, yes. > > > Tested-by: Pengfei Xu <pengfei.xu@intel.com> > > Tested-by: John Allen <john.allen@amd.com> > > Tested-by: Kees Cook <keescook@chromium.org> > > Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> > > Reviewed-by: Kees Cook <keescook@chromium.org> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > [tweaked with feedback from tglx] > > Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> > > > > --- > > v4: > > - Remove references to CET and replace with shadow stack (Peterz) > > > > v3: > > - Move shstk.c Makefile changes earlier (Kees) > > - Add #ifdef around features_locked and features (Kees) > > - Encapsulate features reset earlier in reset_thread_features() so > > features and features_locked are not referenced in code that > > would be > > compiled !CONFIG_X86_USER_SHADOW_STACK. (Kees) > > - Fix typo in commit log (Kees) > > - Switch arch_prctl() numbers to avoid conflict with LAM > > > > v2: > > - Only allow one enable/disable per call (tglx) > > - Return error code like a normal arch_prctl() (Alexander > > Potapenko) > > - Make CET only (tglx) > > --- > > arch/x86/include/asm/processor.h | 6 +++++ > > arch/x86/include/asm/shstk.h | 21 +++++++++++++++ > > arch/x86/include/uapi/asm/prctl.h | 6 +++++ > > arch/x86/kernel/Makefile | 2 ++ > > arch/x86/kernel/process_64.c | 7 ++++- > > arch/x86/kernel/shstk.c | 44 > > +++++++++++++++++++++++++++++++ > > 6 files changed, 85 insertions(+), 1 deletion(-) > > create mode 100644 arch/x86/include/asm/shstk.h > > create mode 100644 arch/x86/kernel/shstk.c > > ... > > > +long shstk_prctl(struct task_struct *task, int option, unsigned > > long features) > > +{ > > + if (option == ARCH_SHSTK_LOCK) { > > + task->thread.features_locked |= features; > > + return 0; > > + } > > + > > + /* Don't allow via ptrace */ > > + if (task != current) > > + return -EINVAL; > > + > > + /* Do not allow to change locked features */ > > + if (features & task->thread.features_locked) > > + return -EPERM; > > + > > + /* Only support enabling/disabling one feature at a time. */ > > + if (hweight_long(features) > 1) > > + return -EINVAL; > > + > > + if (option == ARCH_SHSTK_DISABLE) { > > + return -EINVAL; > > + } > > {} braces left over from some previous version. Can go now. > This was intentional, but I wasn't sure on it. It makes the diff cleaner in later patches, is the reason.
On Wed, Mar 08, 2023 at 11:32:36PM +0000, Edgecombe, Rick P wrote: > This would be for things like the "permissive mode", where glibc > determines that it has to do something like dlopen() an unsupporting > DSO much later. > > But being able to late lock the features is required for the working > behavior of glibc as well. Glibc enables shadow stack very early, then > disables it later if it finds that any of the normal dynamic libraries > don't support it. It only locks shadow stack after this point even in > non-permissive mode. So this all sounds weird. Especially from a user point of view. Now let's imagine there's a Linux user called Boris and he goes and buys a CPU which supports shadow stack, gets a distro which has shadow stack enabled. All good. Now, at some point he loads a program which pulls in an old library which hasn't been enabled for shadow stack yet. In the name of not breaking stuff, his glibc is configured in permissive mode by default so that program loads and shadow stack for it is disabled. And Boris doesn't even know and continues on his merry way thinking that he has all that cool ROP protection. So where is the knob that says, "disable permissive mode"? Or at least where does the user get a warning saying, "hey, this app doesn't do shadow stack and we disabled it for ya so that it can still work"? Or am I way off? I hope you're catching my drift. Because if there's no enforcement of shstk and we do this permissive mode by default, this whole overhead is just a unnecessary nuisance... But maybe that'll come later and I should keep going through the set... Thx.
On Thu, 2023-03-09 at 13:57 +0100, Borislav Petkov wrote: > So this all sounds weird. Especially from a user point of view. > > Now let's imagine there's a Linux user called Boris and he goes and > buys > a CPU which supports shadow stack, gets a distro which has shadow > stack > enabled. All good. > > Now, at some point he loads a program which pulls in an old library > which hasn't been enabled for shadow stack yet. > > In the name of not breaking stuff, his glibc is configured in > permissive > mode by default so that program loads and shadow stack for it is > disabled. > > And Boris doesn't even know and continues on his merry way thinking > that > he has all that cool ROP protection. There is a proc that shows if shadow stack is enabled in a thread. It does indeed come later in the series. > > So where is the knob that says, "disable permissive mode"? glibc has an environment variable that can change the loader's behavior. There is also a compile time config for the default mode. But this "permissive mode" is a glibc thing. The kernel doesn't implement it per-se, just provides building blocks. > > Or at least where does the user get a warning saying, "hey, this app > doesn't do shadow stack and we disabled it for ya so that it can > still > work"? > > Or am I way off? I don't think so. The whole "when to enable shadow stack" question is thornier than it might seem though, and what we have here is based on some trade offs in the details. > > I hope you're catching my drift. Because if there's no enforcement of > shstk and we do this permissive mode by default, this whole overhead > is > just a unnecessary nuisance... In the existing glibc patches, and this is highly related to glibc behavior because the decisions around enabling and locking have been pushed there, there are two reasons why shadow stack would get disabled on an supporting executable after it gets enabled. 1. An executable is loaded and one of the shared objects (the ones that come out of ldd) does not support shadow stack 2. An executable is loaded in permissive mode, and much later during execution dlopen()s a DSO that does not support shadow stack. One of the challenges with enabling shadow stack is you only start recording the shadow stack history when you enable it. If you enable it at some point, and then return from that function you underflow the shadow stack and get a violation. So if the shadow stack will be locked, it has to be enabled at the earliest point it might return to at some point (for example after returning from main()). So in 1, the existing logic of glibc is to enable shadow stack at the very beginning of the loader. Then go through the whole loading/linking process. If problems are found, disable shadow stack. If no problems are found, then lock it. I've wondered if this super early glibc enabling behavior is really needed and if they could enable it after processing the linked libraries in the elf. Then save the work of enabling and disabling shadow stack for situations that don't support it. To me this is the big wart in the whole thing, but I don't think the kernel can help resolve it. If glibc can enable later, then we can combine the locking and enabling into a single operation. But it only saves a syscall and it might prevent some other libc that needs to do things like glibc does currently, from being able to make it work at all. In 2, the enabling happens like normal and the locking is skipped, so that shadow stack can be enabled during a dlopen(). But glibc permissive mode promises more than it delivers. Since it can only disable shadow stack per-thread, it leaves the other threads enabled. Making a working permissive mode is sort of an unsolved problem. There are some proposals to make it work in just userspace, and some that would need additional kernel support. If you are interested I can go into why per-process disabling is not straightforward. So the locking is needed for the basic support in 1 and the weak permissive mode in 2 uses it. I am considering this series to support 1, but people may end up using 2 to get some permissive-ness. In general the idea of this API is to push the enabling decisions into userspace because that is where the information for making the decision is known. We previously tried to add some batch operations to improve the performance, but tglx had suggested to start with something simple. So we end up with this simple composable API.
On Thu, Mar 09, 2023 at 04:56:37PM +0000, Edgecombe, Rick P wrote: > There is a proc that shows if shadow stack is enabled in a thread. It > does indeed come later in the series. Not good enough: 1. buried somewhere in proc where no one knows about it 2. it is per thread so user needs to grep *all* > ... We previously tried to add some batch operations to improve the > performance, but tglx had suggested to start with something simple. > So we end up with this simple composable API. I agree with starting simple and thanks for explaining this in detail. TBH, though, it already sounds like a mess to me. I guess a mess we'll have to deal with because there will always be this case of some shared object/lib not being enabled for shstk because of raisins. And TBH #2, I would've done it even simpler: if some shared object can't do shadow stack, we disable it for the whole process. I mean, what's the point? Only some of the stack is shadowed so an attacker could find a way to keep the process perhaps run this shstk-unsupporting shared object more/longer and ROP its way around the system. But I tend to oversimplify things sometimes so... What I'd like to have, though, is a kernel cmdline param which disables permissive mode and userspace can't do anything about it. So that once you boot your kernel, you can know that everything that runs on the machine has shstk and is properly protected. Also, it'll allow for faster fixing of all those shared objects to use shstk by way of political pressure. Thx.
+Joao regarding mixed mode designs On Fri, 2023-03-10 at 00:51 +0100, Borislav Petkov wrote: > On Thu, Mar 09, 2023 at 04:56:37PM +0000, Edgecombe, Rick P wrote: > > There is a proc that shows if shadow stack is enabled in a thread. > > It > > does indeed come later in the series. > > Not good enough: > > 1. buried somewhere in proc where no one knows about it > > 2. it is per thread so user needs to grep *all* See "x86: Expose thread features in /proc/$PID/status" for the patch. We could emit something in dmesg I guess? The logic would be: - Record the presence of elf SHSTK bit on exec - On shadow stack disable, if it had the elf bit, pr_info("bad!") > > > ... We previously tried to add some batch operations to improve > > the > > performance, but tglx had suggested to start with something > > simple. > > So we end up with this simple composable API. > > I agree with starting simple and thanks for explaining this in > detail. > > TBH, though, it already sounds like a mess to me. I guess a mess > we'll > have to deal with because there will always be this case of some > shared object/lib not being enabled for shstk because of raisins. The compatibility problems are totally the mess in this whole thing. When you try to look at a "permissive" mode that actually works it gets even more complex. Joao and I have been banging our heads on that problem for months. But there are some expected users of this that say: we compile and check our known set of binaries, we won't get any surprises. So it's more of a distro problem. > > And TBH #2, I would've done it even simpler: if some shared object > can't > do shadow stack, we disable it for the whole process. I mean, what's > the > point? You mean a late loaded dlopen()ed DSO? The enabling logic can't know this will happen ahead of time. If you mean if the shared objects in the elf all support shadow stack, then this is what happens. The complication is that the loader wants to enable shadow stack before it has checked the elf libs so it doesn't underflow the shadow stack when it returns from the function that does this checking. So it does: 1. Enable shadow stack 2. Call elf libs checking functions 3. If all good, lock shadow stack. Else, disable shadow stack. 4. Return from elf checking functions and if shstk is enabled, don't underflow because it was enabled in step 1 and we have return addresses from 2 on the shadow stack I'm wondering if this can't be improved in glibc to look like: 1. Check elf libs, and record it somewhere 2. Wait until just the right spot 3. If all good, enable and lock shadow stack. But it depends on the loader code design which I don't know well enough. > Only some of the stack is shadowed so an attacker could find > a way to keep the process perhaps run this shstk-unsupporting shared > object more/longer and ROP its way around the system. I hope non-permissive mode is the standard usage eventually. > > But I tend to oversimplify things sometimes so... > > What I'd like to have, though, is a kernel cmdline param which > disables > permissive mode and userspace can't do anything about it. So that > once > you boot your kernel, you can know that everything that runs on the > machine has shstk and is properly protected. Szabolcs Nagy was commenting something similar in another thread, for supporting kernel enforced security policies. I think the way to do it would have the kernel detect the the elf bit itself (like it used to) and enable shadow stack on exec. If you can't rely on userspace to call in to enable it, it's not clear at what point the kernel should check that it did. But then if you trigger off of the elf bit in the kernel, you get all the regression issues of the old glibcs at that point. But it is already an "I don't care if I crash" mode, so... I think if you trust your libc, glibc could implement this in userspace too. It would be useful even as as testing override. > > Also, it'll allow for faster fixing of all those shared objects to > use > shstk by way of political pressure.
On Thu, Mar 9, 2023 at 5:13 PM Edgecombe, Rick P <rick.p.edgecombe@intel.com> wrote: > > +Joao regarding mixed mode designs > > On Fri, 2023-03-10 at 00:51 +0100, Borislav Petkov wrote: > > On Thu, Mar 09, 2023 at 04:56:37PM +0000, Edgecombe, Rick P wrote: > > > There is a proc that shows if shadow stack is enabled in a thread. > > > It > > > does indeed come later in the series. > > > > Not good enough: > > > > 1. buried somewhere in proc where no one knows about it > > > > 2. it is per thread so user needs to grep *all* > > See "x86: Expose thread features in /proc/$PID/status" for the patch. > We could emit something in dmesg I guess? The logic would be: > - Record the presence of elf SHSTK bit on exec > - On shadow stack disable, if it had the elf bit, pr_info("bad!") > > > > > > ... We previously tried to add some batch operations to improve > > > the > > > performance, but tglx had suggested to start with something > > > simple. > > > So we end up with this simple composable API. > > > > I agree with starting simple and thanks for explaining this in > > detail. > > > > TBH, though, it already sounds like a mess to me. I guess a mess > > we'll > > have to deal with because there will always be this case of some > > shared object/lib not being enabled for shstk because of raisins. > > The compatibility problems are totally the mess in this whole thing. > When you try to look at a "permissive" mode that actually works it gets > even more complex. Joao and I have been banging our heads on that > problem for months. > > But there are some expected users of this that say: we compile and > check our known set of binaries, we won't get any surprises. So it's > more of a distro problem. > > > > > And TBH #2, I would've done it even simpler: if some shared object > > can't > > do shadow stack, we disable it for the whole process. I mean, what's > > the > > point? > > You mean a late loaded dlopen()ed DSO? The enabling logic can't know > this will happen ahead of time. > > If you mean if the shared objects in the elf all support shadow stack, > then this is what happens. The complication is that the loader wants to > enable shadow stack before it has checked the elf libs so it doesn't > underflow the shadow stack when it returns from the function that does > this checking. > > So it does: > 1. Enable shadow stack > 2. Call elf libs checking functions > 3. If all good, lock shadow stack. Else, disable shadow stack. > 4. Return from elf checking functions and if shstk is enabled, don't > underflow because it was enabled in step 1 and we have return addresses > from 2 on the shadow stack > > I'm wondering if this can't be improved in glibc to look like: > 1. Check elf libs, and record it somewhere > 2. Wait until just the right spot > 3. If all good, enable and lock shadow stack. I will try it out. > But it depends on the loader code design which I don't know well > enough. > > > Only some of the stack is shadowed so an attacker could find > > a way to keep the process perhaps run this shstk-unsupporting shared > > object more/longer and ROP its way around the system. > > I hope non-permissive mode is the standard usage eventually. > > > > > But I tend to oversimplify things sometimes so... > > > > What I'd like to have, though, is a kernel cmdline param which > > disables > > permissive mode and userspace can't do anything about it. So that > > once > > you boot your kernel, you can know that everything that runs on the > > machine has shstk and is properly protected. > > Szabolcs Nagy was commenting something similar in another thread, for > supporting kernel enforced security policies. I think the way to do it > would have the kernel detect the the elf bit itself (like it used to) > and enable shadow stack on exec. If you can't rely on userspace to call > in to enable it, it's not clear at what point the kernel should check > that it did. > > But then if you trigger off of the elf bit in the kernel, you get all > the regression issues of the old glibcs at that point. But it is > already an "I don't care if I crash" mode, so... > > I think if you trust your libc, glibc could implement this in userspace > too. It would be useful even as as testing override. > > > > > Also, it'll allow for faster fixing of all those shared objects to > > use > > shstk by way of political pressure.
On Fri, Mar 10, 2023 at 01:13:42AM +0000, Edgecombe, Rick P wrote: > See "x86: Expose thread features in /proc/$PID/status" for the patch. > We could emit something in dmesg I guess? The logic would be: dmesg is just flaky: ring buffer can get overwritten, users don't see it, ... > The compatibility problems are totally the mess in this whole thing. > When you try to look at a "permissive" mode that actually works it gets > even more complex. Joao and I have been banging our heads on that > problem for months. Oh yeah, I'm soo NOT jealous. :-\ > But there are some expected users of this that say: we compile and > check our known set of binaries, we won't get any surprises. So it's > more of a distro problem. I'm guessing what will happen here is that distros will gradually enable shstk and once it is ubiquitous, there will be no reason to disable it at all. > You mean a late loaded dlopen()ed DSO? The enabling logic can't know > this will happen ahead of time. No, I meant the case where you start with shstk enabled and later disable it when some lib does not support it. From now on that whole process is marked as "cannot use shstk anymore" and any other shared object that tries to use shstk simply doesn't get it. But meh, this makes the situation even more convoluted as the stuff that has loaded before the first shstk-not-supporting lib, already uses shstk. So you have half and half. What a mess. > I hope non-permissive mode is the standard usage eventually. Yah. > I think if you trust your libc, glibc could implement this in userspace > too. It would be useful even as as testing override. No, you cannot trust any userspace. And there are other libc's beside glibc. This should be a kernel parameter. I'm not saying we should do it now but we should do it at some point. So that user Boris again, he installs his new shiny distro, he checks that all the use cases and software he uses there is already shstk-enabled and then he goes and builds the kernel with CONFIG_X86_USER_SHADOW_STACK_STRICT=y or supplies a cmdline param and from now on, nothing can run without shstk. No checking, no trusting, no nothing. We fail any thread creation which doesn't init shstk. Thx.
On Thu, Mar 9, 2023 at 6:03 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > On Thu, Mar 9, 2023 at 5:13 PM Edgecombe, Rick P > <rick.p.edgecombe@intel.com> wrote: > > > > +Joao regarding mixed mode designs > > > > On Fri, 2023-03-10 at 00:51 +0100, Borislav Petkov wrote: > > > On Thu, Mar 09, 2023 at 04:56:37PM +0000, Edgecombe, Rick P wrote: > > > > There is a proc that shows if shadow stack is enabled in a thread. > > > > It > > > > does indeed come later in the series. > > > > > > Not good enough: > > > > > > 1. buried somewhere in proc where no one knows about it > > > > > > 2. it is per thread so user needs to grep *all* > > > > See "x86: Expose thread features in /proc/$PID/status" for the patch. > > We could emit something in dmesg I guess? The logic would be: > > - Record the presence of elf SHSTK bit on exec > > - On shadow stack disable, if it had the elf bit, pr_info("bad!") > > > > > > > > > ... We previously tried to add some batch operations to improve > > > > the > > > > performance, but tglx had suggested to start with something > > > > simple. > > > > So we end up with this simple composable API. > > > > > > I agree with starting simple and thanks for explaining this in > > > detail. > > > > > > TBH, though, it already sounds like a mess to me. I guess a mess > > > we'll > > > have to deal with because there will always be this case of some > > > shared object/lib not being enabled for shstk because of raisins. > > > > The compatibility problems are totally the mess in this whole thing. > > When you try to look at a "permissive" mode that actually works it gets > > even more complex. Joao and I have been banging our heads on that > > problem for months. > > > > But there are some expected users of this that say: we compile and > > check our known set of binaries, we won't get any surprises. So it's > > more of a distro problem. > > > > > > > > And TBH #2, I would've done it even simpler: if some shared object > > > can't > > > do shadow stack, we disable it for the whole process. I mean, what's > > > the > > > point? > > > > You mean a late loaded dlopen()ed DSO? The enabling logic can't know > > this will happen ahead of time. > > > > If you mean if the shared objects in the elf all support shadow stack, > > then this is what happens. The complication is that the loader wants to > > enable shadow stack before it has checked the elf libs so it doesn't > > underflow the shadow stack when it returns from the function that does > > this checking. > > > > So it does: > > 1. Enable shadow stack > > 2. Call elf libs checking functions > > 3. If all good, lock shadow stack. Else, disable shadow stack. > > 4. Return from elf checking functions and if shstk is enabled, don't > > underflow because it was enabled in step 1 and we have return addresses > > from 2 on the shadow stack > > > > I'm wondering if this can't be improved in glibc to look like: > > 1. Check elf libs, and record it somewhere > > 2. Wait until just the right spot > > 3. If all good, enable and lock shadow stack. > > I will try it out. > Currently glibc enables shadow stack as early as possible. There are only a few places where a function call in glibc never returns. We can enable shadow stack just before calling main. There are quite some code paths without shadow stack protection. Is this an issue? H.J.
On Fri, 2023-03-10 at 12:00 -0800, H.J. Lu wrote: > > > So it does: > > > 1. Enable shadow stack > > > 2. Call elf libs checking functions > > > 3. If all good, lock shadow stack. Else, disable shadow stack. > > > 4. Return from elf checking functions and if shstk is enabled, > > > don't > > > underflow because it was enabled in step 1 and we have return > > > addresses > > > from 2 on the shadow stack > > > > > > I'm wondering if this can't be improved in glibc to look like: > > > 1. Check elf libs, and record it somewhere > > > 2. Wait until just the right spot > > > 3. If all good, enable and lock shadow stack. > > > > I will try it out. > > > > Currently glibc enables shadow stack as early as possible. There > are only a few places where a function call in glibc never returns. > We can enable shadow stack just before calling main. There are > quite some code paths without shadow stack protection. Is this > an issue? Thanks for checking. Hmm, does the loader get attacked?
On Fri, Mar 10, 2023 at 12:27 PM Edgecombe, Rick P <rick.p.edgecombe@intel.com> wrote: > > On Fri, 2023-03-10 at 12:00 -0800, H.J. Lu wrote: > > > > So it does: > > > > 1. Enable shadow stack > > > > 2. Call elf libs checking functions > > > > 3. If all good, lock shadow stack. Else, disable shadow stack. > > > > 4. Return from elf checking functions and if shstk is enabled, > > > > don't > > > > underflow because it was enabled in step 1 and we have return > > > > addresses > > > > from 2 on the shadow stack > > > > > > > > I'm wondering if this can't be improved in glibc to look like: > > > > 1. Check elf libs, and record it somewhere > > > > 2. Wait until just the right spot > > > > 3. If all good, enable and lock shadow stack. > > > > > > I will try it out. > > > > > > > Currently glibc enables shadow stack as early as possible. There > > are only a few places where a function call in glibc never returns. > > We can enable shadow stack just before calling main. There are > > quite some code paths without shadow stack protection. Is this > > an issue? > > Thanks for checking. Hmm, does the loader get attacked? Not I know of. But there are user codes from .init_array and .preinit_array which are executed before main. In theory, an attack can happen before main.
On Fri, 2023-03-10 at 12:43 -0800, H.J. Lu wrote: > On Fri, Mar 10, 2023 at 12:27 PM Edgecombe, Rick P > <rick.p.edgecombe@intel.com> wrote: > > > > On Fri, 2023-03-10 at 12:00 -0800, H.J. Lu wrote: > > > > > So it does: > > > > > 1. Enable shadow stack > > > > > 2. Call elf libs checking functions > > > > > 3. If all good, lock shadow stack. Else, disable shadow > > > > > stack. > > > > > 4. Return from elf checking functions and if shstk is > > > > > enabled, > > > > > don't > > > > > underflow because it was enabled in step 1 and we have return > > > > > addresses > > > > > from 2 on the shadow stack > > > > > > > > > > I'm wondering if this can't be improved in glibc to look > > > > > like: > > > > > 1. Check elf libs, and record it somewhere > > > > > 2. Wait until just the right spot > > > > > 3. If all good, enable and lock shadow stack. > > > > > > > > I will try it out. > > > > > > > > > > Currently glibc enables shadow stack as early as possible. There > > > are only a few places where a function call in glibc never > > > returns. > > > We can enable shadow stack just before calling main. There are > > > quite some code paths without shadow stack protection. Is this > > > an issue? > > > > Thanks for checking. Hmm, does the loader get attacked? > > Not I know of. But there are user codes from .init_array > and .preinit_array which are executed before main. In theory, > an attack can happen before main. Hmm, it would be nice to not add any startup overhead to non-shadow stack binaries. I guess it's a tradeoff. Might be worth asking around. But you can't just enable shadow stack before any user code? It has to go something like? 1. Execute init codes 2. Check elf libs 3. Enable SHSTK Or what if you just did the enable-disable dance if the execing binary itself has shadow stack. If it doesn't have shadow stack, the elf libs won't change the decision.
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 8d73004e4cac..bd16e012b3e9 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -28,6 +28,7 @@ struct vm86; #include <asm/unwind_hints.h> #include <asm/vmxfeatures.h> #include <asm/vdso/processor.h> +#include <asm/shstk.h> #include <linux/personality.h> #include <linux/cache.h> @@ -475,6 +476,11 @@ struct thread_struct { */ u32 pkru; +#ifdef CONFIG_X86_USER_SHADOW_STACK + unsigned long features; + unsigned long features_locked; +#endif + /* Floating point and extended processor state */ struct fpu fpu; /* diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h new file mode 100644 index 000000000000..ec753809f074 --- /dev/null +++ b/arch/x86/include/asm/shstk.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_SHSTK_H +#define _ASM_X86_SHSTK_H + +#ifndef __ASSEMBLY__ +#include <linux/types.h> + +struct task_struct; + +#ifdef CONFIG_X86_USER_SHADOW_STACK +long shstk_prctl(struct task_struct *task, int option, unsigned long features); +void reset_thread_features(void); +#else +static inline long shstk_prctl(struct task_struct *task, int option, + unsigned long arg2) { return -EINVAL; } +static inline void reset_thread_features(void) {} +#endif /* CONFIG_X86_USER_SHADOW_STACK */ + +#endif /* __ASSEMBLY__ */ + +#endif /* _ASM_X86_SHSTK_H */ diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h index 500b96e71f18..b2b3b7200b2d 100644 --- a/arch/x86/include/uapi/asm/prctl.h +++ b/arch/x86/include/uapi/asm/prctl.h @@ -20,4 +20,10 @@ #define ARCH_MAP_VDSO_32 0x2002 #define ARCH_MAP_VDSO_64 0x2003 +/* Don't use 0x3001-0x3004 because of old glibcs */ + +#define ARCH_SHSTK_ENABLE 0x5001 +#define ARCH_SHSTK_DISABLE 0x5002 +#define ARCH_SHSTK_LOCK 0x5003 + #endif /* _ASM_X86_PRCTL_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 92446f1dedd7..b366641703e3 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -146,6 +146,8 @@ obj-$(CONFIG_CALL_THUNKS) += callthunks.o obj-$(CONFIG_X86_CET) += cet.o +obj-$(CONFIG_X86_USER_SHADOW_STACK) += shstk.o + ### # 64 bit specific files ifeq ($(CONFIG_X86_64),y) diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 4e34b3b68ebd..71094c8a305f 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -514,6 +514,8 @@ start_thread_common(struct pt_regs *regs, unsigned long new_ip, load_gs_index(__USER_DS); } + reset_thread_features(); + loadsegment(fs, 0); loadsegment(es, _ds); loadsegment(ds, _ds); @@ -830,7 +832,10 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2) case ARCH_MAP_VDSO_64: return prctl_map_vdso(&vdso_image_64, arg2); #endif - + case ARCH_SHSTK_ENABLE: + case ARCH_SHSTK_DISABLE: + case ARCH_SHSTK_LOCK: + return shstk_prctl(task, option, arg2); default: ret = -EINVAL; break; diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c new file mode 100644 index 000000000000..41ed6552e0a5 --- /dev/null +++ b/arch/x86/kernel/shstk.c @@ -0,0 +1,44 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * shstk.c - Intel shadow stack support + * + * Copyright (c) 2021, Intel Corporation. + * Yu-cheng Yu <yu-cheng.yu@intel.com> + */ + +#include <linux/sched.h> +#include <linux/bitops.h> +#include <asm/prctl.h> + +void reset_thread_features(void) +{ + current->thread.features = 0; + current->thread.features_locked = 0; +} + +long shstk_prctl(struct task_struct *task, int option, unsigned long features) +{ + if (option == ARCH_SHSTK_LOCK) { + task->thread.features_locked |= features; + return 0; + } + + /* Don't allow via ptrace */ + if (task != current) + return -EINVAL; + + /* Do not allow to change locked features */ + if (features & task->thread.features_locked) + return -EPERM; + + /* Only support enabling/disabling one feature at a time. */ + if (hweight_long(features) > 1) + return -EINVAL; + + if (option == ARCH_SHSTK_DISABLE) { + return -EINVAL; + } + + /* Handle ARCH_SHSTK_ENABLE */ + return -EINVAL; +}