[v6,28/41] x86: Introduce userspace API for shadow stack

Message ID 20230218211433.26859-29-rick.p.edgecombe@intel.com
State New
Headers
Series Shadow stacks for userspace |

Commit Message

Edgecombe, Rick P Feb. 18, 2023, 9:14 p.m. UTC
  From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Add three new arch_prctl() handles:

 - ARCH_SHSTK_ENABLE/DISABLE enables or disables the specified
   feature. Returns 0 on success or an error.

 - ARCH_SHSTK_LOCK prevents future disabling or enabling of the
   specified feature. Returns 0 on success or an error

The features are handled per-thread and inherited over fork(2)/clone(2),
but reset on exec().

This is preparation patch. It does not implement any features.

Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Tested-by: John Allen <john.allen@amd.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
[tweaked with feedback from tglx]
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

---
v4:
 - Remove references to CET and replace with shadow stack (Peterz)

v3:
 - Move shstk.c Makefile changes earlier (Kees)
 - Add #ifdef around features_locked and features (Kees)
 - Encapsulate features reset earlier in reset_thread_features() so
   features and features_locked are not referenced in code that would be
   compiled !CONFIG_X86_USER_SHADOW_STACK. (Kees)
 - Fix typo in commit log (Kees)
 - Switch arch_prctl() numbers to avoid conflict with LAM

v2:
 - Only allow one enable/disable per call (tglx)
 - Return error code like a normal arch_prctl() (Alexander Potapenko)
 - Make CET only (tglx)
---
 arch/x86/include/asm/processor.h  |  6 +++++
 arch/x86/include/asm/shstk.h      | 21 +++++++++++++++
 arch/x86/include/uapi/asm/prctl.h |  6 +++++
 arch/x86/kernel/Makefile          |  2 ++
 arch/x86/kernel/process_64.c      |  7 ++++-
 arch/x86/kernel/shstk.c           | 44 +++++++++++++++++++++++++++++++
 6 files changed, 85 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/shstk.h
 create mode 100644 arch/x86/kernel/shstk.c
  

Comments

Borislav Petkov Feb. 24, 2023, 12:20 p.m. UTC | #1
On Sat, Feb 18, 2023 at 01:14:20PM -0800, Rick Edgecombe wrote:
> diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
> index 500b96e71f18..b2b3b7200b2d 100644
> --- a/arch/x86/include/uapi/asm/prctl.h
> +++ b/arch/x86/include/uapi/asm/prctl.h
> @@ -20,4 +20,10 @@
>  #define ARCH_MAP_VDSO_32		0x2002
>  #define ARCH_MAP_VDSO_64		0x2003
>  
> +/* Don't use 0x3001-0x3004 because of old glibcs */

So where is this all new interface to userspace programs documented? Do
we have an agreement with all the involved parties that this is how
we're going to support shadow stacks and that this is what userspace
should do?

I'd like to avoid one more fiasco with glibc etc here...

Thx.
  
Edgecombe, Rick P Feb. 24, 2023, 6:37 p.m. UTC | #2
On Fri, 2023-02-24 at 13:20 +0100, Borislav Petkov wrote:
> On Sat, Feb 18, 2023 at 01:14:20PM -0800, Rick Edgecombe wrote:
> > diff --git a/arch/x86/include/uapi/asm/prctl.h
> > b/arch/x86/include/uapi/asm/prctl.h
> > index 500b96e71f18..b2b3b7200b2d 100644
> > --- a/arch/x86/include/uapi/asm/prctl.h
> > +++ b/arch/x86/include/uapi/asm/prctl.h
> > @@ -20,4 +20,10 @@
> >   #define ARCH_MAP_VDSO_32             0x2002
> >   #define ARCH_MAP_VDSO_64             0x2003
> >   
> > +/* Don't use 0x3001-0x3004 because of old glibcs */
> 
> So where is this all new interface to userspace programs documented? 

In the first patch:

https://lore.kernel.org/lkml/20230218211433.26859-2-rick.p.edgecombe@intel.com/

Then some more documentation is added about ARCH_SHSTK_UNLOCK and
ARCH_SHSTK_STATUS (which are for CRIU) in those patches.

> Do
> we have an agreement with all the involved parties that this is how
> we're going to support shadow stacks and that this is what userspace
> should do?
> 
> I'd like to avoid one more fiasco with glibc etc here...

There are glibc patches prepared by HJ to use the new interface and
it's my understanding that he has discussed the changes with the other
glibc folks. Those glibc patches are used for testing these kernel
patches, but will not get upstream until the kernel patches to avoid
repeating the past problems. So I think it's as prepared as it can be.

One future thing that might come up... Glibc has this mode called
"permissive mode". When glibc is configured this way (compile time or
env var), it is supposed to disable shadow stack when dlopen()ing any
DSO that doesn't have the shadow stack elf header bit. The problem is,
it doesn't really work as intended. It only turns off shadow stack for
the calling thread, leaving the other threads free to call into the DSO
with shadow stack enabled. It's not clear yet if they are going to be
able to fix it in userspace. So this prompted some thinking of if there
could be an additional kernel mode to help glibc in this scenario. But
for non-permissive mode, glibc is queued up to use the interface as is.
  
Borislav Petkov Feb. 28, 2023, 10:58 a.m. UTC | #3
On Fri, Feb 24, 2023 at 06:37:57PM +0000, Edgecombe, Rick P wrote:
> In the first patch:
> 
> https://lore.kernel.org/lkml/20230218211433.26859-2-rick.p.edgecombe@intel.com/
> 
> Then some more documentation is added about ARCH_SHSTK_UNLOCK and
> ARCH_SHSTK_STATUS (which are for CRIU) in those patches.

Right, I was thinking more about ARCH_PRCTL(2), the man page.

But you can send that to the manpages folks later. I.e., it should be
nearly impossible to be missed. :)

> There are glibc patches prepared by HJ to use the new interface and
> it's my understanding that he has discussed the changes with the other
> glibc folks. Those glibc patches are used for testing these kernel
> patches, but will not get upstream until the kernel patches to avoid
> repeating the past problems. So I think it's as prepared as it can be.

Good.

> One future thing that might come up... Glibc has this mode called
> "permissive mode". When glibc is configured this way (compile time or
> env var), it is supposed to disable shadow stack when dlopen()ing any
> DSO that doesn't have the shadow stack elf header bit.

Maybe I don't understand all the possible use cases but if I were
interested in using shadow stack, then I'd enable it for all objects.
And if I want permissive, I'd disable it for all. A mixed thing sounds
like a mixed can of worms waiting to be opened.
  
Edgecombe, Rick P Feb. 28, 2023, 10:35 p.m. UTC | #4
On Tue, 2023-02-28 at 11:58 +0100, Borislav Petkov wrote:
> On Fri, Feb 24, 2023 at 06:37:57PM +0000, Edgecombe, Rick P wrote:
> > In the first patch:
> > 
> > 
https://lore.kernel.org/lkml/20230218211433.26859-2-rick.p.edgecombe@intel.com/
> > 
> > Then some more documentation is added about ARCH_SHSTK_UNLOCK and
> > ARCH_SHSTK_STATUS (which are for CRIU) in those patches.
> 
> Right, I was thinking more about ARCH_PRCTL(2), the man page.
> 
> But you can send that to the manpages folks later. I.e., it should be
> nearly impossible to be missed. :)

Sure I can add something. Somewhere I have a man page for
map_shadow_stack written up as well.

> 
> > There are glibc patches prepared by HJ to use the new interface and
> > it's my understanding that he has discussed the changes with the
> > other
> > glibc folks. Those glibc patches are used for testing these kernel
> > patches, but will not get upstream until the kernel patches to
> > avoid
> > repeating the past problems. So I think it's as prepared as it can
> > be.
> 
> Good.
> 
> > One future thing that might come up... Glibc has this mode called
> > "permissive mode". When glibc is configured this way (compile time
> > or
> > env var), it is supposed to disable shadow stack when dlopen()ing
> > any
> > DSO that doesn't have the shadow stack elf header bit.
> 
> Maybe I don't understand all the possible use cases but if I were
> interested in using shadow stack, then I'd enable it for all objects.

Enabling for all objects is the ideal, but in practice distros don't
have that.

> And if I want permissive, I'd disable it for all. A mixed thing
> sounds
> like a mixed can of worms waiting to be opened.

It is definitely a can of worms.
  

Patch

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 8d73004e4cac..bd16e012b3e9 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -28,6 +28,7 @@  struct vm86;
 #include <asm/unwind_hints.h>
 #include <asm/vmxfeatures.h>
 #include <asm/vdso/processor.h>
+#include <asm/shstk.h>
 
 #include <linux/personality.h>
 #include <linux/cache.h>
@@ -475,6 +476,11 @@  struct thread_struct {
 	 */
 	u32			pkru;
 
+#ifdef CONFIG_X86_USER_SHADOW_STACK
+	unsigned long		features;
+	unsigned long		features_locked;
+#endif
+
 	/* Floating point and extended processor state */
 	struct fpu		fpu;
 	/*
diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
new file mode 100644
index 000000000000..ec753809f074
--- /dev/null
+++ b/arch/x86/include/asm/shstk.h
@@ -0,0 +1,21 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_SHSTK_H
+#define _ASM_X86_SHSTK_H
+
+#ifndef __ASSEMBLY__
+#include <linux/types.h>
+
+struct task_struct;
+
+#ifdef CONFIG_X86_USER_SHADOW_STACK
+long shstk_prctl(struct task_struct *task, int option, unsigned long features);
+void reset_thread_features(void);
+#else
+static inline long shstk_prctl(struct task_struct *task, int option,
+			       unsigned long arg2) { return -EINVAL; }
+static inline void reset_thread_features(void) {}
+#endif /* CONFIG_X86_USER_SHADOW_STACK */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_X86_SHSTK_H */
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 500b96e71f18..b2b3b7200b2d 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -20,4 +20,10 @@ 
 #define ARCH_MAP_VDSO_32		0x2002
 #define ARCH_MAP_VDSO_64		0x2003
 
+/* Don't use 0x3001-0x3004 because of old glibcs */
+
+#define ARCH_SHSTK_ENABLE		0x5001
+#define ARCH_SHSTK_DISABLE		0x5002
+#define ARCH_SHSTK_LOCK			0x5003
+
 #endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 92446f1dedd7..b366641703e3 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -146,6 +146,8 @@  obj-$(CONFIG_CALL_THUNKS)		+= callthunks.o
 
 obj-$(CONFIG_X86_CET)			+= cet.o
 
+obj-$(CONFIG_X86_USER_SHADOW_STACK)	+= shstk.o
+
 ###
 # 64 bit specific files
 ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4e34b3b68ebd..71094c8a305f 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -514,6 +514,8 @@  start_thread_common(struct pt_regs *regs, unsigned long new_ip,
 		load_gs_index(__USER_DS);
 	}
 
+	reset_thread_features();
+
 	loadsegment(fs, 0);
 	loadsegment(es, _ds);
 	loadsegment(ds, _ds);
@@ -830,7 +832,10 @@  long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
 	case ARCH_MAP_VDSO_64:
 		return prctl_map_vdso(&vdso_image_64, arg2);
 #endif
-
+	case ARCH_SHSTK_ENABLE:
+	case ARCH_SHSTK_DISABLE:
+	case ARCH_SHSTK_LOCK:
+		return shstk_prctl(task, option, arg2);
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
new file mode 100644
index 000000000000..41ed6552e0a5
--- /dev/null
+++ b/arch/x86/kernel/shstk.c
@@ -0,0 +1,44 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * shstk.c - Intel shadow stack support
+ *
+ * Copyright (c) 2021, Intel Corporation.
+ * Yu-cheng Yu <yu-cheng.yu@intel.com>
+ */
+
+#include <linux/sched.h>
+#include <linux/bitops.h>
+#include <asm/prctl.h>
+
+void reset_thread_features(void)
+{
+	current->thread.features = 0;
+	current->thread.features_locked = 0;
+}
+
+long shstk_prctl(struct task_struct *task, int option, unsigned long features)
+{
+	if (option == ARCH_SHSTK_LOCK) {
+		task->thread.features_locked |= features;
+		return 0;
+	}
+
+	/* Don't allow via ptrace */
+	if (task != current)
+		return -EINVAL;
+
+	/* Do not allow to change locked features */
+	if (features & task->thread.features_locked)
+		return -EPERM;
+
+	/* Only support enabling/disabling one feature at a time. */
+	if (hweight_long(features) > 1)
+		return -EINVAL;
+
+	if (option == ARCH_SHSTK_DISABLE) {
+		return -EINVAL;
+	}
+
+	/* Handle ARCH_SHSTK_ENABLE */
+	return -EINVAL;
+}