[v7,28/41] x86: Introduce userspace API for shadow stack

Message ID 20230227222957.24501-29-rick.p.edgecombe@intel.com
State New
Headers
Series Shadow stacks for userspace |

Commit Message

Edgecombe, Rick P Feb. 27, 2023, 10:29 p.m. UTC
  From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Add three new arch_prctl() handles:

 - ARCH_SHSTK_ENABLE/DISABLE enables or disables the specified
   feature. Returns 0 on success or an error.

 - ARCH_SHSTK_LOCK prevents future disabling or enabling of the
   specified feature. Returns 0 on success or an error

The features are handled per-thread and inherited over fork(2)/clone(2),
but reset on exec().

This is preparation patch. It does not implement any features.

Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Tested-by: John Allen <john.allen@amd.com>
Tested-by: Kees Cook <keescook@chromium.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
[tweaked with feedback from tglx]
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

---
v4:
 - Remove references to CET and replace with shadow stack (Peterz)

v3:
 - Move shstk.c Makefile changes earlier (Kees)
 - Add #ifdef around features_locked and features (Kees)
 - Encapsulate features reset earlier in reset_thread_features() so
   features and features_locked are not referenced in code that would be
   compiled !CONFIG_X86_USER_SHADOW_STACK. (Kees)
 - Fix typo in commit log (Kees)
 - Switch arch_prctl() numbers to avoid conflict with LAM

v2:
 - Only allow one enable/disable per call (tglx)
 - Return error code like a normal arch_prctl() (Alexander Potapenko)
 - Make CET only (tglx)
---
 arch/x86/include/asm/processor.h  |  6 +++++
 arch/x86/include/asm/shstk.h      | 21 +++++++++++++++
 arch/x86/include/uapi/asm/prctl.h |  6 +++++
 arch/x86/kernel/Makefile          |  2 ++
 arch/x86/kernel/process_64.c      |  7 ++++-
 arch/x86/kernel/shstk.c           | 44 +++++++++++++++++++++++++++++++
 6 files changed, 85 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/shstk.h
 create mode 100644 arch/x86/kernel/shstk.c
  

Comments

Borislav Petkov March 8, 2023, 10:27 a.m. UTC | #1
On Mon, Feb 27, 2023 at 02:29:44PM -0800, Rick Edgecombe wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Add three new arch_prctl() handles:
> 
>  - ARCH_SHSTK_ENABLE/DISABLE enables or disables the specified
>    feature. Returns 0 on success or an error.

"... or a negative value on error."

>  - ARCH_SHSTK_LOCK prevents future disabling or enabling of the
>    specified feature. Returns 0 on success or an error

ditto.

What is the use case of the feature locking?

I'm under the simple assumption that once shstk is enabled for an app,
it remains so. I guess my question is rather, what's the use case for
enabling shadow stack and then disabling it later for an app...?

> The features are handled per-thread and inherited over fork(2)/clone(2),
> but reset on exec().
> 
> This is preparation patch. It does not implement any features.

That belongs under the "---" line I guess.

> Tested-by: Pengfei Xu <pengfei.xu@intel.com>
> Tested-by: John Allen <john.allen@amd.com>
> Tested-by: Kees Cook <keescook@chromium.org>
> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> [tweaked with feedback from tglx]
> Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> 
> ---
> v4:
>  - Remove references to CET and replace with shadow stack (Peterz)
> 
> v3:
>  - Move shstk.c Makefile changes earlier (Kees)
>  - Add #ifdef around features_locked and features (Kees)
>  - Encapsulate features reset earlier in reset_thread_features() so
>    features and features_locked are not referenced in code that would be
>    compiled !CONFIG_X86_USER_SHADOW_STACK. (Kees)
>  - Fix typo in commit log (Kees)
>  - Switch arch_prctl() numbers to avoid conflict with LAM
> 
> v2:
>  - Only allow one enable/disable per call (tglx)
>  - Return error code like a normal arch_prctl() (Alexander Potapenko)
>  - Make CET only (tglx)
> ---
>  arch/x86/include/asm/processor.h  |  6 +++++
>  arch/x86/include/asm/shstk.h      | 21 +++++++++++++++
>  arch/x86/include/uapi/asm/prctl.h |  6 +++++
>  arch/x86/kernel/Makefile          |  2 ++
>  arch/x86/kernel/process_64.c      |  7 ++++-
>  arch/x86/kernel/shstk.c           | 44 +++++++++++++++++++++++++++++++
>  6 files changed, 85 insertions(+), 1 deletion(-)
>  create mode 100644 arch/x86/include/asm/shstk.h
>  create mode 100644 arch/x86/kernel/shstk.c

...

> +long shstk_prctl(struct task_struct *task, int option, unsigned long features)
> +{
> +	if (option == ARCH_SHSTK_LOCK) {
> +		task->thread.features_locked |= features;
> +		return 0;
> +	}
> +
> +	/* Don't allow via ptrace */
> +	if (task != current)
> +		return -EINVAL;
> +
> +	/* Do not allow to change locked features */
> +	if (features & task->thread.features_locked)
> +		return -EPERM;
> +
> +	/* Only support enabling/disabling one feature at a time. */
> +	if (hweight_long(features) > 1)
> +		return -EINVAL;
> +
> +	if (option == ARCH_SHSTK_DISABLE) {
> +		return -EINVAL;
> +	}

{} braces left over from some previous version. Can go now.
  
Edgecombe, Rick P March 8, 2023, 11:32 p.m. UTC | #2
On Wed, 2023-03-08 at 11:27 +0100, Borislav Petkov wrote:
> On Mon, Feb 27, 2023 at 02:29:44PM -0800, Rick Edgecombe wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > Add three new arch_prctl() handles:
> > 
> >  - ARCH_SHSTK_ENABLE/DISABLE enables or disables the specified
> >    feature. Returns 0 on success or an error.
> 
> "... or a negative value on error."

Sure.

> 
> >  - ARCH_SHSTK_LOCK prevents future disabling or enabling of the
> >    specified feature. Returns 0 on success or an error
> 
> ditto.
> 
> What is the use case of the feature locking?
> 
> I'm under the simple assumption that once shstk is enabled for an
> app,
> it remains so. I guess my question is rather, what's the use case for
> enabling shadow stack and then disabling it later for an app...?

This would be for things like the "permissive mode", where glibc
determines that it has to do something like dlopen() an unsupporting
DSO much later.

But being able to late lock the features is required for the working
behavior of glibc as well. Glibc enables shadow stack very early, then
disables it later if it finds that any of the normal dynamic libraries
don't support it. It only locks shadow stack after this point even in
non-permissive mode.

The selftest also does a lot of enabling and disabling.

> 
> > The features are handled per-thread and inherited over
> > fork(2)/clone(2),
> > but reset on exec().
> > 
> > This is preparation patch. It does not implement any features.
> 
> That belongs under the "---" line I guess.

Oh, yes.

> 
> > Tested-by: Pengfei Xu <pengfei.xu@intel.com>
> > Tested-by: John Allen <john.allen@amd.com>
> > Tested-by: Kees Cook <keescook@chromium.org>
> > Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
> > Reviewed-by: Kees Cook <keescook@chromium.org>
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > [tweaked with feedback from tglx]
> > Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > 
> > ---
> > v4:
> >  - Remove references to CET and replace with shadow stack (Peterz)
> > 
> > v3:
> >  - Move shstk.c Makefile changes earlier (Kees)
> >  - Add #ifdef around features_locked and features (Kees)
> >  - Encapsulate features reset earlier in reset_thread_features() so
> >    features and features_locked are not referenced in code that
> > would be
> >    compiled !CONFIG_X86_USER_SHADOW_STACK. (Kees)
> >  - Fix typo in commit log (Kees)
> >  - Switch arch_prctl() numbers to avoid conflict with LAM
> > 
> > v2:
> >  - Only allow one enable/disable per call (tglx)
> >  - Return error code like a normal arch_prctl() (Alexander
> > Potapenko)
> >  - Make CET only (tglx)
> > ---
> >  arch/x86/include/asm/processor.h  |  6 +++++
> >  arch/x86/include/asm/shstk.h      | 21 +++++++++++++++
> >  arch/x86/include/uapi/asm/prctl.h |  6 +++++
> >  arch/x86/kernel/Makefile          |  2 ++
> >  arch/x86/kernel/process_64.c      |  7 ++++-
> >  arch/x86/kernel/shstk.c           | 44
> > +++++++++++++++++++++++++++++++
> >  6 files changed, 85 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/x86/include/asm/shstk.h
> >  create mode 100644 arch/x86/kernel/shstk.c
> 
> ...
> 
> > +long shstk_prctl(struct task_struct *task, int option, unsigned
> > long features)
> > +{
> > +	if (option == ARCH_SHSTK_LOCK) {
> > +		task->thread.features_locked |= features;
> > +		return 0;
> > +	}
> > +
> > +	/* Don't allow via ptrace */
> > +	if (task != current)
> > +		return -EINVAL;
> > +
> > +	/* Do not allow to change locked features */
> > +	if (features & task->thread.features_locked)
> > +		return -EPERM;
> > +
> > +	/* Only support enabling/disabling one feature at a time. */
> > +	if (hweight_long(features) > 1)
> > +		return -EINVAL;
> > +
> > +	if (option == ARCH_SHSTK_DISABLE) {
> > +		return -EINVAL;
> > +	}
> 
> {} braces left over from some previous version. Can go now.
> 

This was intentional, but I wasn't sure on it. It makes the diff
cleaner in later patches, is the reason.
  
Borislav Petkov March 9, 2023, 12:57 p.m. UTC | #3
On Wed, Mar 08, 2023 at 11:32:36PM +0000, Edgecombe, Rick P wrote:
> This would be for things like the "permissive mode", where glibc
> determines that it has to do something like dlopen() an unsupporting
> DSO much later.
> 
> But being able to late lock the features is required for the working
> behavior of glibc as well. Glibc enables shadow stack very early, then
> disables it later if it finds that any of the normal dynamic libraries
> don't support it. It only locks shadow stack after this point even in
> non-permissive mode.

So this all sounds weird. Especially from a user point of view.

Now let's imagine there's a Linux user called Boris and he goes and buys
a CPU which supports shadow stack, gets a distro which has shadow stack
enabled. All good.

Now, at some point he loads a program which pulls in an old library
which hasn't been enabled for shadow stack yet.

In the name of not breaking stuff, his glibc is configured in permissive
mode by default so that program loads and shadow stack for it is
disabled.

And Boris doesn't even know and continues on his merry way thinking that
he has all that cool ROP protection.

So where is the knob that says, "disable permissive mode"?

Or at least where does the user get a warning saying, "hey, this app
doesn't do shadow stack and we disabled it for ya so that it can still
work"?

Or am I way off?

I hope you're catching my drift. Because if there's no enforcement of
shstk and we do this permissive mode by default, this whole overhead is
just a unnecessary nuisance...

But maybe that'll come later and I should keep going through the set...

Thx.
  
Edgecombe, Rick P March 9, 2023, 4:56 p.m. UTC | #4
On Thu, 2023-03-09 at 13:57 +0100, Borislav Petkov wrote:
> So this all sounds weird. Especially from a user point of view.
> 
> Now let's imagine there's a Linux user called Boris and he goes and
> buys
> a CPU which supports shadow stack, gets a distro which has shadow
> stack
> enabled. All good.
> 
> Now, at some point he loads a program which pulls in an old library
> which hasn't been enabled for shadow stack yet.
> 
> In the name of not breaking stuff, his glibc is configured in
> permissive
> mode by default so that program loads and shadow stack for it is
> disabled.
> 
> And Boris doesn't even know and continues on his merry way thinking
> that
> he has all that cool ROP protection.

There is a proc that shows if shadow stack is enabled in a thread. It
does indeed come later in the series.

> 
> So where is the knob that says, "disable permissive mode"?

glibc has an environment variable that can change the loader's
behavior. There is also a compile time config for the default mode. But
this "permissive mode" is a glibc thing. The kernel doesn't implement
it per-se, just provides building blocks.

> 
> Or at least where does the user get a warning saying, "hey, this app
> doesn't do shadow stack and we disabled it for ya so that it can
> still
> work"?
> 
> Or am I way off?

I don't think so. The whole "when to enable shadow stack" question is
thornier than it might seem though, and what we have here is based on
some trade offs in the details.

> 
> I hope you're catching my drift. Because if there's no enforcement of
> shstk and we do this permissive mode by default, this whole overhead
> is
> just a unnecessary nuisance...

In the existing glibc patches, and this is highly related to glibc
behavior because the decisions around enabling and locking have been
pushed there, there are two reasons why shadow stack would get disabled
on an supporting executable after it gets enabled.
1. An executable is loaded and one of the shared objects (the ones that
come out of ldd) does not support shadow stack
2. An executable is loaded in permissive mode, and much later during
execution dlopen()s a DSO that does not support shadow stack.

One of the challenges with enabling shadow stack is you only start
recording the shadow stack history when you enable it. If you enable it
at some point, and then return from that function you underflow the
shadow stack and get a violation. So if the shadow stack will be
locked, it has to be enabled at the earliest point it might return to
at some point (for example after returning from main()).

So in 1, the existing logic of glibc is to enable shadow stack at the
very beginning of the loader. Then go through the whole loading/linking
process. If problems are found, disable shadow stack. If no problems
are found, then lock it.

I've wondered if this super early glibc enabling behavior is really
needed and if they could enable it after processing the linked
libraries in the elf. Then save the work of enabling and disabling
shadow stack for situations that don't support it. To me this is the
big wart in the whole thing, but I don't think the kernel can help
resolve it. If glibc can enable later, then we can combine the locking
and enabling into a single operation. But it only saves a syscall and
it might prevent some other libc that needs to do things like glibc
does currently, from being able to make it work at all.

In 2, the enabling happens like normal and the locking is skipped, so
that shadow stack can be enabled during a dlopen(). But glibc
permissive mode promises more than it delivers. Since it can only
disable shadow stack per-thread, it leaves the other threads enabled.
Making a working permissive mode is sort of an unsolved problem. There
are some proposals to make it work in just userspace, and some that
would need additional kernel support. If you are interested I can go
into why per-process disabling is not straightforward.

So the locking is needed for the basic support in 1 and the weak
permissive mode in 2 uses it. I am considering this series to support
1, but people may end up using 2 to get some permissive-ness. In
general the idea of this API is to push the enabling decisions into
userspace because that is where the information for making the decision
is known. We previously tried to add some batch operations to improve
the performance, but tglx had suggested to start with something simple.
So we end up with this simple composable API.
  
Borislav Petkov March 9, 2023, 11:51 p.m. UTC | #5
On Thu, Mar 09, 2023 at 04:56:37PM +0000, Edgecombe, Rick P wrote:
> There is a proc that shows if shadow stack is enabled in a thread. It
> does indeed come later in the series.

Not good enough:

1. buried somewhere in proc where no one knows about it

2. it is per thread so user needs to grep *all*

>  ... We previously tried to add some batch operations to improve the
>  performance, but tglx had suggested to start with something simple.
>  So we end up with this simple composable API.

I agree with starting simple and thanks for explaining this in detail.

TBH, though, it already sounds like a mess to me. I guess a mess we'll
have to deal with because there will always be this case of some
shared object/lib not being enabled for shstk because of raisins.

And TBH #2, I would've done it even simpler: if some shared object can't
do shadow stack, we disable it for the whole process. I mean, what's the
point? Only some of the stack is shadowed so an attacker could find
a way to keep the process perhaps run this shstk-unsupporting shared
object more/longer and ROP its way around the system.

But I tend to oversimplify things sometimes so...

What I'd like to have, though, is a kernel cmdline param which disables
permissive mode and userspace can't do anything about it. So that once
you boot your kernel, you can know that everything that runs on the
machine has shstk and is properly protected.

Also, it'll allow for faster fixing of all those shared objects to use
shstk by way of political pressure.

Thx.
  
Edgecombe, Rick P March 10, 2023, 1:13 a.m. UTC | #6
+Joao regarding mixed mode designs

On Fri, 2023-03-10 at 00:51 +0100, Borislav Petkov wrote:
> On Thu, Mar 09, 2023 at 04:56:37PM +0000, Edgecombe, Rick P wrote:
> > There is a proc that shows if shadow stack is enabled in a thread.
> > It
> > does indeed come later in the series.
> 
> Not good enough:
> 
> 1. buried somewhere in proc where no one knows about it
> 
> 2. it is per thread so user needs to grep *all*

See "x86: Expose thread features in /proc/$PID/status" for the patch.
We could emit something in dmesg I guess? The logic would be:
 - Record the presence of elf SHSTK bit on exec
 - On shadow stack disable, if it had the elf bit, pr_info("bad!")

> 
> >   ... We previously tried to add some batch operations to improve
> > the
> >   performance, but tglx had suggested to start with something
> > simple.
> >   So we end up with this simple composable API.
> 
> I agree with starting simple and thanks for explaining this in
> detail.
> 
> TBH, though, it already sounds like a mess to me. I guess a mess
> we'll
> have to deal with because there will always be this case of some
> shared object/lib not being enabled for shstk because of raisins.

The compatibility problems are totally the mess in this whole thing.
When you try to look at a "permissive" mode that actually works it gets
even more complex. Joao and I have been banging our heads on that
problem for months.

But there are some expected users of this that say: we compile and
check our known set of binaries, we won't get any surprises. So it's
more of a distro problem.

> 
> And TBH #2, I would've done it even simpler: if some shared object
> can't
> do shadow stack, we disable it for the whole process. I mean, what's
> the
> point? 

You mean a late loaded dlopen()ed DSO? The enabling logic can't know
this will happen ahead of time.

If you mean if the shared objects in the elf all support shadow stack,
then this is what happens. The complication is that the loader wants to
enable shadow stack before it has checked the elf libs so it doesn't
underflow the shadow stack when it returns from the function that does
this checking.

So it does:
1. Enable shadow stack
2. Call elf libs checking functions
3. If all good, lock shadow stack. Else, disable shadow stack.
4. Return from elf checking functions and if shstk is enabled, don't
underflow because it was enabled in step 1 and we have return addresses
from 2 on the shadow stack

I'm wondering if this can't be improved in glibc to look like:
1. Check elf libs, and record it somewhere
2. Wait until just the right spot
3. If all good, enable and lock shadow stack.

But it depends on the loader code design which I don't know well
enough.

> Only some of the stack is shadowed so an attacker could find
> a way to keep the process perhaps run this shstk-unsupporting shared
> object more/longer and ROP its way around the system.

I hope non-permissive mode is the standard usage eventually.

> 
> But I tend to oversimplify things sometimes so...
> 
> What I'd like to have, though, is a kernel cmdline param which
> disables
> permissive mode and userspace can't do anything about it. So that
> once
> you boot your kernel, you can know that everything that runs on the
> machine has shstk and is properly protected.

Szabolcs Nagy was commenting something similar in another thread, for
supporting kernel enforced security policies. I think the way to do it
would have the kernel detect the the elf bit itself (like it used to)
and enable shadow stack on exec. If you can't rely on userspace to call
in to enable it, it's not clear at what point the kernel should check
that it did.

But then if you trigger off of the elf bit in the kernel, you get all
the regression issues of the old glibcs at that point. But it is
already an "I don't care if I crash" mode, so...

I think if you trust your libc, glibc could implement this in userspace
too. It would be useful even as as testing override.

> 
> Also, it'll allow for faster fixing of all those shared objects to
> use
> shstk by way of political pressure.
  
H.J. Lu March 10, 2023, 2:03 a.m. UTC | #7
On Thu, Mar 9, 2023 at 5:13 PM Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
>
> +Joao regarding mixed mode designs
>
> On Fri, 2023-03-10 at 00:51 +0100, Borislav Petkov wrote:
> > On Thu, Mar 09, 2023 at 04:56:37PM +0000, Edgecombe, Rick P wrote:
> > > There is a proc that shows if shadow stack is enabled in a thread.
> > > It
> > > does indeed come later in the series.
> >
> > Not good enough:
> >
> > 1. buried somewhere in proc where no one knows about it
> >
> > 2. it is per thread so user needs to grep *all*
>
> See "x86: Expose thread features in /proc/$PID/status" for the patch.
> We could emit something in dmesg I guess? The logic would be:
>  - Record the presence of elf SHSTK bit on exec
>  - On shadow stack disable, if it had the elf bit, pr_info("bad!")
>
> >
> > >   ... We previously tried to add some batch operations to improve
> > > the
> > >   performance, but tglx had suggested to start with something
> > > simple.
> > >   So we end up with this simple composable API.
> >
> > I agree with starting simple and thanks for explaining this in
> > detail.
> >
> > TBH, though, it already sounds like a mess to me. I guess a mess
> > we'll
> > have to deal with because there will always be this case of some
> > shared object/lib not being enabled for shstk because of raisins.
>
> The compatibility problems are totally the mess in this whole thing.
> When you try to look at a "permissive" mode that actually works it gets
> even more complex. Joao and I have been banging our heads on that
> problem for months.
>
> But there are some expected users of this that say: we compile and
> check our known set of binaries, we won't get any surprises. So it's
> more of a distro problem.
>
> >
> > And TBH #2, I would've done it even simpler: if some shared object
> > can't
> > do shadow stack, we disable it for the whole process. I mean, what's
> > the
> > point?
>
> You mean a late loaded dlopen()ed DSO? The enabling logic can't know
> this will happen ahead of time.
>
> If you mean if the shared objects in the elf all support shadow stack,
> then this is what happens. The complication is that the loader wants to
> enable shadow stack before it has checked the elf libs so it doesn't
> underflow the shadow stack when it returns from the function that does
> this checking.
>
> So it does:
> 1. Enable shadow stack
> 2. Call elf libs checking functions
> 3. If all good, lock shadow stack. Else, disable shadow stack.
> 4. Return from elf checking functions and if shstk is enabled, don't
> underflow because it was enabled in step 1 and we have return addresses
> from 2 on the shadow stack
>
> I'm wondering if this can't be improved in glibc to look like:
> 1. Check elf libs, and record it somewhere
> 2. Wait until just the right spot
> 3. If all good, enable and lock shadow stack.

I will try it out.

> But it depends on the loader code design which I don't know well
> enough.
>
> > Only some of the stack is shadowed so an attacker could find
> > a way to keep the process perhaps run this shstk-unsupporting shared
> > object more/longer and ROP its way around the system.
>
> I hope non-permissive mode is the standard usage eventually.
>
> >
> > But I tend to oversimplify things sometimes so...
> >
> > What I'd like to have, though, is a kernel cmdline param which
> > disables
> > permissive mode and userspace can't do anything about it. So that
> > once
> > you boot your kernel, you can know that everything that runs on the
> > machine has shstk and is properly protected.
>
> Szabolcs Nagy was commenting something similar in another thread, for
> supporting kernel enforced security policies. I think the way to do it
> would have the kernel detect the the elf bit itself (like it used to)
> and enable shadow stack on exec. If you can't rely on userspace to call
> in to enable it, it's not clear at what point the kernel should check
> that it did.
>
> But then if you trigger off of the elf bit in the kernel, you get all
> the regression issues of the old glibcs at that point. But it is
> already an "I don't care if I crash" mode, so...
>
> I think if you trust your libc, glibc could implement this in userspace
> too. It would be useful even as as testing override.
>
> >
> > Also, it'll allow for faster fixing of all those shared objects to
> > use
> > shstk by way of political pressure.
  
Borislav Petkov March 10, 2023, 11:40 a.m. UTC | #8
On Fri, Mar 10, 2023 at 01:13:42AM +0000, Edgecombe, Rick P wrote:
> See "x86: Expose thread features in /proc/$PID/status" for the patch.
> We could emit something in dmesg I guess? The logic would be:

dmesg is just flaky: ring buffer can get overwritten, users don't see
it, ...

> The compatibility problems are totally the mess in this whole thing.
> When you try to look at a "permissive" mode that actually works it gets
> even more complex. Joao and I have been banging our heads on that
> problem for months.

Oh yeah, I'm soo NOT jealous. :-\

> But there are some expected users of this that say: we compile and
> check our known set of binaries, we won't get any surprises. So it's
> more of a distro problem.

I'm guessing what will happen here is that distros will gradually enable
shstk and once it is ubiquitous, there will be no reason to disable it
at all.

> You mean a late loaded dlopen()ed DSO? The enabling logic can't know
> this will happen ahead of time.

No, I meant the case where you start with shstk enabled and later
disable it when some lib does not support it.

From now on that whole process is marked as "cannot use shstk anymore"
and any other shared object that tries to use shstk simply doesn't get
it.

But meh, this makes the situation even more convoluted as the stuff that
has loaded before the first shstk-not-supporting lib, already uses
shstk.

So you have half and half.

What a mess.

> I hope non-permissive mode is the standard usage eventually.

Yah.

> I think if you trust your libc, glibc could implement this in userspace
> too. It would be useful even as as testing override.

No, you cannot trust any userspace. And there are other libc's beside
glibc.

This should be a kernel parameter. I'm not saying we should do it now
but we should do it at some point.

So that user Boris again, he installs his new shiny distro, he checks
that all the use cases and software he uses there is already
shstk-enabled and then he goes and builds the kernel with

	CONFIG_X86_USER_SHADOW_STACK_STRICT=y

or supplies a cmdline param and from now on, nothing can run without
shstk. No checking, no trusting, no nothing.

We fail any thread creation which doesn't init shstk.

Thx.
  
H.J. Lu March 10, 2023, 8 p.m. UTC | #9
On Thu, Mar 9, 2023 at 6:03 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Thu, Mar 9, 2023 at 5:13 PM Edgecombe, Rick P
> <rick.p.edgecombe@intel.com> wrote:
> >
> > +Joao regarding mixed mode designs
> >
> > On Fri, 2023-03-10 at 00:51 +0100, Borislav Petkov wrote:
> > > On Thu, Mar 09, 2023 at 04:56:37PM +0000, Edgecombe, Rick P wrote:
> > > > There is a proc that shows if shadow stack is enabled in a thread.
> > > > It
> > > > does indeed come later in the series.
> > >
> > > Not good enough:
> > >
> > > 1. buried somewhere in proc where no one knows about it
> > >
> > > 2. it is per thread so user needs to grep *all*
> >
> > See "x86: Expose thread features in /proc/$PID/status" for the patch.
> > We could emit something in dmesg I guess? The logic would be:
> >  - Record the presence of elf SHSTK bit on exec
> >  - On shadow stack disable, if it had the elf bit, pr_info("bad!")
> >
> > >
> > > >   ... We previously tried to add some batch operations to improve
> > > > the
> > > >   performance, but tglx had suggested to start with something
> > > > simple.
> > > >   So we end up with this simple composable API.
> > >
> > > I agree with starting simple and thanks for explaining this in
> > > detail.
> > >
> > > TBH, though, it already sounds like a mess to me. I guess a mess
> > > we'll
> > > have to deal with because there will always be this case of some
> > > shared object/lib not being enabled for shstk because of raisins.
> >
> > The compatibility problems are totally the mess in this whole thing.
> > When you try to look at a "permissive" mode that actually works it gets
> > even more complex. Joao and I have been banging our heads on that
> > problem for months.
> >
> > But there are some expected users of this that say: we compile and
> > check our known set of binaries, we won't get any surprises. So it's
> > more of a distro problem.
> >
> > >
> > > And TBH #2, I would've done it even simpler: if some shared object
> > > can't
> > > do shadow stack, we disable it for the whole process. I mean, what's
> > > the
> > > point?
> >
> > You mean a late loaded dlopen()ed DSO? The enabling logic can't know
> > this will happen ahead of time.
> >
> > If you mean if the shared objects in the elf all support shadow stack,
> > then this is what happens. The complication is that the loader wants to
> > enable shadow stack before it has checked the elf libs so it doesn't
> > underflow the shadow stack when it returns from the function that does
> > this checking.
> >
> > So it does:
> > 1. Enable shadow stack
> > 2. Call elf libs checking functions
> > 3. If all good, lock shadow stack. Else, disable shadow stack.
> > 4. Return from elf checking functions and if shstk is enabled, don't
> > underflow because it was enabled in step 1 and we have return addresses
> > from 2 on the shadow stack
> >
> > I'm wondering if this can't be improved in glibc to look like:
> > 1. Check elf libs, and record it somewhere
> > 2. Wait until just the right spot
> > 3. If all good, enable and lock shadow stack.
>
> I will try it out.
>

Currently glibc enables shadow stack as early as possible.  There
are only a few places where a function call in glibc never returns.
We can enable shadow stack just before calling main.   There are
quite some code paths without shadow stack protection.   Is this
an issue?

H.J.
  
Edgecombe, Rick P March 10, 2023, 8:27 p.m. UTC | #10
On Fri, 2023-03-10 at 12:00 -0800, H.J. Lu wrote:
> > > So it does:
> > > 1. Enable shadow stack
> > > 2. Call elf libs checking functions
> > > 3. If all good, lock shadow stack. Else, disable shadow stack.
> > > 4. Return from elf checking functions and if shstk is enabled,
> > > don't
> > > underflow because it was enabled in step 1 and we have return
> > > addresses
> > > from 2 on the shadow stack
> > > 
> > > I'm wondering if this can't be improved in glibc to look like:
> > > 1. Check elf libs, and record it somewhere
> > > 2. Wait until just the right spot
> > > 3. If all good, enable and lock shadow stack.
> > 
> > I will try it out.
> > 
> 
> Currently glibc enables shadow stack as early as possible.  There
> are only a few places where a function call in glibc never returns.
> We can enable shadow stack just before calling main.   There are
> quite some code paths without shadow stack protection.   Is this
> an issue?

Thanks for checking. Hmm, does the loader get attacked?
  
H.J. Lu March 10, 2023, 8:43 p.m. UTC | #11
On Fri, Mar 10, 2023 at 12:27 PM Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
>
> On Fri, 2023-03-10 at 12:00 -0800, H.J. Lu wrote:
> > > > So it does:
> > > > 1. Enable shadow stack
> > > > 2. Call elf libs checking functions
> > > > 3. If all good, lock shadow stack. Else, disable shadow stack.
> > > > 4. Return from elf checking functions and if shstk is enabled,
> > > > don't
> > > > underflow because it was enabled in step 1 and we have return
> > > > addresses
> > > > from 2 on the shadow stack
> > > >
> > > > I'm wondering if this can't be improved in glibc to look like:
> > > > 1. Check elf libs, and record it somewhere
> > > > 2. Wait until just the right spot
> > > > 3. If all good, enable and lock shadow stack.
> > >
> > > I will try it out.
> > >
> >
> > Currently glibc enables shadow stack as early as possible.  There
> > are only a few places where a function call in glibc never returns.
> > We can enable shadow stack just before calling main.   There are
> > quite some code paths without shadow stack protection.   Is this
> > an issue?
>
> Thanks for checking. Hmm, does the loader get attacked?

Not I know of.  But there are user codes from .init_array
and .preinit_array which are executed before main.   In theory,
an attack can happen before main.
  
Edgecombe, Rick P March 10, 2023, 9:01 p.m. UTC | #12
On Fri, 2023-03-10 at 12:43 -0800, H.J. Lu wrote:
> On Fri, Mar 10, 2023 at 12:27 PM Edgecombe, Rick P
> <rick.p.edgecombe@intel.com> wrote:
> > 
> > On Fri, 2023-03-10 at 12:00 -0800, H.J. Lu wrote:
> > > > > So it does:
> > > > > 1. Enable shadow stack
> > > > > 2. Call elf libs checking functions
> > > > > 3. If all good, lock shadow stack. Else, disable shadow
> > > > > stack.
> > > > > 4. Return from elf checking functions and if shstk is
> > > > > enabled,
> > > > > don't
> > > > > underflow because it was enabled in step 1 and we have return
> > > > > addresses
> > > > > from 2 on the shadow stack
> > > > > 
> > > > > I'm wondering if this can't be improved in glibc to look
> > > > > like:
> > > > > 1. Check elf libs, and record it somewhere
> > > > > 2. Wait until just the right spot
> > > > > 3. If all good, enable and lock shadow stack.
> > > > 
> > > > I will try it out.
> > > > 
> > > 
> > > Currently glibc enables shadow stack as early as possible.  There
> > > are only a few places where a function call in glibc never
> > > returns.
> > > We can enable shadow stack just before calling main.   There are
> > > quite some code paths without shadow stack protection.   Is this
> > > an issue?
> > 
> > Thanks for checking. Hmm, does the loader get attacked?
> 
> Not I know of.  But there are user codes from .init_array
> and .preinit_array which are executed before main.   In theory,
> an attack can happen before main.

Hmm, it would be nice to not add any startup overhead to non-shadow
stack binaries. I guess it's a tradeoff. Might be worth asking around.

But you can't just enable shadow stack before any user code? It has to
go something like?
1. Execute init codes
2. Check elf libs
3. Enable SHSTK

Or what if you just did the enable-disable dance if the execing binary
itself has shadow stack. If it doesn't have shadow stack, the elf libs
won't change the decision.
  

Patch

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 8d73004e4cac..bd16e012b3e9 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -28,6 +28,7 @@  struct vm86;
 #include <asm/unwind_hints.h>
 #include <asm/vmxfeatures.h>
 #include <asm/vdso/processor.h>
+#include <asm/shstk.h>
 
 #include <linux/personality.h>
 #include <linux/cache.h>
@@ -475,6 +476,11 @@  struct thread_struct {
 	 */
 	u32			pkru;
 
+#ifdef CONFIG_X86_USER_SHADOW_STACK
+	unsigned long		features;
+	unsigned long		features_locked;
+#endif
+
 	/* Floating point and extended processor state */
 	struct fpu		fpu;
 	/*
diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
new file mode 100644
index 000000000000..ec753809f074
--- /dev/null
+++ b/arch/x86/include/asm/shstk.h
@@ -0,0 +1,21 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_SHSTK_H
+#define _ASM_X86_SHSTK_H
+
+#ifndef __ASSEMBLY__
+#include <linux/types.h>
+
+struct task_struct;
+
+#ifdef CONFIG_X86_USER_SHADOW_STACK
+long shstk_prctl(struct task_struct *task, int option, unsigned long features);
+void reset_thread_features(void);
+#else
+static inline long shstk_prctl(struct task_struct *task, int option,
+			       unsigned long arg2) { return -EINVAL; }
+static inline void reset_thread_features(void) {}
+#endif /* CONFIG_X86_USER_SHADOW_STACK */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_X86_SHSTK_H */
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 500b96e71f18..b2b3b7200b2d 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -20,4 +20,10 @@ 
 #define ARCH_MAP_VDSO_32		0x2002
 #define ARCH_MAP_VDSO_64		0x2003
 
+/* Don't use 0x3001-0x3004 because of old glibcs */
+
+#define ARCH_SHSTK_ENABLE		0x5001
+#define ARCH_SHSTK_DISABLE		0x5002
+#define ARCH_SHSTK_LOCK			0x5003
+
 #endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 92446f1dedd7..b366641703e3 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -146,6 +146,8 @@  obj-$(CONFIG_CALL_THUNKS)		+= callthunks.o
 
 obj-$(CONFIG_X86_CET)			+= cet.o
 
+obj-$(CONFIG_X86_USER_SHADOW_STACK)	+= shstk.o
+
 ###
 # 64 bit specific files
 ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4e34b3b68ebd..71094c8a305f 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -514,6 +514,8 @@  start_thread_common(struct pt_regs *regs, unsigned long new_ip,
 		load_gs_index(__USER_DS);
 	}
 
+	reset_thread_features();
+
 	loadsegment(fs, 0);
 	loadsegment(es, _ds);
 	loadsegment(ds, _ds);
@@ -830,7 +832,10 @@  long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
 	case ARCH_MAP_VDSO_64:
 		return prctl_map_vdso(&vdso_image_64, arg2);
 #endif
-
+	case ARCH_SHSTK_ENABLE:
+	case ARCH_SHSTK_DISABLE:
+	case ARCH_SHSTK_LOCK:
+		return shstk_prctl(task, option, arg2);
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
new file mode 100644
index 000000000000..41ed6552e0a5
--- /dev/null
+++ b/arch/x86/kernel/shstk.c
@@ -0,0 +1,44 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * shstk.c - Intel shadow stack support
+ *
+ * Copyright (c) 2021, Intel Corporation.
+ * Yu-cheng Yu <yu-cheng.yu@intel.com>
+ */
+
+#include <linux/sched.h>
+#include <linux/bitops.h>
+#include <asm/prctl.h>
+
+void reset_thread_features(void)
+{
+	current->thread.features = 0;
+	current->thread.features_locked = 0;
+}
+
+long shstk_prctl(struct task_struct *task, int option, unsigned long features)
+{
+	if (option == ARCH_SHSTK_LOCK) {
+		task->thread.features_locked |= features;
+		return 0;
+	}
+
+	/* Don't allow via ptrace */
+	if (task != current)
+		return -EINVAL;
+
+	/* Do not allow to change locked features */
+	if (features & task->thread.features_locked)
+		return -EPERM;
+
+	/* Only support enabling/disabling one feature at a time. */
+	if (hweight_long(features) > 1)
+		return -EINVAL;
+
+	if (option == ARCH_SHSTK_DISABLE) {
+		return -EINVAL;
+	}
+
+	/* Handle ARCH_SHSTK_ENABLE */
+	return -EINVAL;
+}