[-v2] x86/retpoline: Ensure default return thunk isn't used at runtime

Message ID 20240104132446.GEZZaxnrIgIyat0pqf@fat_crate.local
State New
Series [-v2] x86/retpoline: Ensure default return thunk isn't used at runtime

Commit Message

Borislav Petkov Jan. 4, 2024, 1:24 p.m. UTC
  Thoughts? Complaints?

---
From: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Wed, 3 Jan 2024 19:36:26 +0100

Make sure the default return thunk is not used after all return
instructions have been patched by the alternatives because the default
return thunk is insufficient when it comes to mitigating Retbleed or
SRSO.

Fix based on an earlier version by David Kaplan <david.kaplan@amd.com>.

  [ bp: Use the big-fat "NOTICE NOTICE" banner and fix the compilation
    error of warn_thunk_thunk being an invisible symbol. ]

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231010171020.462211-4-david.kaplan@amd.com
---
 arch/x86/entry/calling.h             | 33 ++++++++++++++++++++++++++++
 arch/x86/entry/entry.S               |  4 ++++
 arch/x86/entry/thunk_64.S            | 33 ----------------------------
 arch/x86/include/asm/nospec-branch.h |  2 ++
 arch/x86/kernel/cpu/bugs.c           | 16 ++++++++++++++
 arch/x86/lib/retpoline.S             | 15 +++++--------
 6 files changed, 61 insertions(+), 42 deletions(-)
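
In short: the default return thunk stays in place, but its body is
ALTERNATIVE-patched (X86_FEATURE_ALWAYS, i.e. unconditionally at boot) from
a bare RET into "jmp warn_thunk_thunk", a register-preserving asm THUNK
which calls a C warning function. Any return site which apply_returns()
failed to rewrite then screams instead of silently using the insufficient
default thunk. A minimal sketch of the C side, using the names from the
diff (the actual v2 body is the big NOTICE banner in the patch at the
bottom):

	void __warn_thunk(void)
	{
		/* Reaching this means a return site was left unpatched. */
		pr_warn_once("Unpatched return thunk in use, please report to x86@kernel.org\n");
		dump_stack();
	}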
  

Comments

Borislav Petkov Jan. 4, 2024, 1:26 p.m. UTC | #1
On Thu, Jan 04, 2024 at 02:24:46PM +0100, Borislav Petkov wrote:
> +void __warn_thunk(void)
> +{
> +	pr_warn_once("\n");
> +	pr_warn_once("**********************************************************\n");
> +	pr_warn_once("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
> +	pr_warn_once("**                                                      **\n");
> +	pr_warn_once("**   Unpatched return thunk in use. This should not     **\n");
> +	pr_warn_once("**   happen on a production kernel. Please report this  **\n");
> +	pr_warn_once("**   to x86@kernel.org.                                 **\n");

I'm not yet sure here whether this should say "upstream kernels" because
otherwise we'll get a bunch of reports from distro or whatnot downstream
kernels that we can't really do anything about...

Hmmm.
  
Josh Poimboeuf Feb. 7, 2024, 5:50 p.m. UTC | #2
On Thu, Jan 04, 2024 at 02:26:23PM +0100, Borislav Petkov wrote:
> On Thu, Jan 04, 2024 at 02:24:46PM +0100, Borislav Petkov wrote:
> > +void __warn_thunk(void)
> > +{
> > +	pr_warn_once("\n");
> > +	pr_warn_once("**********************************************************\n");
> > +	pr_warn_once("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
> > +	pr_warn_once("**                                                      **\n");
> > +	pr_warn_once("**   Unpatched return thunk in use. This should not     **\n");
> > +	pr_warn_once("**   happen on a production kernel. Please report this  **\n");
> > +	pr_warn_once("**   to x86@kernel.org.                                 **\n");
> 
> I'm not yet sure here whether this should say "upstream kernels" because
> otherwise we'll get a bunch of reports from distro or whatnot downstream
> kernels that we can't really do anything about...
> 
> Hmmm.

At the very least, the dump_stack() should be a WARN_ON_ONCE().
Otherwise this is actually *more* likely to be ignored since automated
tools don't have a way to catch it: no taint, no "WARNING" string, no
panic_on_warn, etc.

But also, I'm not a fan of the banner.  A warning is enough IMO.

Many/most warnings can be "security" issues.  A production server which
ignores warnings/taints/etc would be a much bigger problem.

And as you say, there are many frankenkernels out there and upstream
doesn't want to be in the business of debugging them.
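
IOW, something like this (a sketch; the diff Boris posts below is what ends
up being applied):

	void __warn_thunk(void)
	{
		/*
		 * Unlike pr_warn_once() + dump_stack(), WARN_ONCE() emits the
		 * standard "WARNING: CPU: n PID: m at file:line" splat, sets
		 * the 'W' (TAINT_WARN) taint flag and honors panic_on_warn,
		 * so automated tooling actually has something to catch.
		 */
		WARN_ONCE(1, "Unpatched return thunk in use. This should not happen!\n");
	}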
  
Borislav Petkov Feb. 7, 2024, 6:53 p.m. UTC | #3
On Wed, Feb 07, 2024 at 09:50:10AM -0800, Josh Poimboeuf wrote:
> And as you say, there are many frankenkernels out there and upstream
> doesn't want to be in the business of debugging them.

Ok, all valid points. Diff on top.

I'll queue it now so that it has ample time to cook in linux-next.

Thx.

---

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 612c9ec456ae..5a300a7bad04 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2853,16 +2853,5 @@ ssize_t cpu_show_gds(struct device *dev, struct device_attribute *attr, char *bu
 
 void __warn_thunk(void)
 {
-	pr_warn_once("\n");
-	pr_warn_once("**********************************************************\n");
-	pr_warn_once("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
-	pr_warn_once("**                                                      **\n");
-	pr_warn_once("**   Unpatched return thunk in use. This should not     **\n");
-	pr_warn_once("**   happen on a production kernel. Please report this  **\n");
-	pr_warn_once("**   to x86@kernel.org.                                 **\n");
-	pr_warn_once("**                                                      **\n");
-	pr_warn_once("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
-	pr_warn_once("**********************************************************\n");
-
-	dump_stack();
+	WARN_ONCE(1, "Unpatched return thunk in use. This should not happen!\n");
 }
  
Josh Poimboeuf Feb. 7, 2024, 7:49 p.m. UTC | #4
On Wed, Feb 07, 2024 at 07:53:28PM +0100, Borislav Petkov wrote:
> On Wed, Feb 07, 2024 at 09:50:10AM -0800, Josh Poimboeuf wrote:
> > And as you say, there are many frankenkernels out there and upstream
> > doesn't want to be in the business of debugging them.
> 
> Ok, all valid points. Diff on top.
> 
> I'll queue it now so that it has ample time to cook in linux-next.
> 
> Thx.
> 
> ---
> 
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 612c9ec456ae..5a300a7bad04 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -2853,16 +2853,5 @@ ssize_t cpu_show_gds(struct device *dev, struct device_attribute *attr, char *bu
>  
>  void __warn_thunk(void)
>  {
> -	pr_warn_once("\n");
> -	pr_warn_once("**********************************************************\n");
> -	pr_warn_once("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
> -	pr_warn_once("**                                                      **\n");
> -	pr_warn_once("**   Unpatched return thunk in use. This should not     **\n");
> -	pr_warn_once("**   happen on a production kernel. Please report this  **\n");
> -	pr_warn_once("**   to x86@kernel.org.                                 **\n");
> -	pr_warn_once("**                                                      **\n");
> -	pr_warn_once("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
> -	pr_warn_once("**********************************************************\n");
> -
> -	dump_stack();
> +	WARN_ONCE(1, "Unpatched return thunk in use. This should not happen!\n");
>  }

LGTM, thanks!
  
Borislav Petkov Feb. 12, 2024, 10:43 a.m. UTC | #5
On Wed, Feb 07, 2024 at 11:49:19AM -0800, Josh Poimboeuf wrote:
> LGTM, thanks!

Thanks, I had to hoist both THUNK macros up into the header to get that
nuisance 32-bit config to build too :)

---

commit 4461438a8405e800f90e0e40409e5f3d07eed381 (HEAD -> refs/heads/tip-x86-bugs)
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date:   Wed Jan 3 19:36:26 2024 +0100

    x86/retpoline: Ensure default return thunk isn't used at runtime
    
    Make sure the default return thunk is not used after all return
    instructions have been patched by the alternatives because the default
    return thunk is insufficient when it comes to mitigating Retbleed or
    SRSO.
    
    Fix based on an earlier version by David Kaplan <david.kaplan@amd.com>.
    
      [ bp: Fix the compilation error of warn_thunk_thunk being an invisible
            symbol, hoist thunk macro into calling.h ]
    
    Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/r/20231010171020.462211-4-david.kaplan@amd.com
    Link: https://lore.kernel.org/r/20240104132446.GEZZaxnrIgIyat0pqf@fat_crate.local

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 39e069b68c6e..bd31b2534053 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -426,3 +426,63 @@ For 32-bit we have the following conventions - kernel is built with
 .endm
 
 #endif /* CONFIG_SMP */
+
+#ifdef CONFIG_X86_64
+
+/* rdi:	arg1 ... normal C conventions. rax is saved/restored. */
+.macro THUNK name, func
+SYM_FUNC_START(\name)
+	pushq %rbp
+	movq %rsp, %rbp
+
+	pushq %rdi
+	pushq %rsi
+	pushq %rdx
+	pushq %rcx
+	pushq %rax
+	pushq %r8
+	pushq %r9
+	pushq %r10
+	pushq %r11
+
+	call \func
+
+	popq %r11
+	popq %r10
+	popq %r9
+	popq %r8
+	popq %rax
+	popq %rcx
+	popq %rdx
+	popq %rsi
+	popq %rdi
+	popq %rbp
+	RET
+SYM_FUNC_END(\name)
+	_ASM_NOKPROBE(\name)
+.endm
+
+#else /* CONFIG_X86_32 */
+
+/* put return address in eax (arg1) */
+.macro THUNK name, func, put_ret_addr_in_eax=0
+SYM_CODE_START_NOALIGN(\name)
+	pushl %eax
+	pushl %ecx
+	pushl %edx
+
+	.if \put_ret_addr_in_eax
+	/* Place EIP in the arg1 */
+	movl 3*4(%esp), %eax
+	.endif
+
+	call \func
+	popl %edx
+	popl %ecx
+	popl %eax
+	RET
+	_ASM_NOKPROBE(\name)
+SYM_CODE_END(\name)
+	.endm
+
+#endif
diff --git a/arch/x86/entry/entry.S b/arch/x86/entry/entry.S
index 8c8d38f0cb1d..582731f74dc8 100644
--- a/arch/x86/entry/entry.S
+++ b/arch/x86/entry/entry.S
@@ -7,6 +7,8 @@
 #include <linux/linkage.h>
 #include <asm/msr-index.h>
 
+#include "calling.h"
+
 .pushsection .noinstr.text, "ax"
 
 SYM_FUNC_START(entry_ibpb)
@@ -20,3 +22,5 @@ SYM_FUNC_END(entry_ibpb)
 EXPORT_SYMBOL_GPL(entry_ibpb);
 
 .popsection
+
+THUNK warn_thunk_thunk, __warn_thunk
diff --git a/arch/x86/entry/thunk_32.S b/arch/x86/entry/thunk_32.S
index 0103e103a657..da37f42f4549 100644
--- a/arch/x86/entry/thunk_32.S
+++ b/arch/x86/entry/thunk_32.S
@@ -4,33 +4,15 @@
  * Copyright 2008 by Steven Rostedt, Red Hat, Inc
  *  (inspired by Andi Kleen's thunk_64.S)
  */
-	#include <linux/export.h>
-	#include <linux/linkage.h>
-	#include <asm/asm.h>
 
-	/* put return address in eax (arg1) */
-	.macro THUNK name, func, put_ret_addr_in_eax=0
-SYM_CODE_START_NOALIGN(\name)
-	pushl %eax
-	pushl %ecx
-	pushl %edx
+#include <linux/export.h>
+#include <linux/linkage.h>
+#include <asm/asm.h>
 
-	.if \put_ret_addr_in_eax
-	/* Place EIP in the arg1 */
-	movl 3*4(%esp), %eax
-	.endif
+#include "calling.h"
 
-	call \func
-	popl %edx
-	popl %ecx
-	popl %eax
-	RET
-	_ASM_NOKPROBE(\name)
-SYM_CODE_END(\name)
-	.endm
-
-	THUNK preempt_schedule_thunk, preempt_schedule
-	THUNK preempt_schedule_notrace_thunk, preempt_schedule_notrace
-	EXPORT_SYMBOL(preempt_schedule_thunk)
-	EXPORT_SYMBOL(preempt_schedule_notrace_thunk)
+THUNK preempt_schedule_thunk, preempt_schedule
+THUNK preempt_schedule_notrace_thunk, preempt_schedule_notrace
+EXPORT_SYMBOL(preempt_schedule_thunk)
+EXPORT_SYMBOL(preempt_schedule_notrace_thunk)
 
diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
index 416b400f39db..119ebdc3d362 100644
--- a/arch/x86/entry/thunk_64.S
+++ b/arch/x86/entry/thunk_64.S
@@ -9,39 +9,6 @@
 #include "calling.h"
 #include <asm/asm.h>
 
-	/* rdi:	arg1 ... normal C conventions. rax is saved/restored. */
-	.macro THUNK name, func
-SYM_FUNC_START(\name)
-	pushq %rbp
-	movq %rsp, %rbp
-
-	pushq %rdi
-	pushq %rsi
-	pushq %rdx
-	pushq %rcx
-	pushq %rax
-	pushq %r8
-	pushq %r9
-	pushq %r10
-	pushq %r11
-
-	call \func
-
-	popq %r11
-	popq %r10
-	popq %r9
-	popq %r8
-	popq %rax
-	popq %rcx
-	popq %rdx
-	popq %rsi
-	popq %rdi
-	popq %rbp
-	RET
-SYM_FUNC_END(\name)
-	_ASM_NOKPROBE(\name)
-	.endm
-
 THUNK preempt_schedule_thunk, preempt_schedule
 THUNK preempt_schedule_notrace_thunk, preempt_schedule_notrace
 EXPORT_SYMBOL(preempt_schedule_thunk)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 2c0679ebe914..55754617eaee 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -357,6 +357,8 @@ extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
 
+extern void __warn_thunk(void);
+
 #ifdef CONFIG_MITIGATION_CALL_DEPTH_TRACKING
 extern void call_depth_return_thunk(void);
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index f2775417bda2..a78892b0f823 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2850,3 +2850,8 @@ ssize_t cpu_show_gds(struct device *dev, struct device_attribute *attr, char *bu
 	return cpu_show_common(dev, attr, buf, X86_BUG_GDS);
 }
 #endif
+
+void __warn_thunk(void)
+{
+	WARN_ONCE(1, "Unpatched return thunk in use. This should not happen!\n");
+}
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 0045153ba222..721b528da9ac 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -369,19 +369,16 @@ SYM_FUNC_END(call_depth_return_thunk)
  * 'JMP __x86_return_thunk' sites are changed to something else by
  * apply_returns().
  *
- * This should be converted eventually to call a warning function which
- * should scream loudly when the default return thunk is called after
- * alternatives have been applied.
- *
- * That warning function cannot BUG() because the bug splat cannot be
- * displayed in all possible configurations, leading to users not really
- * knowing why the machine froze.
+ * The ALTERNATIVE below adds a really loud warning to catch the case
+ * where the insufficient default return thunk ends up getting used for
+ * whatever reason like miscompilation or failure of
+ * objtool/alternatives/etc to patch all the return sites.
  */
 SYM_CODE_START(__x86_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
-	ANNOTATE_UNRET_SAFE
-	ret
+	ALTERNATIVE __stringify(ANNOTATE_UNRET_SAFE; ret), \
+		   "jmp warn_thunk_thunk", X86_FEATURE_ALWAYS
 	int3
 SYM_CODE_END(__x86_return_thunk)
 EXPORT_SYMBOL(__x86_return_thunk)
  

Patch

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index e59d3073e7cf..a4679e8f30ad 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -426,3 +426,36 @@  For 32-bit we have the following conventions - kernel is built with
 .endm
 
 #endif /* CONFIG_SMP */
+
+/* rdi:	arg1 ... normal C conventions. rax is saved/restored. */
+.macro THUNK name, func
+SYM_FUNC_START(\name)
+	pushq %rbp
+	movq %rsp, %rbp
+
+	pushq %rdi
+	pushq %rsi
+	pushq %rdx
+	pushq %rcx
+	pushq %rax
+	pushq %r8
+	pushq %r9
+	pushq %r10
+	pushq %r11
+
+	call \func
+
+	popq %r11
+	popq %r10
+	popq %r9
+	popq %r8
+	popq %rax
+	popq %rcx
+	popq %rdx
+	popq %rsi
+	popq %rdi
+	popq %rbp
+	RET
+SYM_FUNC_END(\name)
+	_ASM_NOKPROBE(\name)
+.endm
diff --git a/arch/x86/entry/entry.S b/arch/x86/entry/entry.S
index 8c8d38f0cb1d..582731f74dc8 100644
--- a/arch/x86/entry/entry.S
+++ b/arch/x86/entry/entry.S
@@ -7,6 +7,8 @@ 
 #include <linux/linkage.h>
 #include <asm/msr-index.h>
 
+#include "calling.h"
+
 .pushsection .noinstr.text, "ax"
 
 SYM_FUNC_START(entry_ibpb)
@@ -20,3 +22,5 @@  SYM_FUNC_END(entry_ibpb)
 EXPORT_SYMBOL_GPL(entry_ibpb);
 
 .popsection
+
+THUNK warn_thunk_thunk, __warn_thunk
diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
index 416b400f39db..119ebdc3d362 100644
--- a/arch/x86/entry/thunk_64.S
+++ b/arch/x86/entry/thunk_64.S
@@ -9,39 +9,6 @@ 
 #include "calling.h"
 #include <asm/asm.h>
 
-	/* rdi:	arg1 ... normal C conventions. rax is saved/restored. */
-	.macro THUNK name, func
-SYM_FUNC_START(\name)
-	pushq %rbp
-	movq %rsp, %rbp
-
-	pushq %rdi
-	pushq %rsi
-	pushq %rdx
-	pushq %rcx
-	pushq %rax
-	pushq %r8
-	pushq %r9
-	pushq %r10
-	pushq %r11
-
-	call \func
-
-	popq %r11
-	popq %r10
-	popq %r9
-	popq %r8
-	popq %rax
-	popq %rcx
-	popq %rdx
-	popq %rsi
-	popq %rdi
-	popq %rbp
-	RET
-SYM_FUNC_END(\name)
-	_ASM_NOKPROBE(\name)
-	.endm
-
 THUNK preempt_schedule_thunk, preempt_schedule
 THUNK preempt_schedule_notrace_thunk, preempt_schedule_notrace
 EXPORT_SYMBOL(preempt_schedule_thunk)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 691ff1ef701b..64b175f03cdb 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -348,6 +348,8 @@  extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
 
+extern void __warn_thunk(void);
+
 #ifdef CONFIG_CALL_DEPTH_TRACKING
 extern void call_depth_return_thunk(void);
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bb0ab8466b91..b96483551299 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2849,3 +2849,19 @@  ssize_t cpu_show_gds(struct device *dev, struct device_attribute *attr, char *bu
 	return cpu_show_common(dev, attr, buf, X86_BUG_GDS);
 }
 #endif
+
+void __warn_thunk(void)
+{
+	pr_warn_once("\n");
+	pr_warn_once("**********************************************************\n");
+	pr_warn_once("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
+	pr_warn_once("**                                                      **\n");
+	pr_warn_once("**   Unpatched return thunk in use. This should not     **\n");
+	pr_warn_once("**   happen on a production kernel. Please report this  **\n");
+	pr_warn_once("**   to x86@kernel.org.                                 **\n");
+	pr_warn_once("**                                                      **\n");
+	pr_warn_once("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
+	pr_warn_once("**********************************************************\n");
+
+	dump_stack();
+}
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 7b2589877d06..5ed0c22f5351 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -369,19 +369,16 @@  SYM_FUNC_END(call_depth_return_thunk)
  * 'JMP __x86_return_thunk' sites are changed to something else by
  * apply_returns().
  *
- * This should be converted eventually to call a warning function which
- * should scream loudly when the default return thunk is called after
- * alternatives have been applied.
- *
- * That warning function cannot BUG() because the bug splat cannot be
- * displayed in all possible configurations, leading to users not really
- * knowing why the machine froze.
+ * The ALTERNATIVE below adds a really loud warning to catch the case
+ * where the insufficient default return thunk ends up getting used for
+ * whatever reason like miscompilation or failure of
+ * objtool/alternatives/etc to patch all the return sites.
  */
 SYM_CODE_START(__x86_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
-	ANNOTATE_UNRET_SAFE
-	ret
+	ALTERNATIVE __stringify(ANNOTATE_UNRET_SAFE; ret), \
+		   "jmp warn_thunk_thunk", X86_FEATURE_ALWAYS
 	int3
 SYM_CODE_END(__x86_return_thunk)
 EXPORT_SYMBOL(__x86_return_thunk)