[1/3] Revert "x86/retpoline: Remove .text..__x86.return_thunk section"

Message ID 20231010171020.462211-2-david.kaplan@amd.com
State New
Headers
Series Ensure default return thunk isn't used at runtime |

Commit Message

Kaplan, David Oct. 10, 2023, 5:10 p.m. UTC
  This reverts commit e92626af3234708fe30f53b269d210d202b95206.

This commit broke patching of the return thunk jmp in the retpoline
sequence.

Before (broken sequence):

objdump -d -r arch/x86/lib/retpoline.o:
0000000000000000 <__x86_indirect_thunk_array>:
   ...
   a:   e9 d1 02 00 00          jmpq   2e0 <__x86_return_thunk>

live disassembly at runtime:
0xffffffff81d12a8a <+10>:    jmpq   0xffffffff81d12d60
<__x86_return_thunk>

This jmp to the default return thunk should not happen after alternatives
patching.

After reverting this:

objdump -d -r arch/x86/lib/retpoline.o:
0000000000000000 <__x86_indirect_thunk_array>:
   ...
   a:   e9 00 00 00 00          jmpq   f <__x86_indirect_thunk_array+0xf>
                        b: R_X86_64_PLT32       __x86_return_thunk-0x4

live disassembly at runtime:
 0xffffffff81d12a8a <+10>:    jmpq   0xffffffff81f0410b
<srso_alias_return_thunk>

This is correct as the jmp is written to the correct return sequence.

objtool (add_jump_destinations()) only recognizes return thunk jmps that have
relocation entries, which will not occur if the return thunk is in the
same section as the indirect thunks.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
---
 arch/x86/kernel/vmlinux.lds.S | 3 +++
 arch/x86/lib/retpoline.S      | 2 ++
 2 files changed, 5 insertions(+)
  

Comments

Peter Zijlstra Oct. 10, 2023, 5:48 p.m. UTC | #1
On Tue, Oct 10, 2023 at 12:10:18PM -0500, David Kaplan wrote:

>  arch/x86/kernel/vmlinux.lds.S | 3 +++
>  arch/x86/lib/retpoline.S      | 2 ++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 9cdb1a7332c4..54a5596adaa6 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -132,7 +132,10 @@ SECTIONS
>  		LOCK_TEXT
>  		KPROBES_TEXT
>  		SOFTIRQENTRY_TEXT
> +#ifdef CONFIG_RETPOLINE
>  		*(.text..__x86.indirect_thunk)
> +		*(.text..__x86.return_thunk)
> +#endif
>  		STATIC_CALL_TEXT
>  
>  		ALIGN_ENTRY_TEXT_BEGIN
> diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
> index db813113e637..3da768a71cf9 100644
> --- a/arch/x86/lib/retpoline.S
> +++ b/arch/x86/lib/retpoline.S
> @@ -129,6 +129,8 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
>  
>  #ifdef CONFIG_RETHUNK

Perhaps elucidate the future reader with a comment here? Lest someone
tries removing it again.

>  
> +	.section .text..__x86.return_thunk
> +
>  #ifdef CONFIG_CPU_SRSO
  
Josh Poimboeuf Oct. 10, 2023, 7:57 p.m. UTC | #2
On Tue, Oct 10, 2023 at 07:48:33PM +0200, Peter Zijlstra wrote:
> On Tue, Oct 10, 2023 at 12:10:18PM -0500, David Kaplan wrote:
> 
> >  arch/x86/kernel/vmlinux.lds.S | 3 +++
> >  arch/x86/lib/retpoline.S      | 2 ++
> >  2 files changed, 5 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> > index 9cdb1a7332c4..54a5596adaa6 100644
> > --- a/arch/x86/kernel/vmlinux.lds.S
> > +++ b/arch/x86/kernel/vmlinux.lds.S
> > @@ -132,7 +132,10 @@ SECTIONS
> >  		LOCK_TEXT
> >  		KPROBES_TEXT
> >  		SOFTIRQENTRY_TEXT
> > +#ifdef CONFIG_RETPOLINE
> >  		*(.text..__x86.indirect_thunk)
> > +		*(.text..__x86.return_thunk)
> > +#endif
> >  		STATIC_CALL_TEXT
> >  
> >  		ALIGN_ENTRY_TEXT_BEGIN
> > diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
> > index db813113e637..3da768a71cf9 100644
> > --- a/arch/x86/lib/retpoline.S
> > +++ b/arch/x86/lib/retpoline.S
> > @@ -129,6 +129,8 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
> >  
> >  #ifdef CONFIG_RETHUNK
> 
> Perhaps elucidate the future reader with a comment here? Lest someone
> tries removing it again.

Yes, please.

Also we could make objtool properly detect the non-relocated jump
target.  But this works for now.
  
Borislav Petkov Oct. 10, 2023, 8:04 p.m. UTC | #3
On Tue, Oct 10, 2023 at 12:57:21PM -0700, Josh Poimboeuf wrote:
> Also we could make objtool properly detect the non-relocated jump
> target.

I was wondering about that... I guess it can compute the JMP target and
compare it to the address of __x86_return_thunk?
  
Josh Poimboeuf Oct. 10, 2023, 8:19 p.m. UTC | #4
On Tue, Oct 10, 2023 at 10:04:29PM +0200, Borislav Petkov wrote:
> On Tue, Oct 10, 2023 at 12:57:21PM -0700, Josh Poimboeuf wrote:
> > Also we could make objtool properly detect the non-relocated jump
> > target.
> 
> I was wondering about that... I guess it can compute the JMP target and
> compare it to the address of __x86_return_thunk?

Fine, you twisted my arm ;-)

This seems to do the trick.  Lemme write up a proper patch.

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e308d1ba664e..6cbc9812a36e 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1610,6 +1610,11 @@ static int add_jump_destinations(struct objtool_file *file)
 			return -1;
 		}
 
+		if (jump_dest->sym && jump_dest->sym->return_thunk) {
+			add_return_call(file, insn, true);
+			continue;
+		}
+
 		/*
 		 * Cross-function jump.
 		 */
  
Kaplan, David Oct. 10, 2023, 8:40 p.m. UTC | #5
[AMD Official Use Only - General]

> -----Original Message-----
> From: Josh Poimboeuf <jpoimboe@kernel.org>
> Sent: Tuesday, October 10, 2023 3:19 PM
> To: Borislav Petkov <bp@alien8.de>
> Cc: Peter Zijlstra <peterz@infradead.org>; Kaplan, David
> <David.Kaplan@amd.com>; x86@kernel.org; luto@kernel.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH 1/3] Revert "x86/retpoline: Remove
> .text..__x86.return_thunk section"
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> On Tue, Oct 10, 2023 at 10:04:29PM +0200, Borislav Petkov wrote:
> > On Tue, Oct 10, 2023 at 12:57:21PM -0700, Josh Poimboeuf wrote:
> > > Also we could make objtool properly detect the non-relocated jump
> > > target.
> >
> > I was wondering about that... I guess it can compute the JMP target
> > and compare it to the address of __x86_return_thunk?
>
> Fine, you twisted my arm ;-)
>
> This seems to do the trick.  Lemme write up a proper patch.
>
> diff --git a/tools/objtool/check.c b/tools/objtool/check.c index
> e308d1ba664e..6cbc9812a36e 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -1610,6 +1610,11 @@ static int add_jump_destinations(struct
> objtool_file *file)
>                         return -1;
>                 }
>
> +               if (jump_dest->sym && jump_dest->sym->return_thunk) {
> +                       add_return_call(file, insn, true);
> +                       continue;
> +               }
> +
>                 /*
>                  * Cross-function jump.
>                  */

This looks good to me in my testing so far.

--David Kaplan
  
Josh Poimboeuf Oct. 10, 2023, 9:22 p.m. UTC | #6
On Tue, Oct 10, 2023 at 01:19:12PM -0700, Josh Poimboeuf wrote:
> On Tue, Oct 10, 2023 at 10:04:29PM +0200, Borislav Petkov wrote:
> > On Tue, Oct 10, 2023 at 12:57:21PM -0700, Josh Poimboeuf wrote:
> > > Also we could make objtool properly detect the non-relocated jump
> > > target.
> > 
> > I was wondering about that... I guess it can compute the JMP target and
> > compare it to the address of __x86_return_thunk?
> 
> Fine, you twisted my arm ;-)
> 
> This seems to do the trick.  Lemme write up a proper patch.

Here is said patch.

---8<---

From: Josh Poimboeuf <jpoimboe@kernel.org>
Subject: [PATCH] objtool: Fix return thunk patching in retpolines

With CONFIG_RETHUNK enabled, the compiler replaces every RET with a tail
call to a return thunk ('JMP __x86_return_thunk').  Objtool annotates
all such return sites so they can be patched during boot by
apply_returns().

The implementation of __x86_return_thunk() is just a bare RET.  It's
only meant to be used temporarily until apply_returns() patches all
return sites with either a JMP to another return thunk or an actual RET.

The following commit

  e92626af3234 ("x86/retpoline: Remove .text..__x86.return_thunk section") retpolines

broke objtool's detection of return sites in retpolines.  Since
retpolines and return thunks are now in the same section, the compiler
no longer uses relocations for the intra-section jumps between the
retpolines and the return thunk, causing objtool to overlook them.

As a result, none of the retpolines' return sites get patched.  Each one
stays at 'JMP __x86_return_thunk', effectively a bare RET.

Fix it by teaching objtool to detect when a non-relocated jump target is
a return thunk.

Fixes: e92626af3234 ("x86/retpoline: Remove .text..__x86.return_thunk section")
Reported-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 tools/objtool/check.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e308d1ba664e..556469db4239 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1610,6 +1610,15 @@ static int add_jump_destinations(struct objtool_file *file)
 			return -1;
 		}
 
+		/*
+		 * Since retpolines are in the same section as the return
+		 * thunk, they might not use a relocation when branching to it.
+		 */
+		if (jump_dest->sym && jump_dest->sym->return_thunk) {
+			add_return_call(file, insn, true);
+			continue;
+		}
+
 		/*
 		 * Cross-function jump.
 		 */
  
Peter Zijlstra Oct. 11, 2023, 7:41 a.m. UTC | #7
On Tue, Oct 10, 2023 at 02:22:54PM -0700, Josh Poimboeuf wrote:

> From: Josh Poimboeuf <jpoimboe@kernel.org>
> Subject: [PATCH] objtool: Fix return thunk patching in retpolines
> 
> With CONFIG_RETHUNK enabled, the compiler replaces every RET with a tail
> call to a return thunk ('JMP __x86_return_thunk').  Objtool annotates
> all such return sites so they can be patched during boot by
> apply_returns().
> 
> The implementation of __x86_return_thunk() is just a bare RET.  It's
> only meant to be used temporarily until apply_returns() patches all
> return sites with either a JMP to another return thunk or an actual RET.
> 
> The following commit
> 
>   e92626af3234 ("x86/retpoline: Remove .text..__x86.return_thunk section") retpolines
> 
> broke objtool's detection of return sites in retpolines.  Since
> retpolines and return thunks are now in the same section, the compiler
> no longer uses relocations for the intra-section jumps between the
> retpolines and the return thunk, causing objtool to overlook them.
> 
> As a result, none of the retpolines' return sites get patched.  Each one
> stays at 'JMP __x86_return_thunk', effectively a bare RET.
> 
> Fix it by teaching objtool to detect when a non-relocated jump target is
> a return thunk.
> 
> Fixes: e92626af3234 ("x86/retpoline: Remove .text..__x86.return_thunk section")
> Reported-by: David Kaplan <david.kaplan@amd.com>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
>  tools/objtool/check.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> index e308d1ba664e..556469db4239 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -1610,6 +1610,15 @@ static int add_jump_destinations(struct objtool_file *file)
>  			return -1;
>  		}
>  
> +		/*
> +		 * Since retpolines are in the same section as the return
> +		 * thunk, they might not use a relocation when branching to it.
> +		 */
> +		if (jump_dest->sym && jump_dest->sym->return_thunk) {
> +			add_return_call(file, insn, true);
> +			continue;
> +		}

*urgh*... I mean, yes, that obviously works, but should we not also have
the retpoline thingy for consistency? That case makes less sense though
:/

Perhaps warn about this instead of fixing it? Forcing people to play the
section game?

I dunno.. no real strong opinions.
  
Borislav Petkov Oct. 11, 2023, 9:34 a.m. UTC | #8
On Wed, Oct 11, 2023 at 09:41:42AM +0200, Peter Zijlstra wrote:
> *urgh*... I mean, yes, that obviously works, but should we not also have
> the retpoline thingy for consistency? That case makes less sense though
> :/
> 
> Perhaps warn about this instead of fixing it? Forcing people to play the
> section game?

I like the conservative aspect of that: keep the separate sections and
warn if the return thunk is in the same section.

Thx.
  
Josh Poimboeuf Oct. 11, 2023, 4:28 p.m. UTC | #9
On Wed, Oct 11, 2023 at 09:41:42AM +0200, Peter Zijlstra wrote:
> > +++ b/tools/objtool/check.c
> > @@ -1610,6 +1610,15 @@ static int add_jump_destinations(struct objtool_file *file)
> >  			return -1;
> >  		}
> >  
> > +		/*
> > +		 * Since retpolines are in the same section as the return
> > +		 * thunk, they might not use a relocation when branching to it.
> > +		 */
> > +		if (jump_dest->sym && jump_dest->sym->return_thunk) {
> > +			add_return_call(file, insn, true);
> > +			continue;
> > +		}
> 
> *urgh*... I mean, yes, that obviously works, but should we not also have
> the retpoline thingy for consistency? That case makes less sense though
> :/

Consistency with what?  The extra section seems pointless but maybe I'm
missing something.
  
Peter Zijlstra Oct. 11, 2023, 10:35 p.m. UTC | #10
On Wed, Oct 11, 2023 at 09:28:43AM -0700, Josh Poimboeuf wrote:
> On Wed, Oct 11, 2023 at 09:41:42AM +0200, Peter Zijlstra wrote:
> > > +++ b/tools/objtool/check.c
> > > @@ -1610,6 +1610,15 @@ static int add_jump_destinations(struct objtool_file *file)
> > >  			return -1;
> > >  		}
> > >  
> > > +		/*
> > > +		 * Since retpolines are in the same section as the return
> > > +		 * thunk, they might not use a relocation when branching to it.
> > > +		 */
> > > +		if (jump_dest->sym && jump_dest->sym->return_thunk) {
> > > +			add_return_call(file, insn, true);
> > > +			continue;
> > > +		}
> > 
> > *urgh*... I mean, yes, that obviously works, but should we not also have
> > the retpoline thingy for consistency? That case makes less sense though
> > :/
> 
> Consistency with what? 

the reloc case; specifically, I was thinking something along these
lines:

		if (jump-dest->sym && jump_dest->sym->retpoline_thunk) {
			add_retpoline_call(file, insn);
			continue;
		}

Then both reloc and immediate versions are more or less the same.

> The extra section seems pointless but maybe I'm missing something.

By having the section things are better delineated I suppose, be it
retpolines or rethunks, all references should be to inside the section
(and thus have a reloc) while within the section there should never be a
reference to itself.

I'm not sure it's worth much, but then we can have the above two cases
issue a WARN instead of fixing it up.

I don't care too deeply, I can't make up my mind either way. But perhaps
keeping the section is easier on all the backports, it's easy to forget
a tiny objtool patch like this.
  
Ingo Molnar Oct. 11, 2023, 10:42 p.m. UTC | #11
* Peter Zijlstra <peterz@infradead.org> wrote:

> I don't care too deeply, I can't make up my mind either way. But perhaps 
> keeping the section is easier on all the backports, it's easy to forget a 
> tiny objtool patch like this.

If the objtool fix has a Fixes tag that points to one of the major 
mitigation commits, then it won't be forgotten.

Arguably the new objtool is more robust against what could happen, so that 
patch is not going away - and with that patch mainline doesn't have to keep 
the (now ...) pointless section.

Maybe change the commit order around: first add the objtool fix, then 
remove the section, pointing back to the objtool SHA1 in the very next 
commit, explaining that the objtool fix enables this change. That makes it 
all backporting-proof as well.

Thanks,

	Ingo
  
Josh Poimboeuf Oct. 12, 2023, 2:27 a.m. UTC | #12
On Thu, Oct 12, 2023 at 12:35:13AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 11, 2023 at 09:28:43AM -0700, Josh Poimboeuf wrote:
> > On Wed, Oct 11, 2023 at 09:41:42AM +0200, Peter Zijlstra wrote:
> > > > +++ b/tools/objtool/check.c
> > > > @@ -1610,6 +1610,15 @@ static int add_jump_destinations(struct objtool_file *file)
> > > >  			return -1;
> > > >  		}
> > > >  
> > > > +		/*
> > > > +		 * Since retpolines are in the same section as the return
> > > > +		 * thunk, they might not use a relocation when branching to it.
> > > > +		 */
> > > > +		if (jump_dest->sym && jump_dest->sym->return_thunk) {
> > > > +			add_return_call(file, insn, true);
> > > > +			continue;
> > > > +		}
> > > 
> > > *urgh*... I mean, yes, that obviously works, but should we not also have
> > > the retpoline thingy for consistency? That case makes less sense though
> > > :/
> > 
> > Consistency with what? 
> 
> the reloc case; specifically, I was thinking something along these
> lines:
> 
> 		if (jump-dest->sym && jump_dest->sym->retpoline_thunk) {
> 			add_retpoline_call(file, insn);
> 			continue;
> 		}
> 
> Then both reloc and immediate versions are more or less the same.

Ok, I see.  If somebody were to do a {JMP,CALL}_NOSPEC somewhere in
retpoline.S, it would break similarly.

It doesn't hurt to add that to the patch, that way retpoline/rethunk
site detection is all handled the same.

> > The extra section seems pointless but maybe I'm missing something.
> 
> By having the section things are better delineated I suppose, be it
> retpolines or rethunks, all references should be to inside the section
> (and thus have a reloc) while within the section there should never be
> a reference to itself.

As far as delineating things goes, a good argument could be made to
instead do that on the source code level.  i.e., put the rethunk code in
rethunk.S rather than retpoline.S.  Incidentally, that would also fix
this problem.

From an object code standpoint, objtool is the only one who cares about
the relocs.  It's a good idea to make objtool more robust against
non-relocs regardless, as the reloc assumption could always be broken
later by LTO.

BTW, I wonder if we can also get rid of .text..__x86.indirect_thunk and
just put most of the retpoline/rethunk code in .text (other than than
the SRSO aliasing hacks which still need special placement).
  
Peter Zijlstra Oct. 12, 2023, 8:16 a.m. UTC | #13
On Wed, Oct 11, 2023 at 07:27:58PM -0700, Josh Poimboeuf wrote:
> From an object code standpoint, objtool is the only one who cares about
> the relocs.  It's a good idea to make objtool more robust against
> non-relocs regardless, as the reloc assumption could always be broken
> later by LTO.

AFAIK LTO runs before the actual link stage and doesn't resolve
inter-section references, but yeah, that might just be an implementation
detail.

Fair enough.
  

Patch

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 9cdb1a7332c4..54a5596adaa6 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -132,7 +132,10 @@  SECTIONS
 		LOCK_TEXT
 		KPROBES_TEXT
 		SOFTIRQENTRY_TEXT
+#ifdef CONFIG_RETPOLINE
 		*(.text..__x86.indirect_thunk)
+		*(.text..__x86.return_thunk)
+#endif
 		STATIC_CALL_TEXT
 
 		ALIGN_ENTRY_TEXT_BEGIN
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index db813113e637..3da768a71cf9 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -129,6 +129,8 @@  SYM_CODE_END(__x86_indirect_jump_thunk_array)
 
 #ifdef CONFIG_RETHUNK
 
+	.section .text..__x86.return_thunk
+
 #ifdef CONFIG_CPU_SRSO
 
 /*