[PATCHv14,01/17] x86/mm: Rework address range check in get_user() and put_user()

Message ID 20230111123736.20025-2-kirill.shutemov@linux.intel.com
State New
Headers
Series Linear Address Masking enabling |

Commit Message

Kirill A. Shutemov Jan. 11, 2023, 12:37 p.m. UTC
  The functions get_user() and put_user() check that the target address
range resides in the user space portion of the virtual address space.
In order to perform this check, the functions compare the end of the
range against TASK_SIZE_MAX.

For kernels compiled with CONFIG_X86_5LEVEL, this process requires some
additional trickery using ALTERNATIVE, as TASK_SIZE_MAX depends on the
paging mode in use.

Linus suggested that this check could be simplified for 64-bit kernels.
It is sufficient to check bit 63 of the address to ensure that the range
belongs to user space. Additionally, the use of branches can be avoided
by setting the target address to all ones if bit 63 is set.

There's no need to check the end of the access range as there's huge
gap between end of userspace range and start of the kernel range. The
gap consists of canonical hole and unused ranges on both kernel and
userspace sides.

If an address with bit 63 set is passed down, it will trigger a #GP
exception. _ASM_EXTABLE_UA() complains about this. Replace it with
plain _ASM_EXTABLE() as it is expected behaviour now.

The updated get_user() and put_user() checks are also compatible with
Linear Address Masking, which allows user space to encode metadata in
the upper bits of pointers and eliminates the need to untag the address
before handling it.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 arch/x86/lib/getuser.S | 83 ++++++++++++++++--------------------------
 arch/x86/lib/putuser.S | 54 ++++++++++++---------------
 2 files changed, 55 insertions(+), 82 deletions(-)
  

Comments

Peter Zijlstra Jan. 18, 2023, 3:49 p.m. UTC | #1
On Wed, Jan 11, 2023 at 03:37:20PM +0300, Kirill A. Shutemov wrote:
> The functions get_user() and put_user() check that the target address
> range resides in the user space portion of the virtual address space.
> In order to perform this check, the functions compare the end of the
> range against TASK_SIZE_MAX.
> 
> For kernels compiled with CONFIG_X86_5LEVEL, this process requires some
> additional trickery using ALTERNATIVE, as TASK_SIZE_MAX depends on the
> paging mode in use.
> 
> Linus suggested that this check could be simplified for 64-bit kernels.
> It is sufficient to check bit 63 of the address to ensure that the range
> belongs to user space. Additionally, the use of branches can be avoided
> by setting the target address to all ones if bit 63 is set.
> 
> There's no need to check the end of the access range as there's huge
> gap between end of userspace range and start of the kernel range. The
> gap consists of canonical hole and unused ranges on both kernel and
> userspace sides.

So far I can follow, however

> If an address with bit 63 set is passed down, it will trigger a #GP
> exception. _ASM_EXTABLE_UA() complains about this. Replace it with
> plain _ASM_EXTABLE() as it is expected behaviour now.

here I don't. The new logic basically squishes every kernel address to
-1L -- a known unmapped address, but getting that address in
{get,put}_user() is still a fail, right?

We used to manually branch to bad_get_user when outside TASK_SIZE_MAX,
now we rely on #GP.

So why silence it?
  
Linus Torvalds Jan. 18, 2023, 3:59 p.m. UTC | #2
On Wed, Jan 18, 2023 at 7:50 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jan 11, 2023 at 03:37:20PM +0300, Kirill A. Shutemov wrote:
>
> > If an address with bit 63 set is passed down, it will trigger a #GP
> > exception. _ASM_EXTABLE_UA() complains about this. Replace it with
> > plain _ASM_EXTABLE() as it is expected behaviour now.
>
> here I don't. The new logic basically squishes every kernel address to
> -1L -- a known unmapped address, but getting that address in
> {get,put}_user() is still a fail, right?
>
> We used to manually branch to bad_get_user when outside TASK_SIZE_MAX,
> now we rely on #GP.
>
> So why silence it?

We don't silence it - for a kernel address that turns into an all-ones
address, the the _ASM_EXTABLE() will still cause the -EFAULT due to
the page fault.

But it's not the high bit set case that is the problem here.

The problem is a "positive" address that is non-canonical.

Testing against TASK_SIZE_MAX would catch non-canonical addresses
before the access, and we'd return -EFAULT.

But now that we don't test against TASK_SIZE_MAX any more,
non-canonical accesses will cause a GP fault, and *that* message is
what we want to silence.

We'll still return -EFAULT, of course, we're just getting rid of the

        WARN_ONCE(trapnr == X86_TRAP_GP,
                "General protection fault in user access.
Non-canonical address?");

issue that comes from not being so exact about the address limit any more.

                 Linus
  
Peter Zijlstra Jan. 18, 2023, 4:48 p.m. UTC | #3
On Wed, Jan 18, 2023 at 07:59:21AM -0800, Linus Torvalds wrote:

> We don't silence it - for a kernel address that turns into an all-ones
> address, the the _ASM_EXTABLE() will still cause the -EFAULT due to
> the page fault.

> But it's not the high bit set case that is the problem here.

Yes, and the explicit bad_get_user jump would not print the message and
now with _UA removed it won't either (I seem to have my wires crossed
just now).

> The problem is a "positive" address that is non-canonical.
> 
> Testing against TASK_SIZE_MAX would catch non-canonical addresses
> before the access, and we'd return -EFAULT.
> 
> But now that we don't test against TASK_SIZE_MAX any more,
> non-canonical accesses will cause a GP fault, and *that* message is
> what we want to silence.

Right, but I was thinking that we'd explicitly allowed those because
with LAM enabled we'd actually accept those addresses.

> We'll still return -EFAULT, of course, we're just getting rid of the
> 
>         WARN_ONCE(trapnr == X86_TRAP_GP,
>                 "General protection fault in user access.
> Non-canonical address?");
> 
> issue that comes from not being so exact about the address limit any more.

Ah indeed, so for !LAM we'd now print the message were we would not
before (the whole TASK_SIZE_MAX+ range).

OK, agreed.
  
Linus Torvalds Jan. 18, 2023, 5:01 p.m. UTC | #4
On Wed, Jan 18, 2023 at 8:49 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> > We'll still return -EFAULT, of course, we're just getting rid of the
> >
> >         WARN_ONCE(trapnr == X86_TRAP_GP,
> >                 "General protection fault in user access.
> > Non-canonical address?");
> >
> > issue that comes from not being so exact about the address limit any more.
>
> Ah indeed, so for !LAM we'd now print the message were we would not
> before (the whole TASK_SIZE_MAX+ range).

Yeah.

We could just remove that warning entirely, but it has been useful for
syzbot catching random user addresses that weren't caught by
"access_ok()" when people did bad bad things (ie using the
non-checking "__copy_from_user()" and friends).

I'm not sure how much that warning is worth any more - and for
get_user() and put_user() itself it buys us nothing, since by
definition _those_ do the range checking. Christoph getting rid of the
set_fs() model simplified a lot of our user address checking.

But I think it's easier to just keep that existing warning about "how
did you get a non-canonical address here" for other user accesses, and
just make get/put_user() use that _ASM_EXTABLE() version that doesn't
do it.

               Linus
  

Patch

diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index b70d98d79a9d..b64a2bd1a1ef 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -37,22 +37,22 @@ 
 
 #define ASM_BARRIER_NOSPEC ALTERNATIVE "", "lfence", X86_FEATURE_LFENCE_RDTSC
 
-#ifdef CONFIG_X86_5LEVEL
-#define LOAD_TASK_SIZE_MINUS_N(n) \
-	ALTERNATIVE __stringify(mov $((1 << 47) - 4096 - (n)),%rdx), \
-		    __stringify(mov $((1 << 56) - 4096 - (n)),%rdx), X86_FEATURE_LA57
-#else
-#define LOAD_TASK_SIZE_MINUS_N(n) \
-	mov $(TASK_SIZE_MAX - (n)),%_ASM_DX
-#endif
+.macro check_range size:req
+.if IS_ENABLED(CONFIG_X86_64)
+	mov %rax, %rdx
+	sar $63, %rdx
+	or %rdx, %rax
+.else
+	cmp $TASK_SIZE_MAX-\size+1, %eax
+	jae .Lbad_get_user
+	sbb %edx, %edx		/* array_index_mask_nospec() */
+	and %edx, %eax
+.endif
+.endm
 
 	.text
 SYM_FUNC_START(__get_user_1)
-	LOAD_TASK_SIZE_MINUS_N(0)
-	cmp %_ASM_DX,%_ASM_AX
-	jae bad_get_user
-	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
-	and %_ASM_DX, %_ASM_AX
+	check_range size=1
 	ASM_STAC
 1:	movzbl (%_ASM_AX),%edx
 	xor %eax,%eax
@@ -62,11 +62,7 @@  SYM_FUNC_END(__get_user_1)
 EXPORT_SYMBOL(__get_user_1)
 
 SYM_FUNC_START(__get_user_2)
-	LOAD_TASK_SIZE_MINUS_N(1)
-	cmp %_ASM_DX,%_ASM_AX
-	jae bad_get_user
-	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
-	and %_ASM_DX, %_ASM_AX
+	check_range size=2
 	ASM_STAC
 2:	movzwl (%_ASM_AX),%edx
 	xor %eax,%eax
@@ -76,11 +72,7 @@  SYM_FUNC_END(__get_user_2)
 EXPORT_SYMBOL(__get_user_2)
 
 SYM_FUNC_START(__get_user_4)
-	LOAD_TASK_SIZE_MINUS_N(3)
-	cmp %_ASM_DX,%_ASM_AX
-	jae bad_get_user
-	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
-	and %_ASM_DX, %_ASM_AX
+	check_range size=4
 	ASM_STAC
 3:	movl (%_ASM_AX),%edx
 	xor %eax,%eax
@@ -90,30 +82,17 @@  SYM_FUNC_END(__get_user_4)
 EXPORT_SYMBOL(__get_user_4)
 
 SYM_FUNC_START(__get_user_8)
-#ifdef CONFIG_X86_64
-	LOAD_TASK_SIZE_MINUS_N(7)
-	cmp %_ASM_DX,%_ASM_AX
-	jae bad_get_user
-	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
-	and %_ASM_DX, %_ASM_AX
+	check_range size=8
 	ASM_STAC
+#ifdef CONFIG_X86_64
 4:	movq (%_ASM_AX),%rdx
-	xor %eax,%eax
-	ASM_CLAC
-	RET
 #else
-	LOAD_TASK_SIZE_MINUS_N(7)
-	cmp %_ASM_DX,%_ASM_AX
-	jae bad_get_user_8
-	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
-	and %_ASM_DX, %_ASM_AX
-	ASM_STAC
 4:	movl (%_ASM_AX),%edx
 5:	movl 4(%_ASM_AX),%ecx
+#endif
 	xor %eax,%eax
 	ASM_CLAC
 	RET
-#endif
 SYM_FUNC_END(__get_user_8)
 EXPORT_SYMBOL(__get_user_8)
 
@@ -166,7 +145,7 @@  EXPORT_SYMBOL(__get_user_nocheck_8)
 
 SYM_CODE_START_LOCAL(.Lbad_get_user_clac)
 	ASM_CLAC
-bad_get_user:
+.Lbad_get_user:
 	xor %edx,%edx
 	mov $(-EFAULT),%_ASM_AX
 	RET
@@ -184,23 +163,23 @@  SYM_CODE_END(.Lbad_get_user_8_clac)
 #endif
 
 /* get_user */
-	_ASM_EXTABLE_UA(1b, .Lbad_get_user_clac)
-	_ASM_EXTABLE_UA(2b, .Lbad_get_user_clac)
-	_ASM_EXTABLE_UA(3b, .Lbad_get_user_clac)
+	_ASM_EXTABLE(1b, .Lbad_get_user_clac)
+	_ASM_EXTABLE(2b, .Lbad_get_user_clac)
+	_ASM_EXTABLE(3b, .Lbad_get_user_clac)
 #ifdef CONFIG_X86_64
-	_ASM_EXTABLE_UA(4b, .Lbad_get_user_clac)
+	_ASM_EXTABLE(4b, .Lbad_get_user_clac)
 #else
-	_ASM_EXTABLE_UA(4b, .Lbad_get_user_8_clac)
-	_ASM_EXTABLE_UA(5b, .Lbad_get_user_8_clac)
+	_ASM_EXTABLE(4b, .Lbad_get_user_8_clac)
+	_ASM_EXTABLE(5b, .Lbad_get_user_8_clac)
 #endif
 
 /* __get_user */
-	_ASM_EXTABLE_UA(6b, .Lbad_get_user_clac)
-	_ASM_EXTABLE_UA(7b, .Lbad_get_user_clac)
-	_ASM_EXTABLE_UA(8b, .Lbad_get_user_clac)
+	_ASM_EXTABLE(6b, .Lbad_get_user_clac)
+	_ASM_EXTABLE(7b, .Lbad_get_user_clac)
+	_ASM_EXTABLE(8b, .Lbad_get_user_clac)
 #ifdef CONFIG_X86_64
-	_ASM_EXTABLE_UA(9b, .Lbad_get_user_clac)
+	_ASM_EXTABLE(9b, .Lbad_get_user_clac)
 #else
-	_ASM_EXTABLE_UA(9b, .Lbad_get_user_8_clac)
-	_ASM_EXTABLE_UA(10b, .Lbad_get_user_8_clac)
+	_ASM_EXTABLE(9b, .Lbad_get_user_8_clac)
+	_ASM_EXTABLE(10b, .Lbad_get_user_8_clac)
 #endif
diff --git a/arch/x86/lib/putuser.S b/arch/x86/lib/putuser.S
index 32125224fcca..3062d09a776d 100644
--- a/arch/x86/lib/putuser.S
+++ b/arch/x86/lib/putuser.S
@@ -33,20 +33,20 @@ 
  * as they get called from within inline assembly.
  */
 
-#ifdef CONFIG_X86_5LEVEL
-#define LOAD_TASK_SIZE_MINUS_N(n) \
-	ALTERNATIVE __stringify(mov $((1 << 47) - 4096 - (n)),%rbx), \
-		    __stringify(mov $((1 << 56) - 4096 - (n)),%rbx), X86_FEATURE_LA57
-#else
-#define LOAD_TASK_SIZE_MINUS_N(n) \
-	mov $(TASK_SIZE_MAX - (n)),%_ASM_BX
-#endif
+.macro check_range size:req
+.if IS_ENABLED(CONFIG_X86_64)
+	mov %rcx, %rbx
+	sar $63, %rbx
+	or %rbx, %rcx
+.else
+	cmp $TASK_SIZE_MAX-\size+1, %ecx
+	jae .Lbad_put_user
+.endif
+.endm
 
 .text
 SYM_FUNC_START(__put_user_1)
-	LOAD_TASK_SIZE_MINUS_N(0)
-	cmp %_ASM_BX,%_ASM_CX
-	jae .Lbad_put_user
+	check_range size=1
 	ASM_STAC
 1:	movb %al,(%_ASM_CX)
 	xor %ecx,%ecx
@@ -66,9 +66,7 @@  SYM_FUNC_END(__put_user_nocheck_1)
 EXPORT_SYMBOL(__put_user_nocheck_1)
 
 SYM_FUNC_START(__put_user_2)
-	LOAD_TASK_SIZE_MINUS_N(1)
-	cmp %_ASM_BX,%_ASM_CX
-	jae .Lbad_put_user
+	check_range size=2
 	ASM_STAC
 3:	movw %ax,(%_ASM_CX)
 	xor %ecx,%ecx
@@ -88,9 +86,7 @@  SYM_FUNC_END(__put_user_nocheck_2)
 EXPORT_SYMBOL(__put_user_nocheck_2)
 
 SYM_FUNC_START(__put_user_4)
-	LOAD_TASK_SIZE_MINUS_N(3)
-	cmp %_ASM_BX,%_ASM_CX
-	jae .Lbad_put_user
+	check_range size=4
 	ASM_STAC
 5:	movl %eax,(%_ASM_CX)
 	xor %ecx,%ecx
@@ -110,9 +106,7 @@  SYM_FUNC_END(__put_user_nocheck_4)
 EXPORT_SYMBOL(__put_user_nocheck_4)
 
 SYM_FUNC_START(__put_user_8)
-	LOAD_TASK_SIZE_MINUS_N(7)
-	cmp %_ASM_BX,%_ASM_CX
-	jae .Lbad_put_user
+	check_range size=8
 	ASM_STAC
 7:	mov %_ASM_AX,(%_ASM_CX)
 #ifdef CONFIG_X86_32
@@ -144,15 +138,15 @@  SYM_CODE_START_LOCAL(.Lbad_put_user_clac)
 	RET
 SYM_CODE_END(.Lbad_put_user_clac)
 
-	_ASM_EXTABLE_UA(1b, .Lbad_put_user_clac)
-	_ASM_EXTABLE_UA(2b, .Lbad_put_user_clac)
-	_ASM_EXTABLE_UA(3b, .Lbad_put_user_clac)
-	_ASM_EXTABLE_UA(4b, .Lbad_put_user_clac)
-	_ASM_EXTABLE_UA(5b, .Lbad_put_user_clac)
-	_ASM_EXTABLE_UA(6b, .Lbad_put_user_clac)
-	_ASM_EXTABLE_UA(7b, .Lbad_put_user_clac)
-	_ASM_EXTABLE_UA(9b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(1b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(2b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(3b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(4b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(5b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(6b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(7b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(9b, .Lbad_put_user_clac)
 #ifdef CONFIG_X86_32
-	_ASM_EXTABLE_UA(8b, .Lbad_put_user_clac)
-	_ASM_EXTABLE_UA(10b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(8b, .Lbad_put_user_clac)
+	_ASM_EXTABLE(10b, .Lbad_put_user_clac)
 #endif