[-next,v2] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()

Message ID 20221102081620.1465154-1-zhongbaisong@huawei.com
State New
Series [-next,v2] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()

Commit Message

Baisong Zhong Nov. 2, 2022, 8:16 a.m. UTC
We got a syzkaller report of an aarch64 alignment fault when KFENCE
is enabled.

When the size passed in from the user bpf program is an odd number,
like 399, 407, etc., it causes an unaligned access to the struct
skb_shared_info. As seen below:

BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032

Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
 __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
 arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
 arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
 atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
 __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
 skb_clone+0xf4/0x214 net/core/skbuff.c:1481
 ____bpf_clone_redirect net/core/filter.c:2433 [inline]
 bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
 bpf_prog_d3839dd9068ceb51+0x80/0x330
 bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
 bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
 bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
 bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
 __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
 __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381

kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407, cache=kmalloc-512

allocated by task 15074 on cpu 0 at 1342.585390s:
 kmalloc include/linux/slab.h:568 [inline]
 kzalloc include/linux/slab.h:675 [inline]
 bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
 bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
 bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
 __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
 __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
 __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381

To fix the problem, we adjust @size so that (@size + @headroom) is a
multiple of SMP_CACHE_BYTES. This ensures that the struct
skb_shared_info is aligned to a cache line.
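
For reference, a minimal sketch of the rounding involved; SKB_DATA_ALIGN()
comes from include/linux/skbuff.h, and the 64-byte cache line below is an
assumption matching common arm64 configurations:

  /* include/linux/skbuff.h: round X up to a full cache line. */
  #define SKB_DATA_ALIGN(X) ALIGN(X, SMP_CACHE_BYTES)

  /*
   * With SMP_CACHE_BYTES == 64 and the size from the report:
   *   SKB_DATA_ALIGN(407) == 448
   * so the skb_shared_info placed behind the data area now starts on a
   * cache-line boundary instead of at an odd offset such as 407.
   */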

Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
---
v2: use SKB_DATA_ALIGN instead of kmalloc_size_roundup
---
 net/bpf/test_run.c | 1 +
 1 file changed, 1 insertion(+)
  

Comments

patchwork-bot+netdevbpf@kernel.org Nov. 4, 2022, 3:30 p.m. UTC | #1
Hello:

This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Wed, 2 Nov 2022 16:16:20 +0800 you wrote:
> We got a syzkaller report of an aarch64 alignment fault when KFENCE
> is enabled.
> 
> When the size passed in from the user bpf program is an odd number,
> like 399, 407, etc., it causes an unaligned access to the struct
> skb_shared_info. As seen below:
> 
> [...]

Here is the summary with links:
  - [-next,v2] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
    https://git.kernel.org/bpf/bpf/c/d3fd203f36d4

You are awesome, thank you!
  
Alexander Potapenko Nov. 4, 2022, 5:06 p.m. UTC | #2
On Wed, Nov 2, 2022 at 9:16 AM Baisong Zhong <zhongbaisong@huawei.com> wrote:
>
> We got a syzkaller report of an aarch64 alignment fault when KFENCE
> is enabled.
>
> When the size passed in from the user bpf program is an odd number,
> like 399, 407, etc., it causes an unaligned access to the struct
> skb_shared_info. As seen below:
>
> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032

It's interesting that KFENCE is reporting a UAF without a deallocation
stack here.

Looks like an unaligned access to 0xffff6254fffac077 causes the ARM
CPU to throw a fault that is handled by __do_kernel_fault().
This isn't technically a page fault, but the access address still
gets passed to kfence_handle_page_fault(), which defaults to
reporting a use-after-free because the address belongs to the object
page, not the redzone page.
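
A simplified sketch of that classification (assuming the usual KFENCE pool
layout, where guard pages alternate with object pages; report_oob() and
report_uaf() below are hypothetical stand-ins for the real reporting paths
in mm/kfence/core.c):

  extern char *__kfence_pool;                 /* start of the KFENCE pool */
  extern bool report_oob(unsigned long addr); /* stand-in */
  extern bool report_uaf(unsigned long addr); /* stand-in */

  static bool kfence_classify_sketch(unsigned long addr)
  {
          unsigned long idx = (addr - (unsigned long)__kfence_pool) / PAGE_SIZE;

          if (idx % 2)
                  return report_oob(addr); /* guard (redzone) page hit */

          /*
           * Object page: normally reachable only after the object was
           * freed and its page protected, hence "use-after-free" -- but
           * an alignment fault on a live object lands here as well.
           */
          return report_uaf(addr);
  }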

Catalin, Mark, what is the right way to only handle traps caused by
reading/writing to a page for which `set_memory_valid(addr, 1, 0)` was
called?

> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
>  __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
>  arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
>  arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
>  atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
>  __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
>  skb_clone+0xf4/0x214 net/core/skbuff.c:1481
>  ____bpf_clone_redirect net/core/filter.c:2433 [inline]
>  bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
>  bpf_prog_d3839dd9068ceb51+0x80/0x330
>  bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
>  bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
>  bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
>  bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>  __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>  __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>
> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407, cache=kmalloc-512
>
> allocated by task 15074 on cpu 0 at 1342.585390s:
>  kmalloc include/linux/slab.h:568 [inline]
>  kzalloc include/linux/slab.h:675 [inline]
>  bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
>  bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
>  bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>  __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>  __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>  __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
>
> To fix the problem, we adjust @size so that (@size + @headroom) is a
> multiple of SMP_CACHE_BYTES. This ensures that the struct
> skb_shared_info is aligned to a cache line.
>
> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
> ---
> v2: use SKB_DATA_ALIGN instead of kmalloc_size_roundup
> ---
>  net/bpf/test_run.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 4b855af267b1..bfdd7484b93f 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -259,6 +259,7 @@ static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
>         if (user_size > size)
>                 return ERR_PTR(-EMSGSIZE);
>
> +       size = SKB_DATA_ALIGN(size);
>         data = kzalloc(size + headroom + tailroom, GFP_USER);
>         if (!data)
>                 return ERR_PTR(-ENOMEM);
> --
> 2.25.1
>


--
Alexander Potapenko
  
Mark Rutland Nov. 7, 2022, 10:33 a.m. UTC | #3
On Fri, Nov 04, 2022 at 06:06:05PM +0100, Alexander Potapenko wrote:
> On Wed, Nov 2, 2022 at 9:16 AM Baisong Zhong <zhongbaisong@huawei.com> wrote:
> >
> > We got a syzkaller report of an aarch64 alignment fault when KFENCE
> > is enabled.
> >
> > When the size passed in from the user bpf program is an odd number,
> > like 399, 407, etc., it causes an unaligned access to the struct
> > skb_shared_info. As seen below:
> >
> > BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
> 
> It's interesting that KFENCE is reporting a UAF without a deallocation
> stack here.
> 
> Looks like an unaligned access to 0xffff6254fffac077 causes the ARM
> CPU to throw a fault that is handled by __do_kernel_fault().

Importantly, an unaligned *atomic*, which is a bug regardless of KFENCE.

> This isn't technically a page fault, but the access address still
> gets passed to kfence_handle_page_fault(), which defaults to
> reporting a use-after-free because the address belongs to the object
> page, not the redzone page.
> 
> Catalin, Mark, what is the right way to only handle traps caused by
> reading/writing to a page for which `set_memory_valid(addr, 1, 0)` was
> called?

That should appear as a translation fault, so we could add an
is_el1_translation_fault() helper for that. I can't immediately recall how
misaligned atomics are presented, but I presume as something other than a
translation fault.

If the below works for you, I can go spin that as a real patch.

Mark.

---->8----
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 5b391490e045b..1de4b6afa8515 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -239,6 +239,11 @@ static bool is_el1_data_abort(unsigned long esr)
        return ESR_ELx_EC(esr) == ESR_ELx_EC_DABT_CUR;
 }
 
+static bool is_el1_translation_fault(unsigned long esr)
+{
+       return (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_FAULT;
+}
+
 static inline bool is_el1_permission_fault(unsigned long addr, unsigned long esr,
                                           struct pt_regs *regs)
 {
@@ -385,7 +390,8 @@ static void __do_kernel_fault(unsigned long addr, unsigned long esr,
        } else if (addr < PAGE_SIZE) {
                msg = "NULL pointer dereference";
        } else {
-               if (kfence_handle_page_fault(addr, esr & ESR_ELx_WNR, regs))
+               if (is_el1_translation_fault(esr) &&
+                   kfence_handle_page_fault(addr, esr & ESR_ELx_WNR, regs))
                        return;
 
                msg = "paging request";
  
Alexander Potapenko Nov. 7, 2022, 1:17 p.m. UTC | #4
On Mon, Nov 7, 2022 at 11:33 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Fri, Nov 04, 2022 at 06:06:05PM +0100, Alexander Potapenko wrote:
> > On Wed, Nov 2, 2022 at 9:16 AM Baisong Zhong <zhongbaisong@huawei.com> wrote:
> > >
> > > We got a syzkaller report of an aarch64 alignment fault when KFENCE
> > > is enabled.
> > >
> > > When the size passed in from the user bpf program is an odd number,
> > > like 399, 407, etc., it causes an unaligned access to the struct
> > > skb_shared_info. As seen below:
> > >
> > > BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
> >
> > It's interesting that KFENCE is reporting a UAF without a deallocation
> > stack here.
> >
> > Looks like an unaligned access to 0xffff6254fffac077 causes the ARM
> > CPU to throw a fault that is handled by __do_kernel_fault().
>
> Importantly, an unaligned *atomic*, which is a bug regardless of KFENCE.
>
> > This isn't technically a page fault, but the access address still
> > gets passed to kfence_handle_page_fault(), which defaults to
> > reporting a use-after-free because the address belongs to the object
> > page, not the redzone page.
> >
> > Catalin, Mark, what is the right way to only handle traps caused by
> > reading/writing to a page for which `set_memory_valid(addr, 1, 0)` was
> > called?
>
> That should appear as a translation fault, so we could add an
> is_el1_translation_fault() helper for that. I can't immediately recall how
> misaligned atomics are presented, but I presume as something other than a
> translation fault.
>
> If the below works for you, I can go spin that as a real patch.

Thanks!
It works for me in QEMU (doesn't report UAF for an unaligned atomic
access and doesn't break the original KFENCE tests), and matches my
reading of https://developer.arm.com/documentation/ddi0595/2020-12/AArch64-Registers/ESR-EL1--Exception-Syndrome-Register--EL1-

Feel free to add:
  Reviewed-by: Alexander Potapenko <glider@google.com>
  Tested-by: Alexander Potapenko <glider@google.com>

> Mark.
>
> ---->8----
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 5b391490e045b..1de4b6afa8515 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -239,6 +239,11 @@ static bool is_el1_data_abort(unsigned long esr)
>         return ESR_ELx_EC(esr) == ESR_ELx_EC_DABT_CUR;
>  }
>
> +static bool is_el1_translation_fault(unsigned long esr)
> +{
> +       return (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_FAULT;

Should we also introduce ESR_ELx_FSC(esr) for this?

> +}
> +
>  static inline bool is_el1_permission_fault(unsigned long addr, unsigned long esr,
>                                            struct pt_regs *regs)
>  {
> @@ -385,7 +390,8 @@ static void __do_kernel_fault(unsigned long addr, unsigned long esr,
>         } else if (addr < PAGE_SIZE) {
>                 msg = "NULL pointer dereference";
>         } else {
> -               if (kfence_handle_page_fault(addr, esr & ESR_ELx_WNR, regs))
> +               if (is_el1_translation_fault(esr) &&
> +                   kfence_handle_page_fault(addr, esr & ESR_ELx_WNR, regs))
>                         return;
>
>                 msg = "paging request";
--
Alexander Potapenko
  

Patch

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 4b855af267b1..bfdd7484b93f 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -259,6 +259,7 @@ static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
 	if (user_size > size)
 		return ERR_PTR(-EMSGSIZE);
 
+	size = SKB_DATA_ALIGN(size);
 	data = kzalloc(size + headroom + tailroom, GFP_USER);
 	if (!data)
 		return ERR_PTR(-ENOMEM);