[bpf-next,v3,0/3] bpf: Add LDX/STX/ST sanitize in jited BPF progs

Message ID 20221125122912.54709-1-sunhao.th@gmail.com

Message

Hao Sun Nov. 25, 2022, 12:29 p.m. UTC
  The verifier sometimes makes mistakes[1][2] that may be exploited to
achieve arbitrary read/write. Currently, syzbot is continuously testing
bpf and can find memory issues in bpf syscalls, but it can hardly find
incorrect checks or bugs in the verifier. We need runtime checks like
KASAN in BPF programs for this. This patch series implements address
sanitizing in jited BPF progs for testing purposes, so that tools like
syzbot can find interesting bugs in the verifier automatically by, if
possible, generating and executing BPF programs that bypass the verifier
but have memory issues, which then trigger the sanitizer.

The idea is to dispatch the read/write addr of a BPF program to kernel
functions that are instrumented by KASAN, to achieve indirect checking.
Indirect checking is adopted because it is much simpler; instrumenting
direct checks the way compilers do would make the jit much more complex.
The main steps are: back up all the scratch regs to the extended BPF prog
stack, store the addr in R1, and then insert a call to the checking
function before load or store insns, during bpf_misc_fixup(). The stack
size of BPF progs is extended by 64 bytes in this mode, to back up R1~R5
so that the checking funcs won't corrupt register state. An extra Kconfig
option is used to enable this, so the normal use case won't be impacted
at all.
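
The rewrite described above can be sketched in plain userspace C. This is
only an illustration of the instruction sequence, not the code from the
patches: struct insn, the OP_* names, CHECK_FUNC_ID and emit_sanitized()
are all hypothetical, the backup slots are assumed to sit directly below
the program's own stack, and the access size is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified instruction model for illustration; NOT struct bpf_insn. */
enum op { OP_LDX, OP_STX, OP_ST, OP_MOV_REG, OP_ADD_IMM, OP_CALL };

struct insn {
	enum op op;
	int dst;	/* destination reg; base reg for OP_STX/OP_ST */
	int src;	/* source reg; base reg for OP_LDX */
	int off;	/* memory offset */
	int imm;	/* immediate, or callee id for OP_CALL */
};

#define REG_FP			10	/* frame pointer, R10 */
#define CHECK_FUNC_ID		1	/* hypothetical id of the checking func */
#define SANITIZE_SEQ_LEN	14	/* 5 saves + 2 addr + 1 call + 5 restores + 1 orig */

/*
 * Expand one load/store into: save R1..R5, R1 = base + off, call the
 * checker, restore R1..R5, original insn. 'out' must have room for
 * SANITIZE_SEQ_LEN entries. Returns the number of insns written.
 */
size_t emit_sanitized(struct insn orig, int stack_depth, struct insn *out)
{
	size_t n = 0;
	int base, r;

	assert(orig.op == OP_LDX || orig.op == OP_STX || orig.op == OP_ST);
	base = (orig.op == OP_LDX) ? orig.src : orig.dst;

	/* 1. Back up R1..R5 just below the program's own stack (assumed layout). */
	for (r = 1; r <= 5; r++)
		out[n++] = (struct insn){ .op = OP_STX, .dst = REG_FP, .src = r,
					  .off = -(stack_depth + 8 * r) };

	/* 2. R1 = base + off, the address about to be accessed. */
	out[n++] = (struct insn){ .op = OP_MOV_REG, .dst = 1, .src = base };
	out[n++] = (struct insn){ .op = OP_ADD_IMM, .dst = 1, .imm = orig.off };

	/* 3. Call the checking function with the address in R1. */
	out[n++] = (struct insn){ .op = OP_CALL, .imm = CHECK_FUNC_ID };

	/* 4. Restore R1..R5 so the program sees unchanged register state. */
	for (r = 1; r <= 5; r++)
		out[n++] = (struct insn){ .op = OP_LDX, .dst = r, .src = REG_FP,
					  .off = -(stack_depth + 8 * r) };

	/* 5. Finally, the original load or store. */
	out[n++] = orig;
	return n;
}
```

In this sketch a single load such as r0 = *(r2 + 16) becomes 14 insns, and
the five backup slots use 40 of the 64 extra stack bytes. How R0 is
treated across the call is not described in this cover letter, so the
sketch leaves it out.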

Also, not all ldx/stx/st insns are instrumented. Insns rewritten by other
fixup or conversion passes that use BPF_REG_AX are skipped, because they
would conflict with this instrumentation; insns whose access addr is
specified by R10 are also skipped because they are trivial to verify.
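
The skip rules above amount to a small predicate. The following is a
hedged sketch, not the check used in the patches: struct mem_insn,
base_reg() and should_sanitize() are hypothetical names, and REG_AX is
given the value 11 here purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define REG_FP	10	/* R10, the frame pointer */
#define REG_AX	11	/* stand-in for BPF_REG_AX; value assumed for this sketch */

/* Simplified model of a memory-access insn; NOT struct bpf_insn. */
enum mem_op { MEM_NONE, MEM_LDX, MEM_STX, MEM_ST };

struct mem_insn {
	enum mem_op op;
	int dst;	/* destination reg; base reg for MEM_STX/MEM_ST */
	int src;	/* source reg; base reg for MEM_LDX */
};

/* Register that holds the base address of the access. */
int base_reg(const struct mem_insn *insn)
{
	return insn->op == MEM_LDX ? insn->src : insn->dst;
}

/* Return true if the insn should get a runtime address check. */
bool should_sanitize(const struct mem_insn *insn)
{
	if (insn->op == MEM_NONE)
		return false;	/* not a load/store at all */
	if (insn->dst == REG_AX || insn->src == REG_AX)
		return false;	/* emitted by another rewrite pass */
	if (base_reg(insn) == REG_FP)
		return false;	/* stack access via R10, trivial to verify */
	return true;
}
```

Under these assumptions a load through R2 is instrumented, while a store
to R10 - 8 or any insn touching the auxiliary register is left alone.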

Patch1 sanitizes st/stx insns, Patch2 sanitizes ldx insns, and Patch3
adds selftests for instrumentation in each possible case; all new and
existing selftests for the verifier pass. Also, a BPF prog that exploits
CVE-2022-23222 to achieve OOB read is provided[3]; it is caught
reliably with this patch series applied.

[1] http://bit.do/CVE-2021-3490
[2] http://bit.do/CVE-2022-23222
[3] OOB-read: https://pastebin.com/raw/Ee1Cw492

v1 -> v2:
        Remove changes to JIT completely, back up regs to extended stack.
v2 -> v3:
        Fix missing-prototypes warning reported by the kernel test robot.
        Simplify regs backup and rewrite the corresponding selftests.

Hao Sun (3):
  bpf: Sanitize STX/ST in jited BPF progs with KASAN
  bpf: Sanitize LDX in jited BPF progs with KASAN
  selftests/bpf: Add tests for LDX/STX/ST sanitize

 kernel/bpf/Kconfig                            |  13 +
 kernel/bpf/verifier.c                         | 173 ++++++++++
 .../selftests/bpf/verifier/sanitize_st_ldx.c  | 317 ++++++++++++++++++
 3 files changed, 503 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/verifier/sanitize_st_ldx.c


base-commit: 2b3e8f6f5b939ceeb2e097339bf78ebaaf11dfe9
  

Comments

Alexei Starovoitov Nov. 28, 2022, 12:38 a.m. UTC | #1
On Fri, Nov 25, 2022 at 08:29:09PM +0800, Hao Sun wrote:
> The verifier sometimes makes mistakes[1][2] that may be exploited to
> achieve arbitrary read/write. Currently, syzbot is continuously testing
> bpf, and can find memory issues in bpf syscalls, but it can hardly find
> mischecking/bugs in the verifier. We need runtime checks like KASAN in
> BPF programs for this. This patch series implements address sanitize
> in jited BPF progs for testing purpose, so that tools like syzbot can
> find interesting bugs in the verifier automatically by, if possible,
> generating and executing BPF programs that bypass the verifier but have
> memory issues, then triggering this sanitizing.

The above paragraph makes it sound as if it's currently impossible to
use kasan with BPF, which is a confusing and incorrect statement.
kasan adds all the necessary instrumentation to BPF interpreter already
and syzbot can perform bug discovery.
syzbot runner should just disable JIT and run all progs via interpreter.
Adding all this logic to run JITed progs in kasan kernel is
just unnecessary complexity.
  
Hao Sun Nov. 28, 2022, 1:41 a.m. UTC | #2
On Mon, Nov 28, 2022 at 08:38, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Nov 25, 2022 at 08:29:09PM +0800, Hao Sun wrote:
> > The verifier sometimes makes mistakes[1][2] that may be exploited to
> > achieve arbitrary read/write. Currently, syzbot is continuously testing
> > bpf, and can find memory issues in bpf syscalls, but it can hardly find
> > mischecking/bugs in the verifier. We need runtime checks like KASAN in
> > BPF programs for this. This patch series implements address sanitize
> > in jited BPF progs for testing purpose, so that tools like syzbot can
> > find interesting bugs in the verifier automatically by, if possible,
> > generating and executing BPF programs that bypass the verifier but have
> > memory issues, then triggering this sanitizing.
>
> The above paragraph makes it sound that it's currently impossible to
> use kasan with BPF. Which is confusing and incorrect statement.
> kasan adds all the necessary instrumentation to BPF interpreter already
> and syzbot can perform bug discovery.
> syzbot runner should just disable JIT and run all progs via interpreter.
> Adding all this logic to run JITed progs in kasan kernel is
> just unnecessary complexity.

Sorry for the confusion, I meant that JITed BPF progs can't use KASAN
currently; maybe it should be called BPF_JITED_PROG_KASAN.

It's actually useful because JIT is used in most real cases for
testing/fuzzing; syzbot uses WITH_JIT_ALWAYS_ON[1][2]. Such tools may
need to run each generated BPF prog hundreds of times to find
interesting bugs in the verifier, and JIT makes that much faster. Also,
bugs in JIT can be missed if it's disabled.

[1] http://bit.do/syzbot-bpf-config
[2] http://bit.do/syzbot-bpf-next-config
  
Alexei Starovoitov Nov. 28, 2022, 2:12 a.m. UTC | #3
On Sun, Nov 27, 2022 at 5:41 PM Hao Sun <sunhao.th@gmail.com> wrote:
>
> On Mon, Nov 28, 2022 at 08:38, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Nov 25, 2022 at 08:29:09PM +0800, Hao Sun wrote:
> > > The verifier sometimes makes mistakes[1][2] that may be exploited to
> > > achieve arbitrary read/write. Currently, syzbot is continuously testing
> > > bpf, and can find memory issues in bpf syscalls, but it can hardly find
> > > mischecking/bugs in the verifier. We need runtime checks like KASAN in
> > > BPF programs for this. This patch series implements address sanitize
> > > in jited BPF progs for testing purpose, so that tools like syzbot can
> > > find interesting bugs in the verifier automatically by, if possible,
> > > generating and executing BPF programs that bypass the verifier but have
> > > memory issues, then triggering this sanitizing.
> >
> > The above paragraph makes it sound that it's currently impossible to
> > use kasan with BPF. Which is confusing and incorrect statement.
> > kasan adds all the necessary instrumentation to BPF interpreter already
> > and syzbot can perform bug discovery.
> > syzbot runner should just disable JIT and run all progs via interpreter.
> > Adding all this logic to run JITed progs in kasan kernel is
> > just unnecessary complexity.
>
> Sorry for the confusion, I mean JITed BPF prog can't use KASAN currently,
> maybe it should be called BPF_JITED_PROG_KASAN.
>
> It's actually useful because JIT is used in most real cases for testing/fuzzing,
> syzbot uses WITH_JIT_ALWAYS_ON[1][2].

Just turn it off in syzbot. jit_always_on is a security feature
because of speculative execution bugs that can exploit
any in-kernel interpreter (not only bpf interpreter).

> For those tools, they may need
> to run hundred times for each generated BPF prog to find interesting bugs in
> the verifier, JIT makes it much faster.

Unlikely. With all the overhead of saving a bunch of regs,
restoring them and calling functions instead of direct load/store
such JITed code is probably running at the same speed as
interpreter.
Also syzbot generated progs are tiny.
Your oob reproducer is tiny too.
The speed of execution doesn't matter in such cases.

> Also, bugs in JIT can be
> missed if they're
> disabled.

Disagree. Replacing direct load/store with calls
doesn't improve JIT test coverage.

Also think long term. Beyond kasan there are various *sans
that instrument code differently. load/store may not be
the only insns that should be instrumented.
So hacking JITs either directly or via verifier isn't going
to scale.
  
Hao Sun Nov. 28, 2022, 2:58 a.m. UTC | #4
On Mon, Nov 28, 2022 at 10:12, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Sun, Nov 27, 2022 at 5:41 PM Hao Sun <sunhao.th@gmail.com> wrote:
> >
> > On Mon, Nov 28, 2022 at 08:38, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Fri, Nov 25, 2022 at 08:29:09PM +0800, Hao Sun wrote:
> > > > The verifier sometimes makes mistakes[1][2] that may be exploited to
> > > > achieve arbitrary read/write. Currently, syzbot is continuously testing
> > > > bpf, and can find memory issues in bpf syscalls, but it can hardly find
> > > > mischecking/bugs in the verifier. We need runtime checks like KASAN in
> > > > BPF programs for this. This patch series implements address sanitize
> > > > in jited BPF progs for testing purpose, so that tools like syzbot can
> > > > find interesting bugs in the verifier automatically by, if possible,
> > > > generating and executing BPF programs that bypass the verifier but have
> > > > memory issues, then triggering this sanitizing.
> > >
> > > The above paragraph makes it sound that it's currently impossible to
> > > use kasan with BPF. Which is confusing and incorrect statement.
> > > kasan adds all the necessary instrumentation to BPF interpreter already
> > > and syzbot can perform bug discovery.
> > > syzbot runner should just disable JIT and run all progs via interpreter.
> > > Adding all this logic to run JITed progs in kasan kernel is
> > > just unnecessary complexity.
> >
> > Sorry for the confusion, I mean JITed BPF prog can't use KASAN currently,
> > maybe it should be called BPF_JITED_PROG_KASAN.
> >
> > It's actually useful because JIT is used in most real cases for testing/fuzzing,
> > syzbot uses WITH_JIT_ALWAYS_ON[1][2].
>
> Just turn it off in syzbot. jit_always_on is a security feature
> because of speculative execution bugs that can exploit
> any in-kernel interpreter (not only bpf interpreter).
>

Will do that, thanks for the information.

> > For those tools, they may need
> > to run hundred times for each generated BPF prog to find interesting bugs in
> > the verifier, JIT makes it much faster.
>
> Unlikely. With all the overhead of saving a bunch of regs,
> restoring them and calling functions instead of direct load/store
> such JITed code is probably running at the same speed as
> interpreter.
> Also syzbot generated progs are tiny.
> Your oob reproducer is tiny too.
> The speed of execution doesn't matter in such cases.
>

Hard to tell which one is faster, since executing each insn in the
interpreter requires a jmp.
But you're right, I did not think about this. I guess randomly generated
progs that can pass the verifier are normally tiny, so speed indeed
may not be an issue here.

> > Also, bugs in JIT can be
> > missed if they're
> > disabled.
>
> Disagree. Replacing direct load/store with calls
> doesn't improve JIT test coverage.
>
> Also think long term. Beyond kasan there are various *sans
> that instrument code differently. load/store may not be
> the only insns that should be instrumented.
> So hacking JITs either directly or via verifier isn't going
> to scale.

Right, just letting those *sans instrument the interpreter is more scalable.

Thanks
Hao