[bpf-next,v3,1/2] bpf: make the verifier tracks the "not equal" for regs

Message ID 20231214062434.3565630-2-menglong8.dong@gmail.com
State New
Headers
Series bpf: support to track BPF_JNE |

Commit Message

Menglong Dong Dec. 14, 2023, 6:24 a.m. UTC
  We can derive some new information for BPF_JNE in regs_refine_cond_op().
Take following code for example:

  /* The type of "a" is u16 */
  if (a > 0 && a < 100) {
    /* the range of the register for a is [0, 99], not [1, 99],
     * and will cause the following error:
     *
     *   invalid zero-sized read
     *
     * as a can be 0.
     */
    bpf_skb_store_bytes(skb, xx, xx, a, 0);
  }

In the code above, "a > 0" will be compiled to "jmp xxx if a == 0". In the
TRUE branch, the dst_reg will be marked as known to 0. However, in the
fallthrough(FALSE) branch, the dst_reg will not be handled, which makes
the [min, max] for a is [0, 99], not [1, 99].

For BPF_JNE, we can reduce the range of the dst reg if the src reg is a
const and is exactly the edge of the dst reg.

Signed-off-by: Menglong Dong <menglong8.dong@gmail.com>
---
v2:
- fix a typo in the subject
- add some comments, as Eduard advised
---
 kernel/bpf/verifier.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)
  

Comments

Shung-Hsi Yu Dec. 14, 2023, 10 a.m. UTC | #1
On Thu, Dec 14, 2023 at 02:24:33PM +0800, Menglong Dong wrote:
> We can derive some new information for BPF_JNE in regs_refine_cond_op().
> Take following code for example:
> 
>   /* The type of "a" is u16 */
>   if (a > 0 && a < 100) {
>     /* the range of the register for a is [0, 99], not [1, 99],
>      * and will cause the following error:
>      *
>      *   invalid zero-sized read
>      *
>      * as a can be 0.
>      */
>     bpf_skb_store_bytes(skb, xx, xx, a, 0);
>   }
> 
> In the code above, "a > 0" will be compiled to "jmp xxx if a == 0". In the
> TRUE branch, the dst_reg will be marked as known to 0. However, in the
> fallthrough(FALSE) branch, the dst_reg will not be handled, which makes
> the [min, max] for a is [0, 99], not [1, 99].
> 
> For BPF_JNE, we can reduce the range of the dst reg if the src reg is a
> const and is exactly the edge of the dst reg.
> 
> Signed-off-by: Menglong Dong <menglong8.dong@gmail.com>

Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
  
Alexei Starovoitov Dec. 14, 2023, 1:49 p.m. UTC | #2
On Wed, Dec 13, 2023 at 10:28 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> We can derive some new information for BPF_JNE in regs_refine_cond_op().
> Take following code for example:
>
>   /* The type of "a" is u16 */
>   if (a > 0 && a < 100) {
>     /* the range of the register for a is [0, 99], not [1, 99],
>      * and will cause the following error:
>      *
>      *   invalid zero-sized read
>      *
>      * as a can be 0.
>      */
>     bpf_skb_store_bytes(skb, xx, xx, a, 0);
>   }

Please craft a selftest from above with inline asm
(C might not work as compiler might optimize it)

Also we call:
        /* fallthrough (FALSE) branch */
        regs_refine_cond_op(false_reg1, false_reg2,
rev_opcode(opcode), is_jmp32);
        /* jump (TRUE) branch */
        regs_refine_cond_op(true_reg1, true_reg2, opcode, is_jmp32);

so despite BPF_JNE is not handled explicitly it still should have
caught above due to rev_opcode() ?
  
Menglong Dong Dec. 14, 2023, 2:07 p.m. UTC | #3
Hello,

On Thu, Dec 14, 2023 at 9:49 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Dec 13, 2023 at 10:28 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > We can derive some new information for BPF_JNE in regs_refine_cond_op().
> > Take following code for example:
> >
> >   /* The type of "a" is u16 */
> >   if (a > 0 && a < 100) {
> >     /* the range of the register for a is [0, 99], not [1, 99],
> >      * and will cause the following error:
> >      *
> >      *   invalid zero-sized read
> >      *
> >      * as a can be 0.
> >      */
> >     bpf_skb_store_bytes(skb, xx, xx, a, 0);
> >   }
>
> Please craft a selftest from above with inline asm
> (C might not work as compiler might optimize it)

Okay! Should I add this selftests to reg_bounds as a subtest,
or add a "verifier_reg_edge.c" for verifier testing?

> Also we call:
>         /* fallthrough (FALSE) branch */
>         regs_refine_cond_op(false_reg1, false_reg2,
> rev_opcode(opcode), is_jmp32);
>         /* jump (TRUE) branch */
>         regs_refine_cond_op(true_reg1, true_reg2, opcode, is_jmp32);
>
> so despite BPF_JNE is not handled explicitly it still should have
> caught above due to rev_opcode() ?

Ennn.....I'm a little confused. In this case, the TRUE path is
handled properly, as the opcode is BPF_JEQ; and the FALSE
is not handled properly, as the opcode is rev_opcode(BPF_JEQ),
which is BPF_JNE. And the bpf_skb_store_bytes() will be called
in the FALSE path. The origin state of false_reg* should be the same
as true_reg*.
  
Alexei Starovoitov Dec. 14, 2023, 4:10 p.m. UTC | #4
On Thu, Dec 14, 2023 at 6:07 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> Hello,
>
> On Thu, Dec 14, 2023 at 9:49 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Dec 13, 2023 at 10:28 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > >
> > > We can derive some new information for BPF_JNE in regs_refine_cond_op().
> > > Take following code for example:
> > >
> > >   /* The type of "a" is u16 */
> > >   if (a > 0 && a < 100) {
> > >     /* the range of the register for a is [0, 99], not [1, 99],
> > >      * and will cause the following error:
> > >      *
> > >      *   invalid zero-sized read
> > >      *
> > >      * as a can be 0.
> > >      */
> > >     bpf_skb_store_bytes(skb, xx, xx, a, 0);
> > >   }
> >
> > Please craft a selftest from above with inline asm
> > (C might not work as compiler might optimize it)
>
> Okay! Should I add this selftests to reg_bounds as a subtest,
> or add a "verifier_reg_edge.c" for verifier testing?

reg_bounds is for automated.
I think it will fit fine in the existing progs/verifier_bounds.c

>
> > Also we call:
> >         /* fallthrough (FALSE) branch */
> >         regs_refine_cond_op(false_reg1, false_reg2,
> > rev_opcode(opcode), is_jmp32);
> >         /* jump (TRUE) branch */
> >         regs_refine_cond_op(true_reg1, true_reg2, opcode, is_jmp32);
> >
> > so despite BPF_JNE is not handled explicitly it still should have
> > caught above due to rev_opcode() ?
>
> Ennn.....I'm a little confused. In this case, the TRUE path is
> handled properly, as the opcode is BPF_JEQ; and the FALSE
> is not handled properly, as the opcode is rev_opcode(BPF_JEQ),
> which is BPF_JNE. And the bpf_skb_store_bytes() will be called
> in the FALSE path. The origin state of false_reg* should be the same
> as true_reg*.

Ahh. I see.
It wasn't clear how 'a > 0 && a < 100' got to be JNE after optimizations.
  
Andrii Nakryiko Dec. 14, 2023, 11:19 p.m. UTC | #5
On Wed, Dec 13, 2023 at 10:28 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> We can derive some new information for BPF_JNE in regs_refine_cond_op().
> Take following code for example:
>
>   /* The type of "a" is u16 */
>   if (a > 0 && a < 100) {
>     /* the range of the register for a is [0, 99], not [1, 99],
>      * and will cause the following error:
>      *
>      *   invalid zero-sized read
>      *
>      * as a can be 0.
>      */
>     bpf_skb_store_bytes(skb, xx, xx, a, 0);
>   }
>
> In the code above, "a > 0" will be compiled to "jmp xxx if a == 0". In the
> TRUE branch, the dst_reg will be marked as known to 0. However, in the
> fallthrough(FALSE) branch, the dst_reg will not be handled, which makes
> the [min, max] for a is [0, 99], not [1, 99].
>
> For BPF_JNE, we can reduce the range of the dst reg if the src reg is a
> const and is exactly the edge of the dst reg.
>
> Signed-off-by: Menglong Dong <menglong8.dong@gmail.com>
> ---
> v2:
> - fix a typo in the subject
> - add some comments, as Eduard advised
> ---
>  kernel/bpf/verifier.c | 38 +++++++++++++++++++++++++++++++++++++-
>  1 file changed, 37 insertions(+), 1 deletion(-)
>

The logic looks good

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 727a59e4a647..9b1932e51823 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -14332,7 +14332,43 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state
>                 }
>                 break;
>         case BPF_JNE:
> -               /* we don't derive any new information for inequality yet */
> +               if (!is_reg_const(reg2, is_jmp32))
> +                       swap(reg1, reg2);
> +               if (!is_reg_const(reg2, is_jmp32))
> +                       break;
> +
> +               /* try to recompute the bound of reg1 if reg2 is a const and
> +                * is exactly the edge of reg1.
> +                */
> +               val = reg_const_value(reg2, is_jmp32);
> +               if (is_jmp32) {
> +                       /* u32_min_value is not equal to 0xffffffff at this point,
> +                        * because otherwise u32_max_value is 0xffffffff as well,
> +                        * in such a case both reg1 and reg2 would be constants,
> +                        * jump would be predicted and reg_set_min_max() won't
> +                        * be called.
> +                        *
> +                        * Same reasoning works for all {u,s}{min,max}{32,64} cases
> +                        * below.
> +                        */
> +                       if (reg1->u32_min_value == (u32)val)
> +                               reg1->u32_min_value++;
> +                       if (reg1->u32_max_value == (u32)val)
> +                               reg1->u32_max_value--;
> +                       if (reg1->s32_min_value == (s32)val)
> +                               reg1->s32_min_value++;
> +                       if (reg1->s32_max_value == (s32)val)
> +                               reg1->s32_max_value--;
> +               } else {
> +                       if (reg1->umin_value == (u64)val)
> +                               reg1->umin_value++;
> +                       if (reg1->umax_value == (u64)val)
> +                               reg1->umax_value--;
> +                       if (reg1->smin_value == (s64)val)
> +                               reg1->smin_value++;
> +                       if (reg1->smax_value == (s64)val)
> +                               reg1->smax_value--;
> +               }
>                 break;
>         case BPF_JSET:
>                 if (!is_reg_const(reg2, is_jmp32))
> --
> 2.39.2
>
  

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 727a59e4a647..9b1932e51823 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14332,7 +14332,43 @@  static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state
 		}
 		break;
 	case BPF_JNE:
-		/* we don't derive any new information for inequality yet */
+		if (!is_reg_const(reg2, is_jmp32))
+			swap(reg1, reg2);
+		if (!is_reg_const(reg2, is_jmp32))
+			break;
+
+		/* try to recompute the bound of reg1 if reg2 is a const and
+		 * is exactly the edge of reg1.
+		 */
+		val = reg_const_value(reg2, is_jmp32);
+		if (is_jmp32) {
+			/* u32_min_value is not equal to 0xffffffff at this point,
+			 * because otherwise u32_max_value is 0xffffffff as well,
+			 * in such a case both reg1 and reg2 would be constants,
+			 * jump would be predicted and reg_set_min_max() won't
+			 * be called.
+			 *
+			 * Same reasoning works for all {u,s}{min,max}{32,64} cases
+			 * below.
+			 */
+			if (reg1->u32_min_value == (u32)val)
+				reg1->u32_min_value++;
+			if (reg1->u32_max_value == (u32)val)
+				reg1->u32_max_value--;
+			if (reg1->s32_min_value == (s32)val)
+				reg1->s32_min_value++;
+			if (reg1->s32_max_value == (s32)val)
+				reg1->s32_max_value--;
+		} else {
+			if (reg1->umin_value == (u64)val)
+				reg1->umin_value++;
+			if (reg1->umax_value == (u64)val)
+				reg1->umax_value--;
+			if (reg1->smin_value == (s64)val)
+				reg1->smin_value++;
+			if (reg1->smax_value == (s64)val)
+				reg1->smax_value--;
+		}
 		break;
 	case BPF_JSET:
 		if (!is_reg_const(reg2, is_jmp32))