[v1,1/2] LoongArch: BPF: Fix sign-extension mov instructions

Message ID 20231207032951.16334-2-yangtiezhu@loongson.cn
State New
Headers
Series Fix failed test cases in test_bpf.ko for LoongArch |

Commit Message

Tiezhu Yang Dec. 7, 2023, 3:29 a.m. UTC
  We can see that "Short form of movsx, dst_reg = (s8,s16,s32)src_reg"
in include/linux/filter.h, additionally, for BPF_ALU64 the value of
the destination register is unchanged whereas for BPF_ALU the upper
32 bits of the destination register are zeroed, so it should clear
the upper 32 bits for BPF_ALU.

[root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
[root@linux fedora]# modprobe test_bpf

Before:
test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)

After:
test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS

By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
also be fixed with this patch.

Fixes: f48012f16150 ("LoongArch: BPF: Support sign-extension mov instructions")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
 arch/loongarch/net/bpf_jit.c | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

Huacai Chen Dec. 7, 2023, 4:12 a.m. UTC | #1
On Thu, Dec 7, 2023 at 11:30 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
>
> We can see that "Short form of movsx, dst_reg = (s8,s16,s32)src_reg"
> in include/linux/filter.h, additionally, for BPF_ALU64 the value of
> the destination register is unchanged whereas for BPF_ALU the upper
> 32 bits of the destination register are zeroed, so it should clear
> the upper 32 bits for BPF_ALU.
>
> [root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
> [root@linux fedora]# modprobe test_bpf
>
> Before:
> test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
>
> After:
> test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
> test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS
>
> By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
> also be fixed with this patch.
Hmmm, it is a little strange that privileged verifier_movsx has no problem.

Huacai

>
> Fixes: f48012f16150 ("LoongArch: BPF: Support sign-extension mov instructions")
> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> ---
>  arch/loongarch/net/bpf_jit.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 169ff8b3915e..8c907c2c42f7 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -480,10 +480,12 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext
>                 case 8:
>                         move_reg(ctx, t1, src);
>                         emit_insn(ctx, extwb, dst, t1);
> +                       emit_zext_32(ctx, dst, is32);
>                         break;
>                 case 16:
>                         move_reg(ctx, t1, src);
>                         emit_insn(ctx, extwh, dst, t1);
> +                       emit_zext_32(ctx, dst, is32);
>                         break;
>                 case 32:
>                         emit_insn(ctx, addw, dst, src, LOONGARCH_GPR_ZERO);
> --
> 2.42.0
>
>
  
Hengqi Chen Dec. 9, 2023, 6:03 a.m. UTC | #2
On Thu, Dec 7, 2023 at 12:12 PM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Thu, Dec 7, 2023 at 11:30 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> >
> > We can see that "Short form of movsx, dst_reg = (s8,s16,s32)src_reg"
> > in include/linux/filter.h, additionally, for BPF_ALU64 the value of
> > the destination register is unchanged whereas for BPF_ALU the upper
> > 32 bits of the destination register are zeroed, so it should clear
> > the upper 32 bits for BPF_ALU.
> >
> > [root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
> > [root@linux fedora]# modprobe test_bpf
> >
> > Before:
> > test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> > test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> >
> > After:
> > test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
> > test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS
> >
> > By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
> > also be fixed with this patch.
> Hmmm, it is a little strange that privileged verifier_movsx has no problem.
>

I have found the differences between priv and unpriv mode.
The BPF verifier performs different optimizations for priv and upriv progs.
See the following commits:

https://github.com/torvalds/linux/commit/e2ae4ca266a1c
https://github.com/torvalds/linux/commit/52875a04f4b26
https://github.com/torvalds/linux/commit/a1b14abc009d9

As a result, with unprivileged_bpf_disabled on, we have:

  # bpftool p d x i 55
  void mov32sx_s16_range_2():
  ; asm volatile ("                    \
     0: (b7) r1 = 65535
     1: (bc) w2 = w1
     2: (77) r2 >>= 1
     3: (b7) r0 = 1
     4: (95) exit

  # bpftool p d j i 55
  void mov32sx_s16_range_2():
  0xffff800002416238:
  ; asm volatile ("                    \
     0:    addi.d          $a6, $zero, 33(0x21)
     4:    addi.d          $sp, $sp, -64(0xfc0)
     8:    st.d            $ra, $sp, 56(0x38)
     c:    st.d            $fp, $sp, 48(0x30)
    10:    st.d            $s0, $sp, 40(0x28)
    14:    st.d            $s1, $sp, 32(0x20)
    18:    st.d            $s2, $sp, 24(0x18)
    1c:    st.d            $s3, $sp, 16(0x10)
    20:    st.d            $s4, $sp, 8(0x8)
    24:    st.d            $s5, $sp, 0
    28:    addi.d          $fp, $sp, 64(0x40)
    2c:    lu12i.w         $a0, 15(0xf)
    30:    ori             $a0, $a0, 0xfff
    34:    move            $t1, $a0
    38:    ext.w.h         $a1, $t1
    3c:    srli.d          $a1, $a1, 0x1
    40:    addi.w          $a5, $zero, 1(0x1)
    44:    ld.d            $ra, $sp, 56(0x38)
    48:    ld.d            $fp, $sp, 48(0x30)
    4c:    ld.d            $s0, $sp, 40(0x28)
    50:    ld.d            $s1, $sp, 32(0x20)
    54:    ld.d            $s2, $sp, 24(0x18)
    58:    ld.d            $s3, $sp, 16(0x10)
    5c:    ld.d            $s4, $sp, 8(0x8)
    60:    ld.d            $s5, $sp, 0
    64:    addi.d          $sp, $sp, 64(0x40)
    68:    move            $a0, $a5
    6c:    jirl            $zero, $ra, 0

With unprivileged_bpf_disabled off, we have:

  # bpftool p d x i 59
     0: (b7) r1 = 65535
     1: (bc) w2 = w1
     2: (77) r2 >>= 1
     3: (55) if r2 != 0x7fffffff goto pc+2
     4: (b7) r0 = 1
     5: (95) exit
     6: (05) goto pc-1
     7: (05) goto pc-1

  # bpftool p d j i 59
  0xffff8000024146a0:
     0:    addi.d          $a6, $zero, 33(0x21)
     4:    addi.d          $sp, $sp, -64(0xfc0)
     8:    st.d            $ra, $sp, 56(0x38)
     c:    st.d            $fp, $sp, 48(0x30)
    10:    st.d            $s0, $sp, 40(0x28)
    14:    st.d            $s1, $sp, 32(0x20)
    18:    st.d            $s2, $sp, 24(0x18)
    1c:    st.d            $s3, $sp, 16(0x10)
    20:    st.d            $s4, $sp, 8(0x8)
    24:    st.d            $s5, $sp, 0
    28:    addi.d          $fp, $sp, 64(0x40)
    2c:    lu12i.w         $a0, 15(0xf)
    30:    ori             $a0, $a0, 0xfff
    34:    move            $t1, $a0
    38:    ext.w.h         $a1, $t1
    3c:    srli.d          $a1, $a1, 0x1
    40:    lu12i.w         $t1, 524287(0x7ffff)
    44:    ori             $t1, $t1, 0xfff
    48:    move            $t2, $a1
    4c:    beq             $t2, $t1, 8(0x8)    # 0x0000000000000054
    50:    b               12(0xc)    # 0x000000000000005c
    54:    addi.w          $a5, $zero, 1(0x1)
    58:    b               12(0xc)    # 0x0000000000000064
    5c:    b               0    # 0x000000000000005c
    60:    b               0    # 0x0000000000000060
    64:    ld.d            $ra, $sp, 56(0x38)
    68:    ld.d            $fp, $sp, 48(0x30)
    6c:    ld.d            $s0, $sp, 40(0x28)
    70:    ld.d            $s1, $sp, 32(0x20)
    74:    ld.d            $s2, $sp, 24(0x18)
    78:    ld.d            $s3, $sp, 16(0x10)
    7c:    ld.d            $s4, $sp, 8(0x8)
    80:    ld.d            $s5, $sp, 0
    84:    addi.d          $sp, $sp, 64(0x40)
    88:    move            $a0, $a5
    8c:    jirl            $zero, $ra, 0

Without this fix, it seems like the prog is trapped in an infinite loop.

This patch looks good to me, so I am going to offer:

Acked-by: Hengqi Chen <hengqi.chen@gmail.com>

Cheers,
--
Hengqi

> Huacai
>
> >
> > Fixes: f48012f16150 ("LoongArch: BPF: Support sign-extension mov instructions")
> > Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> > ---
> >  arch/loongarch/net/bpf_jit.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index 169ff8b3915e..8c907c2c42f7 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -480,10 +480,12 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext
> >                 case 8:
> >                         move_reg(ctx, t1, src);
> >                         emit_insn(ctx, extwb, dst, t1);
> > +                       emit_zext_32(ctx, dst, is32);
> >                         break;
> >                 case 16:
> >                         move_reg(ctx, t1, src);
> >                         emit_insn(ctx, extwh, dst, t1);
> > +                       emit_zext_32(ctx, dst, is32);
> >                         break;
> >                 case 32:
> >                         emit_insn(ctx, addw, dst, src, LOONGARCH_GPR_ZERO);
> > --
> > 2.42.0
> >
> >
  
Huacai Chen Dec. 9, 2023, 12:01 p.m. UTC | #3
Applied, thanks.

Huacai

On Sat, Dec 9, 2023 at 2:04 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> On Thu, Dec 7, 2023 at 12:12 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Thu, Dec 7, 2023 at 11:30 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > >
> > > We can see that "Short form of movsx, dst_reg = (s8,s16,s32)src_reg"
> > > in include/linux/filter.h, additionally, for BPF_ALU64 the value of
> > > the destination register is unchanged whereas for BPF_ALU the upper
> > > 32 bits of the destination register are zeroed, so it should clear
> > > the upper 32 bits for BPF_ALU.
> > >
> > > [root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
> > > [root@linux fedora]# modprobe test_bpf
> > >
> > > Before:
> > > test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> > > test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> > >
> > > After:
> > > test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
> > > test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS
> > >
> > > By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
> > > also be fixed with this patch.
> > Hmmm, it is a little strange that privileged verifier_movsx has no problem.
> >
>
> I have found the differences between priv and unpriv mode.
> The BPF verifier performs different optimizations for priv and upriv progs.
> See the following commits:
>
> https://github.com/torvalds/linux/commit/e2ae4ca266a1c
> https://github.com/torvalds/linux/commit/52875a04f4b26
> https://github.com/torvalds/linux/commit/a1b14abc009d9
>
> As a result, with unprivileged_bpf_disabled on, we have:
>
>   # bpftool p d x i 55
>   void mov32sx_s16_range_2():
>   ; asm volatile ("                    \
>      0: (b7) r1 = 65535
>      1: (bc) w2 = w1
>      2: (77) r2 >>= 1
>      3: (b7) r0 = 1
>      4: (95) exit
>
>   # bpftool p d j i 55
>   void mov32sx_s16_range_2():
>   0xffff800002416238:
>   ; asm volatile ("                    \
>      0:    addi.d          $a6, $zero, 33(0x21)
>      4:    addi.d          $sp, $sp, -64(0xfc0)
>      8:    st.d            $ra, $sp, 56(0x38)
>      c:    st.d            $fp, $sp, 48(0x30)
>     10:    st.d            $s0, $sp, 40(0x28)
>     14:    st.d            $s1, $sp, 32(0x20)
>     18:    st.d            $s2, $sp, 24(0x18)
>     1c:    st.d            $s3, $sp, 16(0x10)
>     20:    st.d            $s4, $sp, 8(0x8)
>     24:    st.d            $s5, $sp, 0
>     28:    addi.d          $fp, $sp, 64(0x40)
>     2c:    lu12i.w         $a0, 15(0xf)
>     30:    ori             $a0, $a0, 0xfff
>     34:    move            $t1, $a0
>     38:    ext.w.h         $a1, $t1
>     3c:    srli.d          $a1, $a1, 0x1
>     40:    addi.w          $a5, $zero, 1(0x1)
>     44:    ld.d            $ra, $sp, 56(0x38)
>     48:    ld.d            $fp, $sp, 48(0x30)
>     4c:    ld.d            $s0, $sp, 40(0x28)
>     50:    ld.d            $s1, $sp, 32(0x20)
>     54:    ld.d            $s2, $sp, 24(0x18)
>     58:    ld.d            $s3, $sp, 16(0x10)
>     5c:    ld.d            $s4, $sp, 8(0x8)
>     60:    ld.d            $s5, $sp, 0
>     64:    addi.d          $sp, $sp, 64(0x40)
>     68:    move            $a0, $a5
>     6c:    jirl            $zero, $ra, 0
>
> With unprivileged_bpf_disabled off, we have:
>
>   # bpftool p d x i 59
>      0: (b7) r1 = 65535
>      1: (bc) w2 = w1
>      2: (77) r2 >>= 1
>      3: (55) if r2 != 0x7fffffff goto pc+2
>      4: (b7) r0 = 1
>      5: (95) exit
>      6: (05) goto pc-1
>      7: (05) goto pc-1
>
>   # bpftool p d j i 59
>   0xffff8000024146a0:
>      0:    addi.d          $a6, $zero, 33(0x21)
>      4:    addi.d          $sp, $sp, -64(0xfc0)
>      8:    st.d            $ra, $sp, 56(0x38)
>      c:    st.d            $fp, $sp, 48(0x30)
>     10:    st.d            $s0, $sp, 40(0x28)
>     14:    st.d            $s1, $sp, 32(0x20)
>     18:    st.d            $s2, $sp, 24(0x18)
>     1c:    st.d            $s3, $sp, 16(0x10)
>     20:    st.d            $s4, $sp, 8(0x8)
>     24:    st.d            $s5, $sp, 0
>     28:    addi.d          $fp, $sp, 64(0x40)
>     2c:    lu12i.w         $a0, 15(0xf)
>     30:    ori             $a0, $a0, 0xfff
>     34:    move            $t1, $a0
>     38:    ext.w.h         $a1, $t1
>     3c:    srli.d          $a1, $a1, 0x1
>     40:    lu12i.w         $t1, 524287(0x7ffff)
>     44:    ori             $t1, $t1, 0xfff
>     48:    move            $t2, $a1
>     4c:    beq             $t2, $t1, 8(0x8)    # 0x0000000000000054
>     50:    b               12(0xc)    # 0x000000000000005c
>     54:    addi.w          $a5, $zero, 1(0x1)
>     58:    b               12(0xc)    # 0x0000000000000064
>     5c:    b               0    # 0x000000000000005c
>     60:    b               0    # 0x0000000000000060
>     64:    ld.d            $ra, $sp, 56(0x38)
>     68:    ld.d            $fp, $sp, 48(0x30)
>     6c:    ld.d            $s0, $sp, 40(0x28)
>     70:    ld.d            $s1, $sp, 32(0x20)
>     74:    ld.d            $s2, $sp, 24(0x18)
>     78:    ld.d            $s3, $sp, 16(0x10)
>     7c:    ld.d            $s4, $sp, 8(0x8)
>     80:    ld.d            $s5, $sp, 0
>     84:    addi.d          $sp, $sp, 64(0x40)
>     88:    move            $a0, $a5
>     8c:    jirl            $zero, $ra, 0
>
> Without this fix, it seems like the prog is trapped in an infinite loop.
>
> This patch looks good to me, so I am going to offer:
>
> Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
>
> Cheers,
> --
> Hengqi
>
> > Huacai
> >
> > >
> > > Fixes: f48012f16150 ("LoongArch: BPF: Support sign-extension mov instructions")
> > > Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> > > ---
> > >  arch/loongarch/net/bpf_jit.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > index 169ff8b3915e..8c907c2c42f7 100644
> > > --- a/arch/loongarch/net/bpf_jit.c
> > > +++ b/arch/loongarch/net/bpf_jit.c
> > > @@ -480,10 +480,12 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext
> > >                 case 8:
> > >                         move_reg(ctx, t1, src);
> > >                         emit_insn(ctx, extwb, dst, t1);
> > > +                       emit_zext_32(ctx, dst, is32);
> > >                         break;
> > >                 case 16:
> > >                         move_reg(ctx, t1, src);
> > >                         emit_insn(ctx, extwh, dst, t1);
> > > +                       emit_zext_32(ctx, dst, is32);
> > >                         break;
> > >                 case 32:
> > >                         emit_insn(ctx, addw, dst, src, LOONGARCH_GPR_ZERO);
> > > --
> > > 2.42.0
> > >
> > >
>
  

Patch

diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 169ff8b3915e..8c907c2c42f7 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -480,10 +480,12 @@  static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext
 		case 8:
 			move_reg(ctx, t1, src);
 			emit_insn(ctx, extwb, dst, t1);
+			emit_zext_32(ctx, dst, is32);
 			break;
 		case 16:
 			move_reg(ctx, t1, src);
 			emit_insn(ctx, extwh, dst, t1);
+			emit_zext_32(ctx, dst, is32);
 			break;
 		case 32:
 			emit_insn(ctx, addw, dst, src, LOONGARCH_GPR_ZERO);