[v3,0/3] Fix crashes and warning in ext4 unit test

Message ID 20240302181755.9192-1-shikemeng@huaweicloud.com

Message

Kemeng Shi March 2, 2024, 6:17 p.m. UTC
  v2->v3:
-fix warning that sb->s_umount is still held when unit test finishes
-fix warning that sbi->s_freeclusters_counter is used before
initialization.

v1->v2:
-properly handle error from sget()

Previously, the new mballoc unit tests were only tested by running
"./tools/testing/kunit/kunit.py run ...", in which case only a few configs
are enabled.
This series fixes issues when more debug configs are enabled. Fixes are
tested with and without kunit_tool [1].

[1] https://docs.kernel.org/dev-tools/kunit/run_manual.html

Kemeng Shi (3):
  ext4: alloc test super block from sget
  ext4: hold group lock in ext4 kunit test
  ext4: initialize sbi->s_freeclusters_counter before use in kunit test

 fs/ext4/mballoc-test.c | 77 ++++++++++++++++++++++++++++++++----------
 1 file changed, 60 insertions(+), 17 deletions(-)
  

Comments

Guenter Roeck March 3, 2024, 3:33 p.m. UTC | #1
On 3/2/24 10:17, Kemeng Shi wrote:
> v2->v3:
> -fix warning that sb->s_umount is still held when unit test finishes
> -fix warning that sbi->s_freeclusters_counter is used before
> initialization.
> 
> v1->v2:
> -properly handle error from sget()
> 
> Previously, the new mballoc unit tests are only tested by running
> "./tools/testing/kunit/kunit.py run ..." in which case only rare configs
> are enabled.
> This series fixes issues when more debug configs are enabled. Fixes are
> tested with and without kunit_tool [1].
> 
> [1] https://docs.kernel.org/dev-tools/kunit/run_manual.html
> 
> Kemeng Shi (3):
>    ext4: alloc test super block from sget
>    ext4: hold group lock in ext4 kunit test
>    ext4: initialize sbi->s_freeclusters_counter before use in kunit test
> 
>   fs/ext4/mballoc-test.c | 77 ++++++++++++++++++++++++++++++++----------
>   1 file changed, 60 insertions(+), 17 deletions(-)
> 

I still see crashes with this version. Some examples below.

Guenter

---
mips:

         KTAP version 1
         # Subtest: test_mark_diskspace_used
CPU 0 Unable to handle kernel paging request at virtual address 00780000, epc == 807a4b28, ra == 807a4c20
Oops[#1]:
CPU: 0 PID: 112 Comm: kunit_try_catch Tainted: G                 N 6.8.0-rc6-next-20240301-11397-g2cd922b7255b #1
Hardware name: mti,malta
$ 0   : 00000000 00000001 00780000 00000000
$ 4   : 811c6de0 00000000 00000001 00000000
$ 8   : 00000005 00000000 82f0cc00 00000001
$12   : ffffffff 00000002 00000000 fff80000
$16   : 82d99558 811c6de0 00000000 00000020
$20   : 811c0000 00000000 00000001 00000000
$24   : 00000000 80415884
$28   : 82f1c000 82f1f9f8 00000000 807a4c20
Hi    : 00000000
Lo    : 00002128
epc   : 807a4b28 percpu_counter_add_batch+0x7c/0x224
ra    : 807a4c20 percpu_counter_add_batch+0x174/0x224
Status: 1000a402	KERNEL EXL
Cause : 00800008 (ExcCode 02)
BadVA : 00780000
PrId  : 00019300 (MIPS 24Kc)
Modules linked in:
Process kunit_try_catch (pid: 112, threadinfo=82f1c000, task=82c2cec0, tls=00000000)
Stack : 82d99400 b332f3f3 00000000 b332f3f3 82f1fb20 811c0000 00000000 00000000
         811c0000 82d99400 82d63800 00000008 00000000 80414ac4 00000000 801a4aa4
         00000001 00000000 00000020 00000001 00000000 82f1fa70 00000000 00000023
         00000080 82f1fb20 82f1fb20 00000380 82c2cec0 00000000 00000000 b332f3f3
         82f1fae0 00000025 82d63800 821bbc08 00000001 00000001 821f12a8 82f0cc00
         ...
Call Trace:
[<807a4b28>] percpu_counter_add_batch+0x7c/0x224
[<80414ac4>] ext4_mb_mark_diskspace_used+0x25c/0x26c
[<80414ba4>] test_mark_diskspace_used+0xd0/0x308
[<806e8fe0>] kunit_try_run_case+0x70/0x204
[<806eb1dc>] kunit_generic_run_threadfn_adapter+0x1c/0x28
[<80162be0>] kthread+0x128/0x150
[<80103038>] ret_from_kernel_thread+0x14/0x1c
Code: 02242021  8c820000  00431021 <8c5e0000> 001e17c3  03d5a021  00523021  029e902b  02469021
---[ end trace 0000000000000000 ]---

Various arm emulations:
[    6.617298]         # Subtest: test_mark_diskspace_used
[    6.620243]         ok 1 block_bits=10 cluster_bits=3 blocks_per_group=8192 group_count=4 desc_size=64
[    6.622190] 8<--- cut here ---
[    6.622374] Unable to handle kernel paging request at virtual address 0a3f6000 when read
[    6.622549] [0a3f6000] *pgd=00000000
[    6.622960] Internal error: Oops: 5 [#1] SMP ARM
[    6.623138] Modules linked in:
[    6.623342] CPU: 0 PID: 187 Comm: kunit_try_catch Tainted: G                 N 6.8.0-rc6-next-20240301-11397-g2cd922b7255b #1
[    6.623573] Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[    6.623738] PC is at percpu_counter_add_batch+0x2c/0x110
[    6.624171] LR is at percpu_counter_add_batch+0xa8/0x110
  
Kemeng Shi March 4, 2024, 6:33 a.m. UTC | #2
on 3/3/2024 11:33 PM, Guenter Roeck wrote:
> On 3/2/24 10:17, Kemeng Shi wrote:
>> v2->v3:
>> -fix warning that sb->s_umount is still held when unit test finishes
>> -fix warning that sbi->s_freeclusters_counter is used before
>> initialization.
>>
>> v1->v2:
>> -properly handle error from sget()
>>
>> Previously, the new mballoc unit tests are only tested by running
>> "./tools/testing/kunit/kunit.py run ..." in which case only rare configs
>> are enabled.
>> This series fixes issues when more debug configs are enabled. Fixes are
>> tested with and without kunit_tool [1].
>>
>> [1] https://docs.kernel.org/dev-tools/kunit/run_manual.html
>>
>> Kemeng Shi (3):
>>    ext4: alloc test super block from sget
>>    ext4: hold group lock in ext4 kunit test
>>    ext4: initialize sbi->s_freeclusters_counter before use in kunit test
>>
>>   fs/ext4/mballoc-test.c | 77 ++++++++++++++++++++++++++++++++----------
>>   1 file changed, 60 insertions(+), 17 deletions(-)
>>
> 
> I still see crashes with this version. Some examples below.
> 
Thanks so much for the test and report. It's likely caused by using
sbi->s_dirtyclusters_counter uninitialized. Will fix it in the next
version.

Kemeng
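The diagnosis in #2 (a percpu counter used before it is initialized) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation and not the eventual fix: every model_* name and MODEL_NR_CPUS are hypothetical, and the batching logic is simplified. In the kernel, the per-CPU storage of a struct percpu_counter is set up by percpu_counter_init(), and percpu_counter_add_batch() accesses that storage without checking for it, which would fit the faults inside that function in the traces above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count, for the model only */

/* Userspace stand-in for the kernel's struct percpu_counter. */
struct model_percpu_counter {
	int64_t count;		/* global total */
	int32_t *counters;	/* per-CPU deltas; NULL until initialized */
};

/* Analogue of percpu_counter_init(): allocate the per-CPU slots. */
static int model_counter_init(struct model_percpu_counter *c, int64_t amount)
{
	c->counters = calloc(MODEL_NR_CPUS, sizeof(*c->counters));
	if (!c->counters)
		return -1;
	c->count = amount;
	return 0;
}

/* Analogue of percpu_counter_destroy(): release the per-CPU slots. */
static void model_counter_destroy(struct model_percpu_counter *c)
{
	free(c->counters);
	c->counters = NULL;
}

/*
 * Simplified analogue of percpu_counter_add_batch(): accumulate into the
 * given CPU's slot and fold it into the global count once the local delta
 * reaches the batch size. Unlike the kernel function, the model refuses to
 * touch an uninitialized counter instead of dereferencing a bad pointer.
 */
static int model_counter_add_batch(struct model_percpu_counter *c, int cpu,
				   int64_t amount, int32_t batch)
{
	int64_t delta;

	if (!c->counters)
		return -1;	/* the kernel would fault here */

	delta = c->counters[cpu] + amount;
	if (delta >= batch || delta <= -batch) {
		c->count += delta;
		c->counters[cpu] = 0;
	} else {
		c->counters[cpu] = (int32_t)delta;
	}
	return 0;
}

/* Exact sum: global count plus all per-CPU deltas not yet folded in. */
static int64_t model_counter_sum(const struct model_percpu_counter *c)
{
	int64_t sum = c->count;

	assert(c->counters);
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += c->counters[cpu];
	return sum;
}
```

In this model, calling model_counter_add_batch() on a zero-filled struct is rejected, while the same call after model_counter_init() succeeds. The corresponding step in a kernel test would be to call percpu_counter_init() on each counter the tested path touches before the first test case runs, and percpu_counter_destroy() on teardown.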