[v2,0/2] fix dying cpu compare race

Message ID 20230406015629.1804722-1-yebin@huaweicloud.com

Ye Bin April 6, 2023, 1:56 a.m. UTC
  From: Ye Bin <yebin10@huawei.com>

This patch set fixes a race between '__percpu_counter_compare()' and CPU offline.
Before commit 5825bea05265 ("xfs: __percpu_counter_compare() inode count debug
too expensive"), I hit the following issue while running a CPU online/offline test:
smpboot: CPU 1 is now offline
XFS: Assertion failed: percpu_counter_compare(&mp->m_ifree, 0) >= 0, file: fs/xfs/xfs_trans.c, line: 622
------------[ cut here ]------------
kernel BUG at fs/xfs/xfs_message.c:110!
invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 3 PID: 25512 Comm: fsstress Not tainted 5.10.0-04288-gcb31bdc8c65d #8
RIP: 0010:assfail+0x77/0x8b fs/xfs/xfs_message.c:110
RSP: 0018:ffff88810a5df5c0 EFLAGS: 00010293
RAX: ffff88810f3a8000 RBX: 0000000000000201 RCX: ffffffffaa8bd7c0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000000 R08: ffff88810f3a8000 R09: ffffed103edf71cd
R10: ffff8881f6fb8e67 R11: ffffed103edf71cc R12: ffffffffab0108c0
R13: ffffffffab010220 R14: ffffffffffffffff R15: 0000000000000000
FS:  00007f8536e16b80(0000) GS:ffff8881f6f80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005617e1115f44 CR3: 000000015873a005 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 xfs_trans_unreserve_and_mod_sb+0x833/0xca0 fs/xfs/xfs_trans.c:622
 xlog_cil_commit+0x1169/0x29b0 fs/xfs/xfs_log_cil.c:1325
 __xfs_trans_commit+0x2c0/0xe20 fs/xfs/xfs_trans.c:889
 xfs_create_tmpfile+0x6a6/0x9a0 fs/xfs/xfs_inode.c:1320
 xfs_rename_alloc_whiteout fs/xfs/xfs_inode.c:3193 [inline]
 xfs_rename+0x58a/0x1e00 fs/xfs/xfs_inode.c:3245
 xfs_vn_rename+0x28e/0x410 fs/xfs/xfs_iops.c:436
 vfs_rename+0x10b5/0x1dd0 fs/namei.c:4329
 do_renameat2+0xa19/0xb10 fs/namei.c:4474
 __do_sys_renameat2 fs/namei.c:4512 [inline]
 __se_sys_renameat2 fs/namei.c:4509 [inline]
 __x64_sys_renameat2+0xe4/0x120 fs/namei.c:4509
 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7f853623d91d

I can reproduce the above issue by injecting latency into the kernel so that the
quick check in '__percpu_counter_compare()' goes wrong.
When the quick check runs concurrently with CPU offline, the number of online
CPUs may already have decreased before percpu_counter_cpu_dead() is called, that
is, before the dying CPU's delta has been folded into fbc->count. That leads to
a wrong result. For example:
Assumptions:
(1) batch = 32
(2) The actual value of the counter is 2
(3) The number of CPUs is 4
Suppose the per-CPU deltas are as follows when CPU3 goes offline:
 cpu0   cpu1   cpu2  cpu3
  31     31     31    31
 fbc->count = -122 -> 'percpu_counter_cpu_dead()' has not been called yet.
At this point, check whether the percpu counter is greater than 0 (rhs = 0):
 abs(count - rhs) = abs(-122 - 0) = 122
 batch * num_online_cpus() = 32 * 3 = 96 -> the number of online CPUs is already 3
That is, the condition abs(count - rhs) > batch * num_online_cpus() is met, so
the quick path decides from fbc->count alone. The actual value is 2, but because
count < 0 the function returns -1, which is the opposite of the correct result.
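
The arithmetic above can be checked with a small userspace model. This is only
a sketch, not the kernel code: 'struct model_counter', 'model_sum()',
'model_compare()' and 'model_example()' are hypothetical names, and the
'cpus_counted' parameter stands in for whatever CPU count the quick check
multiplies by batch. Passing 3 models num_online_cpus() after CPU3 has left the
online mask. Passing 4 models a bound that also counts the dying CPU, which is
one plausible reading of the patch titles in this series and is an assumption,
not a quote of the patches.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_CPUS 8

/* Userspace stand-in for a percpu counter; not the kernel's struct. */
struct model_counter {
	int64_t count;                 /* global part, like fbc->count */
	int32_t pcpu[MODEL_MAX_CPUS];  /* per-CPU deltas not yet folded in */
	int nr_cpus;                   /* number of valid entries in pcpu[] */
};

/* Precise value: global count plus every per-CPU delta. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;

	for (int i = 0; i < c->nr_cpus; i++)
		sum += c->pcpu[i];
	return sum;
}

/*
 * Model of the compare logic described above. cpus_counted is the CPU
 * count used in the quick check (hypothetical parameter, see lead-in).
 */
static int model_compare(const struct model_counter *c, int64_t rhs,
			 int32_t batch, int cpus_counted)
{
	int64_t count = c->count;

	assert(c->nr_cpus <= MODEL_MAX_CPUS);

	/* Quick check: trust the global count if it is far enough from rhs. */
	if (llabs(count - rhs) > (int64_t)batch * cpus_counted)
		return count > rhs ? 1 : -1;

	/* Otherwise fall back to the precise sum. */
	count = model_sum(c);
	if (count > rhs)
		return 1;
	if (count < rhs)
		return -1;
	return 0;
}

/* The state from the example: CPU3 is dying, its delta is not folded yet. */
static struct model_counter model_example(void)
{
	struct model_counter c = {
		.count = -122,
		.pcpu = { 31, 31, 31, 31 },
		.nr_cpus = 4,
	};
	return c;
}
```

With the example state, model_compare(&c, 0, 32, 3) takes the quick path
(122 > 96) and returns -1 although model_sum(&c) is 2. With cpus_counted = 4
the bound is 128, the quick path is skipped, and the precise sum gives 1.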

Ye Bin (2):
  cpu/hotplug: introduce 'num_dying_cpus' to get dying CPUs count
  lib/percpu_counter: fix dying cpu compare race

 include/linux/cpumask.h | 20 ++++++++++++++++----
 kernel/cpu.c            |  2 ++
 lib/percpu_counter.c    | 11 ++++++++++-
 3 files changed, 28 insertions(+), 5 deletions(-)