[diagnostic,qspinlock] Diagnostics for excessive lock-drop wait loop time
Commit Message
We see systems stuck in the queued_spin_lock_slowpath() loop that waits
for the lock to become unlocked in the case where the current CPU has
set pending state. Therefore, this not-for-mainline commit gives a warning
that includes the lock word state if the loop has been spinning for more
than 10 seconds. It also adds a WARN_ON_ONCE() that complains if the
lock is not in pending state.
If this is to be placed in production, some reporting mechanism not
involving spinlocks is likely needed, for example, BPF, trace events,
or some combination thereof.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
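For illustration only, here is a rough sketch of what a trace-event-based report might look like. Nothing below is part of the patch: the event name qspinlock_pending_stall, the trace/events/qspinlock.h header, and the call site are all made up for this sketch, which follows the usual TRACE_EVENT() boilerplate.

/* SPDX-License-Identifier: GPL-2.0 */
/* Hypothetical include/trace/events/qspinlock.h, not part of this patch. */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM qspinlock

#if !defined(_TRACE_QSPINLOCK_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_QSPINLOCK_H

#include <linux/tracepoint.h>

TRACE_EVENT(qspinlock_pending_stall,

	TP_PROTO(u32 val),

	TP_ARGS(val),

	TP_STRUCT__entry(
		__field(u32, val)
	),

	TP_fast_assign(
		__entry->val = val;
	),

	TP_printk("still pending and locked: %#x", __entry->val)
);

#endif /* _TRACE_QSPINLOCK_H */

/* This part must be outside protection */
#include <trace/define_trace.h>

The WARN() in the wait loop below could then become something like the following, with qspinlock.c defining CREATE_TRACE_POINTS before including the header:

		if (!--cnt) {
			if (time_after(jiffies, j))
				/* Fires every _Q_PENDING_LOOPS iterations once past the deadline; real code would want rate limiting. */
				trace_qspinlock_pending_stall(val);
			cnt = _Q_PENDING_LOOPS;
		}

Trace events record into per-CPU ring buffers and are designed to be usable without taking locks, which is what makes them attractive for reporting on a lock that is itself misbehaving.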
Comments
Hi Paul,
-----Original Message-----
From: Paul E. McKenney [mailto:paulmck@kernel.org]
> We see systems stuck in the queued_spin_lock_slowpath() loop that waits for the lock to become unlocked in the case where the current CPU has set pending state.
Interesting!
Do you know if the hangs started with a recent patch? What codepaths are active (virtualization/arch/...)? Does it happen extremely rarely? Do you have any additional information?
I saw a similar situation a few years ago in a proprietary kernel, but it only happened once ever and I gave up on looking for the reason after a few days (including some time combing through the compiler generated assembler).
Have fun,
jonas
On Thu, Jan 12, 2023 at 08:51:04PM +0000, Jonas Oberhauser wrote:
> Hi Paul,
>
> -----Original Message-----
> From: Paul E. McKenney [mailto:paulmck@kernel.org]
> > We see systems stuck in the queued_spin_lock_slowpath() loop that waits for the lock to become unlocked in the case where the current CPU has set pending state.
>
> Interesting!
> Do you know if the hangs started with a recent patch? What codepaths are active (virtualization/arch/...)? Does it happen extremely rarely? Do you have any additional information?
As best we can tell right now, we see it about three times per day per
million systems on x86 systems running v5.12 plus backports. It is
entirely possible that it is a hardware/firmware problem, but normally
that would cause the failure to cluster on a specific piece of hardware
or specific type of hardware, and we are not seeing that.
But we are in very early days investigating this. In particular,
everything in the previous paragraph is subject to change. For example,
we have not yet eliminated the possibility that the lockword is being
corrupted by unrelated kernel software, which is part of the motivation
for the patch in my earlier email.
> I saw a similar situation a few years ago in a proprietary kernel, but it only happened once ever and I gave up on looking for the reason after a few days (including some time combing through the compiler generated assembler).
If it makes you feel better, yesterday I was sure that I had found the
bug by inspection. But no, just confusion on my part! ;-)
But thank you very much for the possible corroborating information.
You never know!
Thanx, Paul
@@ -379,8 +379,22 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * clear_pending_set_locked() implementations imply full
 	 * barriers.
 	 */
-	if (val & _Q_LOCKED_MASK)
-		atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));
+	if (val & _Q_LOCKED_MASK) {
+		int cnt = _Q_PENDING_LOOPS;
+		unsigned long j = jiffies + 10 * HZ;
+		struct qspinlock qval;
+		int val;
+
+		for (;;) {
+			val = atomic_read_acquire(&lock->val);
+			atomic_set(&qval.val, val);
+			WARN_ON_ONCE(!(val & _Q_PENDING_VAL));
+			if (!(val & _Q_LOCKED_MASK))
+				break;
+			if (!--cnt && !WARN(time_after(jiffies, j), "%s: Still pending and locked: %#x (%c%c%#x)\n", __func__, val, ".L"[!!qval.locked], ".P"[!!qval.pending], qval.tail))
+				cnt = _Q_PENDING_LOOPS;
+		}
+	}
 
 	/*
 	 * take ownership and clear the pending bit.
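For anyone who has to interpret one of these warnings after the fact, a small userspace helper along the following lines can decode the printed lock word. This is only a sketch: the file name decode-qsl.c is made up, and it assumes the common NR_CPUS < 16K qspinlock layout (bits 0-7 locked byte, bit 8 pending, bits 16-17 tail index, bits 18-31 tail CPU plus one).

/* decode-qsl.c: decode a qspinlock lock word (hypothetical helper, not part of the patch). */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	unsigned long word;
	unsigned int tail;

	if (argc != 2) {
		fprintf(stderr, "Usage: %s <lock word, e.g. 0x101>\n", argv[0]);
		return 1;
	}
	word = strtoul(argv[1], NULL, 0);
	tail = (word >> 16) & 0xffff;		/* tail halfword: bits 16-31 */
	printf("locked byte: %#lx\n", word & 0xff);
	printf("pending:     %lu\n", (word >> 8) & 0x1);
	if (!tail)
		printf("tail:        0 (no MCS queue)\n");
	else
		printf("tail:        idx %u, cpu %u\n",
		       tail & 0x3, (tail >> 2) - 1);
	return 0;
}

For example, ./decode-qsl 0x101 reports locked byte 0x1, pending 1, and no MCS queue, which is the state the pending-wait loop above expects to be spinning on.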