locking/rtmutex: Do the trylock-slowpath with DEBUG_RT_MUTEXES enabled.

Message ID 20230328165430.9eOXd-55@linutronix.de
State New
Series locking/rtmutex: Do the trylock-slowpath with DEBUG_RT_MUTEXES enabled.

Commit Message

Sebastian Andrzej Siewior March 28, 2023, 4:54 p.m. UTC
With DEBUG_RT_MUTEXES enabled the fast-path locking
(rt_mutex_cmpxchg_acquire()) always fails. This leads to the invocation
of blk_flush_plug() even if the lock is not contended, which is
unnecessary and prevents batch processing of requests.
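
To illustrate the problem, here is a minimal userspace sketch, not kernel
code. All model_* names are hypothetical and exist only for this example.
It assumes the slow path flushes the plug before doing anything else and
that the debug build turns the fast path into a stub that always fails.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct model_lock {
	void *owner;		/* NULL means the lock is free */
};

struct model_task {
	int plugged_requests;	/* requests batched in the task's plug */
	int flush_count;	/* how often the plug was flushed */
};

/* Models the cmpxchg fast path, which always fails in the debug case. */
static bool model_fast_acquire(struct model_lock *lock,
			       struct model_task *task, bool debug)
{
	if (debug || lock->owner != NULL)
		return false;
	lock->owner = task;
	return true;
}

/* Models the slow path: flush the plug first, then try to take the lock. */
static bool model_slow_lock(struct model_lock *lock, struct model_task *task)
{
	task->flush_count++;
	task->plugged_requests = 0;
	if (lock->owner != NULL)
		return false;	/* the real slow path would block here */
	lock->owner = task;
	return true;
}

/* Fast path first, slow path (including the flush) as fallback. */
static bool model_lock_acquire(struct model_lock *lock,
			       struct model_task *task, bool debug)
{
	if (model_fast_acquire(lock, task, debug))
		return true;
	return model_slow_lock(lock, task);
}
```

In this model a free lock is taken without a flush when debug is false,
while the same acquisition with debug set to true flushes the plug first
and loses the batched requests.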

rt_mutex_slowtrylock() performs the trylock-slowpath and acquires the
lock if possible.
__rt_mutex_trylock() performs the fastpath try-lock and the slowpath
trylock. The latter is not desired in the non-debug case because it
fails very often even after rt_mutex_owner() reported that there is no
owner.
Here are some numbers from a boot-up plus a few FS operations and hackbench:
- total __rt_mutex_lock() -> __rt_mutex_trylock() invocations with no
  owner: 32160
- success: 189
- failed: 31971
  - RT_MUTEX_HAS_WAITERS was set the whole time: 27469
  - owner appeared after the wait_lock has been obtained: 4502

The slowpath trylock failed in most cases without an owner because a
waiter was pending and had not acquired the lock yet. The few cases in
which it succeeded were because the pending bit was cleared after the
wait_lock was acquired.
Based on these numbers, rt_mutex_slowtrylock() in the non-DEBUG case
adds only overhead without contributing anything to the locking process.

In a dist-upgrade test with DEBUG_RT_MUTEXES enabled, the proposed
rt_mutex_slowtrylock() optimisation acquired all locks with
current->plug set and avoided a blk_flush_plug() invocation.

Use rt_mutex_slowtrylock() in the DEBUG_RT_MUTEXES case to acquire the
lock instead of the disabled rt_mutex_cmpxchg_acquire().
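
The idea can be sketched in userspace C roughly as follows. This is a
simplified model under stated assumptions, not the kernel implementation:
all sketch_* names are hypothetical, and wait_lock handling, waiters and
priority inheritance are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct sketch_lock {
	void *owner;		/* NULL means the lock is free */
};

struct sketch_task {
	int flush_count;	/* how often the plug was flushed */
};

/* Models the cmpxchg fast path of the non-debug build. */
static bool sketch_cmpxchg_acquire(struct sketch_lock *lock,
				   struct sketch_task *task)
{
	if (lock->owner != NULL)
		return false;
	lock->owner = task;
	return true;
}

/* Models the slowpath trylock: take the lock only if nobody owns it. */
static bool sketch_slowtrylock(struct sketch_lock *lock,
			       struct sketch_task *task)
{
	if (lock->owner != NULL)
		return false;
	lock->owner = task;
	return true;
}

/* Debug builds use the slowpath trylock, others the cmpxchg fast path. */
static bool sketch_try_acquire(struct sketch_lock *lock,
			       struct sketch_task *task, bool debug)
{
	if (debug)
		return sketch_slowtrylock(lock, task);
	return sketch_cmpxchg_acquire(lock, task);
}

/* Flush the plug only after the initial acquire attempt has failed. */
static bool sketch_lock_acquire(struct sketch_lock *lock,
				struct sketch_task *task, bool debug)
{
	if (sketch_try_acquire(lock, task, debug))
		return true;
	task->flush_count++;
	return false;		/* the real slow path would block here */
}
```

With this ordering a free lock is taken without a flush in the debug
case as well, and the flush only happens once the lock turns out to be
held.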

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
On 2023-03-22 17:27:21 [+0100], To Thomas Gleixner wrote:
> > Aside of that for CONFIG_DEBUG_RT_MUTEXES=y builds it flushes on every
> > lock operation whether the lock is contended or not.
> 
> For mutex & ww_mutex operations. rwsem is not affected by
> CONFIG_DEBUG_RT_MUTEXES. As for mutex it could be mitigated by invoking
> try_to_take_rt_mutex() before blk_flush_plug().

This fixes the problem. I only observed blk_flush_plug() invocations
from down_read()/rwbase_read_lock() and down() which are not affected by
CONFIG_DEBUG_RT_MUTEXES.
I haven't observed anything in the ww-mutex path so we can ignore it or
do something similar to this.

 kernel/locking/rtmutex.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
  

Comments

Thomas Gleixner April 21, 2023, 5:58 p.m. UTC | #1
On Tue, Mar 28 2023 at 18:54, Sebastian Andrzej Siewior wrote:
> On 2023-03-22 17:27:21 [+0100], To Thomas Gleixner wrote:
>> > Aside of that for CONFIG_DEBUG_RT_MUTEXES=y builds it flushes on every
>> > lock operation whether the lock is contended or not.
>> 
>> For mutex & ww_mutex operations. rwsem is not affected by
>> CONFIG_DEBUG_RT_MUTEXES. As for mutex it could be mitigated by invoking
>> try_to_take_rt_mutex() before blk_flush_plug().

> I haven't observed anything in the ww-mutex path so we can ignore it or
> do something similar to this.

Yay for consistency !

I fixed it up to the below.

Thanks,

        tglx
---

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -219,6 +219,11 @@ static __always_inline bool rt_mutex_cmp
 	return try_cmpxchg_acquire(&lock->owner, &old, new);
 }
 
+static __always_inline bool rt_mutex_try_acquire(struct rt_mutex_base *lock)
+{
+	return rt_mutex_cmpxchg_acquire(lock, NULL, current);
+}
+
 static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
 						     struct task_struct *old,
 						     struct task_struct *new)
@@ -298,6 +303,20 @@ static __always_inline bool rt_mutex_cmp
 
 }
 
+static __always_inline bool rt_mutex_try_acquire(struct rt_mutex_base *lock)
+{
+	/*
+	 * With debug enabled rt_mutex_cmpxchg_acquire() will always fail,
+	 * which will unconditionally invoke blk_flush_plug() in the slow
+	 * path of __rt_mutex_lock() and __ww_rt_mutex_lock() even in the
+	 * non-contended case.
+	 *
+	 * Avoid that by using rt_mutex_slowtrylock() which is fully covered
+	 * by the debug code and can acquire a non-contended rtmutex.
+	 */
+	return rt_mutex_slowtrylock(lock);
+}
+
 static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
 						     struct task_struct *old,
 						     struct task_struct *new)
@@ -1698,9 +1717,8 @@ static int __sched rt_mutex_slowlock(str
 static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
 					   unsigned int state)
 {
-	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
+	if (likely(rt_mutex_try_acquire(lock)))
 		return 0;
-
 	/*
 	 * The task is about to sleep. Flush plugged IO as that might
 	 * take locks and corrupt tsk::pi_blocked_on.
--- a/kernel/locking/ww_rt_mutex.c
+++ b/kernel/locking/ww_rt_mutex.c
@@ -62,7 +62,7 @@ static int __sched
 	}
 	mutex_acquire_nest(&rtm->dep_map, 0, 0, nest_lock, ip);
 
-	if (likely(rt_mutex_cmpxchg_acquire(&rtm->rtmutex, NULL, current))) {
+	if (likely(rt_mutex_try_acquire(&rtm->rtmutex))) {
 		if (ww_ctx)
 			ww_mutex_set_context_fastpath(lock, ww_ctx);
 		return 0;
  
Sebastian Andrzej Siewior April 24, 2023, 8:42 a.m. UTC | #2
On 2023-04-21 19:58:52 [+0200], Thomas Gleixner wrote:
> On Tue, Mar 28 2023 at 18:54, Sebastian Andrzej Siewior wrote:
> > On 2023-03-22 17:27:21 [+0100], To Thomas Gleixner wrote:
> >> > Aside of that for CONFIG_DEBUG_RT_MUTEXES=y builds it flushes on every
> >> > lock operation whether the lock is contended or not.
> >> 
> >> For mutex & ww_mutex operations. rwsem is not affected by
> >> CONFIG_DEBUG_RT_MUTEXES. As for mutex it could be mitigated by invoking
> >> try_to_take_rt_mutex() before blk_flush_plug().
> 
> > I haven't observed anything in the ww-mutex path so we can ignore it or
> > do something similar to this.
> 
> Yay for consistency !
> 
> I fixed it up to the below.

You fixed the ww-mutex path and did with the debug path what I did in
the follow-up patch. Let me fold this then and drop the other one.

> Thanks,
> 
>         tglx

Sebastian
  

Patch

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index c1bc2cb1522cb..08c599a5089a2 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1698,9 +1698,18 @@  static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
 static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
 					   unsigned int state)
 {
+	/*
+	 * With DEBUG enabled cmpxchg trylock will always fail. Instead of
+	 * invoking blk_flush_plug() try the trylock-slowpath first which will
+	 * succeed if the lock is not contended.
+	 */
+#ifdef CONFIG_DEBUG_RT_MUTEXES
+	if (likely(rt_mutex_slowtrylock(lock)))
+		return 0;
+#else
 	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
 		return 0;
-
+#endif
 	/*
 	 * If we are going to sleep and we have plugged IO queued, make sure to
 	 * submit it to avoid deadlocks.