[tip:,locking/core] locking/x86: Implement local_xchg() using CMPXCHG without the LOCK prefix

Message ID 170929672897.398.5127033798711557434.tip-bot2@tip-bot2
State New
Headers
Series [tip:,locking/core] locking/x86: Implement local_xchg() using CMPXCHG without the LOCK prefix |

Commit Message

tip-bot2 for Thomas Gleixner March 1, 2024, 12:38 p.m. UTC
  The following commit has been merged into the locking/core branch of tip:

Commit-ID:     e807c2a37044a51de89d6d4f8a1f5ecfb3752f36
Gitweb:        https://git.kernel.org/tip/e807c2a37044a51de89d6d4f8a1f5ecfb3752f36
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Wed, 24 Jan 2024 11:58:16 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 01 Mar 2024 12:54:25 +01:00

locking/x86: Implement local_xchg() using CMPXCHG without the LOCK prefix

Implement local_xchg() using the CMPXCHG instruction without the LOCK prefix.
XCHG is expensive due to the implied LOCK prefix.  The processor
cannot prefetch cachelines if XCHG is used.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20240124105816.612670-1-ubizjak@gmail.com
---
 arch/x86/include/asm/local.h | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
  

Patch

diff --git a/arch/x86/include/asm/local.h b/arch/x86/include/asm/local.h
index 73dba8b..59aa966 100644
--- a/arch/x86/include/asm/local.h
+++ b/arch/x86/include/asm/local.h
@@ -131,8 +131,20 @@  static inline bool local_try_cmpxchg(local_t *l, long *old, long new)
 				 (typeof(l->a.counter) *) old, new);
 }
 
-/* Always has a lock prefix */
-#define local_xchg(l, n) (xchg(&((l)->a.counter), (n)))
+/*
+ * Implement local_xchg using CMPXCHG instruction without the LOCK prefix.
+ * XCHG is expensive due to the implied LOCK prefix.  The processor
+ * cannot prefetch cachelines if XCHG is used.
+ */
+static __always_inline long
+local_xchg(local_t *l, long n)
+{
+	long c = local_read(l);
+
+	do { } while (!local_try_cmpxchg(l, &c, n));
+
+	return c;
+}
 
 /**
  * local_add_unless - add unless the number is already a given value