From patchwork Sun Dec 25 11:55:29 2022
X-Patchwork-Submitter: Guo Ren
X-Patchwork-Id: 36467
From: guoren@kernel.org
To: peterz@infradead.org, longman@redhat.com
Cc: linux-kernel@vger.kernel.org, guoren@kernel.org, Guo Ren, Boqun Feng, Will Deacon, Ingo Molnar
Subject: [RFC PATCH] locking/barriers: Introduce smp_cond_load_mask_relaxed & acquire
Date: Sun, 25 Dec 2022 06:55:29 -0500
Message-Id: <20221225115529.490378-1-guoren@kernel.org>
X-Mailer: git-send-email 2.36.1

From: Guo Ren

The current cond_load primitives take two parts (a condition expression and the loaded value), but a cond_load user may only need to test a sub-field of the value being loaded. Passing an explicit mask argument lets the hardware optimize the wait condition, since it only has to watch the masked bits. If the mask argument is smaller than the hardware's minimum wait granularity, the hardware falls back to its minimum size.

The patch contains a qspinlock example: when a waiter is at the head of the wait queue, it waits for the owner and pending bits to go away.
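To illustrate the intended semantics (a hypothetical caller; wordp and the bit value are made up for this sketch, not part of the patch), a waiter could ask the hardware to watch only bit 0 while still getting the whole word back with acquire ordering:

	u32 *wordp = &some_shared_flags;	/* hypothetical shared word */
	u32 word;

	/*
	 * Spin until bit 0 of *wordp clears.  The mask tells the
	 * hardware that only bit 0 matters for the wait condition,
	 * but the full 32-bit value is still loaded and returned.
	 */
	word = smp_cond_load_mask_acquire(wordp, !(VAL & 0x1), 0x1);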
The forward-progress condition only depends on the locked+pending part, but the full 32-bit lock value still has to be loaded as the return value. That also means a WFE-like instruction would need a mask argument to narrow the load-reservation set.

Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
Cc: Waiman Long
Cc: Peter Zijlstra
Cc: Boqun Feng
Cc: Will Deacon
Cc: Ingo Molnar
---
 include/asm-generic/barrier.h | 22 ++++++++++++++++++++++
 include/linux/atomic.h        |  4 ++++
 kernel/locking/qspinlock.c    |  3 ++-
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 961f4d88f9ef..fec61629f769 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -275,6 +275,28 @@ do { \
 })
 #endif
 
+/**
+ * smp_cond_load_mask_relaxed() - (Spin) wait for cond with no ordering guarantees
+ * @ptr: pointer to the variable to wait on
+ * @cond: boolean expression to wait for
+ * @mask: mask of *ptr to wait on (a mask of 0 has the same effect as -1)
+ */
+#ifndef smp_cond_load_mask_relaxed
+#define smp_cond_load_mask_relaxed(ptr, cond_expr, mask) \
+	smp_cond_load_relaxed(ptr, cond_expr)
+#endif
+
+/**
+ * smp_cond_load_mask_acquire() - (Spin) wait for cond with ACQUIRE ordering
+ * @ptr: pointer to the variable to wait on
+ * @cond: boolean expression to wait for
+ * @mask: mask of *ptr to wait on (a mask of 0 has the same effect as -1)
+ */
+#ifndef smp_cond_load_mask_acquire
+#define smp_cond_load_mask_acquire(ptr, cond_expr, mask) \
+	smp_cond_load_acquire(ptr, cond_expr)
+#endif
+
 /*
  * pmem_wmb() ensures that all stores for which the modification
  * are written to persistent storage by preceding instructions have

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 8dd57c3a99e9..dc7351945f27 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -27,9 +27,13 @@
 
 #define atomic_cond_read_acquire(v, c) smp_cond_load_acquire(&(v)->counter, (c))
 #define atomic_cond_read_relaxed(v, c) smp_cond_load_relaxed(&(v)->counter, (c))
+#define atomic_cond_read_mask_acquire(v, c, m) smp_cond_load_mask_acquire(&(v)->counter, (c), (m))
+#define atomic_cond_read_mask_relaxed(v, c, m) smp_cond_load_mask_relaxed(&(v)->counter, (c), (m))
 
 #define atomic64_cond_read_acquire(v, c) smp_cond_load_acquire(&(v)->counter, (c))
 #define atomic64_cond_read_relaxed(v, c) smp_cond_load_relaxed(&(v)->counter, (c))
+#define atomic64_cond_read_mask_acquire(v, c, m) smp_cond_load_mask_acquire(&(v)->counter, (c), (m))
+#define atomic64_cond_read_mask_relaxed(v, c, m) smp_cond_load_mask_relaxed(&(v)->counter, (c), (m))
 
 /*
  * The idea here is to build acquire/release variants by adding explicit

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index ebe6b8ec7cb3..14fdd2ee752c 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -511,7 +511,8 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	if ((val = pv_wait_head_or_lock(lock, node)))
 		goto locked;
 
-	val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK));
+	val = atomic_cond_read_mask_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK),
+					    _Q_LOCKED_PENDING_MASK);
 
 locked:
 	/*
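As an illustration of what an architecture override could look like (not part of this patch; __cmpwait_mask_relaxed() is a hypothetical helper, modeled on arm64's __cmpwait_relaxed(), that would arm a WFE-like wait watching only the masked bits of the reservation):

	#define smp_cond_load_mask_relaxed(ptr, cond_expr, mask)	\
	({								\
		typeof(ptr) __PTR = (ptr);				\
		__unqual_scalar_typeof(*ptr) VAL;			\
		for (;;) {						\
			VAL = READ_ONCE(*__PTR);			\
			if (cond_expr)					\
				break;					\
			/* Sleep until the masked bits may change. */	\
			__cmpwait_mask_relaxed(__PTR, VAL, mask);	\
		}							\
		(typeof(*ptr))VAL;					\
	})

The loop re-checks the full condition after every wake-up, so a spurious event, or hardware that watches a wider region than the mask requests, only costs an extra iteration and can never miss an update.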