Message ID | 20230506062934.69652-1-qiuxu.zhuo@intel.com |
---|---|
State | New |
Headers |
From: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
To: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>, Waiman Long <longman@redhat.com>, Boqun Feng <boqun.feng@gmail.com>, linux-kernel@vger.kernel.org
Subject: [PATCH 1/1] locking/qspinlock: Fix state-transition changes in comments
Date: Sat, 6 May 2023 14:29:34 +0800
Message-Id: <20230506062934.69652-1-qiuxu.zhuo@intel.com>
Series |
[1/1] locking/qspinlock: Fix state-transition changes in comments
|
|
Commit Message
Qiuxu Zhuo
May 6, 2023, 6:29 a.m. UTC
1. Multiple locker CPUs may concurrently try to set the qspinlock pending bit.
The first of these CPUs (called the pending CPU) sets the pending
bit, making the state transition (the qspinlock pending bit is set):
0,0,* -> 0,1,*
The rest of these CPUs are queued onto the MCS queue, making the state
transition (the qspinlock tail is updated):
0,1,* -> *,1,*
The pending CPU waits until the lock owner goes away, making
the state transition (the qspinlock locked field is set to zero):
*,1,* -> *,1,0
The pending CPU then takes ownership and clears the pending bit,
making the state transition:
*,1,0 -> *,0,1
2. The head of the MCS queue takes ownership and calls set_locked()
to make the state transition:
*,*,0 -> *,*,1
Fix the state-transition comments in the code accordingly.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
kernel/locking/qspinlock.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
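As context for the triples used above and throughout the thread: they denote the (tail, pending, locked) fields packed into the 32-bit qspinlock word. The standalone sketch below is an illustration, not kernel code; it assumes the common NR_CPUS < 16K layout (locked byte in bits 0-7, pending byte in bits 8-15, tail in bits 16-31) and uses made-up macro names.

#include <stdio.h>
#include <stdint.h>

/* Illustrative masks mirroring the assumed qspinlock word layout. */
#define Q_LOCKED_MASK   0x000000ffu   /* lock owner byte       */
#define Q_PENDING_MASK  0x0000ff00u   /* pending-waiter byte   */
#define Q_TAIL_MASK     0xffff0000u   /* MCS tail (cpu, idx)   */

static void show(const char *state, uint32_t val)
{
	printf("%-22s tail=%#x pending=%d locked=%u\n", state,
	       (unsigned)((val & Q_TAIL_MASK) >> 16),
	       !!(val & Q_PENDING_MASK),
	       (unsigned)(val & Q_LOCKED_MASK));
}

int main(void)
{
	show("0,0,1 uncontended",   0x00000001u);
	show("0,1,1 pending set",   0x00000101u);
	show("*,1,1 tail queued",   0x00040101u);
	show("*,0,1 pending clear", 0x00040001u);
	return 0;
}

Compiling and running it prints the tail/pending/locked decomposition for each of the example lock-word values, matching the states referenced in the transitions above.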
Comments
On 5/6/23 02:29, Qiuxu Zhuo wrote:
> 1. There may be concurrent locker CPUs to set the qspinlock pending bit.
>
> The first CPU (called pending CPU) of these CPUs sets the pending
> bit to make the state transition (the qspinlock pending bit is set):
>
> 0,0,* -> 0,1,*
>
> The rest of these CPUs are queued to the MCS queue to make the state
> transition (the qspinlock tail is updated):
>
> 0,1,* -> *,1,*
>
> The pending CPU waits until the locker owner goes away to make
> the state transition (the qspinlock locked field is set to zero):
>
> *,1,* -> *,1,0
>
> The pending CPU takes the ownership and clears the pending bit
> to make the state transition:
>
> *,1,0 -> *,0,1
>
> 2. The header of the MCS queue takes the ownership and calls set_locked()
> to make the state transition:
>
> *,*,0 -> *,*,1

That is not true. The pending bit owner has priority over the MCS queue head. So the pending bit must be 0 before the MCS queue head can take over the lock. So

*,0,0 -> *,0,1

> Fix the state-transition changes above in the code comments accordingly.
>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> ---
> kernel/locking/qspinlock.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> index ebe6b8ec7cb3..efebbf19f887 100644
> --- a/kernel/locking/qspinlock.c
> +++ b/kernel/locking/qspinlock.c
> @@ -257,7 +257,7 @@ static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lo
> * set_locked - Set the lock bit and own the lock
> * @lock: Pointer to queued spinlock structure
> *
> - * *,*,0 -> *,0,1
> + * *,*,0 -> *,*,1

set_locked() can only be called when it is sure that the pending bit isn't set.

> */
> static __always_inline void set_locked(struct qspinlock *lock)
> {
> @@ -348,7 +348,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
> /*
> * trylock || pending
> *
> - * 0,0,* -> 0,1,* -> 0,0,1 pending, trylock
> + * 0,0,* -> 0,1,* -> ... -> *,0,1 pending, trylock

By the time trylock is done, there may be entries in the queue. However, I doubt it helps by adding "..." in between possible multiple transitions.

> */
> val = queued_fetch_set_pending_acquire(lock);
>
> @@ -358,6 +358,8 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
> * Undo and queue; our setting of PENDING might have made the
> * n,0,0 -> 0,0,0 transition fail and it will now be waiting
> * on @next to become !NULL.
> + *
> + * 0,1,* -> *,1,*

There is already a "n,0,0 -> 0,0,0" above, adding a new one may just confuse people.

> */
> if (unlikely(val & ~_Q_LOCKED_MASK)) {
>
> @@ -371,7 +373,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
> /*
> * We're pending, wait for the owner to go away.
> *
> - * 0,1,1 -> *,1,0
> + * *,1,* -> *,1,0

This refers to the wait loop. We don't need to wait if the owner has gone.

> *
> * this wait loop must be a load-acquire such that we match the
> * store-release that clears the locked bit and create lock
> @@ -385,7 +387,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
> /*
> * take ownership and clear the pending bit.
> *
> - * 0,1,0 -> 0,0,1
> + * *,1,0 -> *,0,1

That is the part that we can make the change in the transition diagram as noted.

Cheers,
Longman
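To visualise the point made above about set_locked(): the MCS-queue head only proceeds once it has observed both the pending and locked bytes as zero, so from that CPU's viewpoint the transition is *,0,0 -> *,0,1. Below is a simplified, single-threaded C11-atomics model with illustrative names; it is a sketch of that idea, not the kernel implementation (the wait falls through immediately here because nothing else touches the lock word).

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define Q_LOCKED_VAL           0x00000001u
#define Q_LOCKED_PENDING_MASK  0x0000ffffu   /* locked byte + pending byte */

static _Atomic uint32_t lock_word;

/* Model of what the MCS-queue head does once it reaches the front. */
static void queue_head_acquire(void)
{
	/* Wait until neither an owner nor a pending waiter is visible. */
	while (atomic_load_explicit(&lock_word, memory_order_acquire) &
	       Q_LOCKED_PENDING_MASK)
		;
	/* Model of set_locked(): *,0,0 -> *,0,1 as observed by this CPU. */
	atomic_fetch_or_explicit(&lock_word, Q_LOCKED_VAL,
				 memory_order_relaxed);
}

int main(void)
{
	queue_head_acquire();
	printf("lock word: %#x\n", (unsigned)atomic_load(&lock_word));
	return 0;
}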
Hi Wainman, Thanks for your review. Please see the comments below. > From: Waiman Long <longman@redhat.com> > Sent: Monday, May 8, 2023 11:29 PM > To: Zhuo, Qiuxu <qiuxu.zhuo@intel.com>; Peter Zijlstra > <peterz@infradead.org>; Ingo Molnar <mingo@redhat.com>; Will Deacon > <will@kernel.org> > Cc: Boqun Feng <boqun.feng@gmail.com>; linux-kernel@vger.kernel.org > Subject: Re: [PATCH 1/1] locking/qspinlock: Fix state-transition changes in > comments > > > On 5/6/23 02:29, Qiuxu Zhuo wrote: > > 1. There may be concurrent locker CPUs to set the qspinlock pending bit. > > > > The first CPU (called pending CPU) of these CPUs sets the pending > > bit to make the state transition (the qspinlock pending bit is set): > > > > 0,0,* -> 0,1,* > > > > The rest of these CPUs are queued to the MCS queue to make the state > > transition (the qspinlock tail is updated): > > > > 0,1,* -> *,1,* > > > > The pending CPU waits until the locker owner goes away to make > > the state transition (the qspinlock locked field is set to zero): > > > > *,1,* -> *,1,0 > > > > The pending CPU takes the ownership and clears the pending bit > > to make the state transition: > > > > *,1,0 -> *,0,1 > > > > 2. The header of the MCS queue takes the ownership and calls set_locked() > > to make the state transition: > > > > *,*,0 -> *,*,1 > > That is not true. The pending bit owner has priority over the MCS queue > head. So the pending bit must be 0 before the MCS queue head can take over > the lock. So > > *,0,0 -> *,0,1 Yes, the pending bit must be 0 before the header can take the lock. But as the statement "There may be concurrent locker CPUs to set the qspinlock pending bit " at the beginning. So just after the header takes over the lock, there is also a possible concurrent locker CPU to set the pending bit. That means at this time point here, the pending bit could be either 0 or 1. > > > > Fix the state-transition changes above in the code comments accordingly. > > > > Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> > > --- > > kernel/locking/qspinlock.c | 10 ++++++---- > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c > > index ebe6b8ec7cb3..efebbf19f887 100644 > > --- a/kernel/locking/qspinlock.c > > +++ b/kernel/locking/qspinlock.c > > @@ -257,7 +257,7 @@ static __always_inline u32 > queued_fetch_set_pending_acquire(struct qspinlock *lo > > * set_locked - Set the lock bit and own the lock > > * @lock: Pointer to queued spinlock structure > > * > > - * *,*,0 -> *,0,1 > > + * *,*,0 -> *,*,1 > set_locked() can only be called when it is sure that the pending bit isn't set. > > */ > > static __always_inline void set_locked(struct qspinlock *lock) > > { > > @@ -348,7 +348,7 @@ void __lockfunc queued_spin_lock_slowpath(struct > qspinlock *lock, u32 val) > > /* > > * trylock || pending > > * > > - * 0,0,* -> 0,1,* -> 0,0,1 pending, trylock > > + * 0,0,* -> 0,1,* -> ... -> *,0,1 pending, trylock > > By the time trylock is done, there may be entries in the queue. However, I > doubt it helps by adding "..." in between possible multiple transitions. > The added "..." means there could be entries in the queue before trylock done. This is just for making the state transitions more complete ;-). If you think it doesn't help, I can remove it in the next version. 
> > */ > > val = queued_fetch_set_pending_acquire(lock); > > > > @@ -358,6 +358,8 @@ void __lockfunc queued_spin_lock_slowpath(struct > qspinlock *lock, u32 val) > > * Undo and queue; our setting of PENDING might have made the > > * n,0,0 -> 0,0,0 transition fail and it will now be waiting > > * on @next to become !NULL. > > + * > > + * 0,1,* -> *,1,* > There is already a "n,0,0 -> 0,0,0" above, adding a new one may just > confuse people. > > */ > > if (unlikely(val & ~_Q_LOCKED_MASK)) { > > > > @@ -371,7 +373,7 @@ void __lockfunc queued_spin_lock_slowpath(struct > qspinlock *lock, u32 val) > > /* > > * We're pending, wait for the owner to go away. > > * > > - * 0,1,1 -> *,1,0 > > + * *,1,* -> *,1,0 > > This refers to the wait loop. We don't need to wait if the owner has gone. But just before we wait for the locked field, the locked field could be either 0 (the locker can release the lock at any time) or 1. > > > * > > * this wait loop must be a load-acquire such that we match the > > * store-release that clears the locked bit and create lock > > @@ -385,7 +387,7 @@ void __lockfunc queued_spin_lock_slowpath(struct > qspinlock *lock, u32 val) > > /* > > * take ownership and clear the pending bit. > > * > > - * 0,1,0 -> 0,0,1 > > + * *,1,0 -> *,0,1 > > That is the part that we can make the change in the transition diagram > as noted. Sorry, I'm not clear about your request. Did you mean just make the change "*,1,0 -> *,0,1" above in the transition diagram or all the changes above in the transition diagram? Thanks! -Qiuxu > Cheers, > Longman
On 5/8/23 22:03, Zhuo, Qiuxu wrote: > Hi Wainman, > > Thanks for your review. Please see the comments below. > >> From: Waiman Long <longman@redhat.com> >> Sent: Monday, May 8, 2023 11:29 PM >> To: Zhuo, Qiuxu <qiuxu.zhuo@intel.com>; Peter Zijlstra >> <peterz@infradead.org>; Ingo Molnar <mingo@redhat.com>; Will Deacon >> <will@kernel.org> >> Cc: Boqun Feng <boqun.feng@gmail.com>; linux-kernel@vger.kernel.org >> Subject: Re: [PATCH 1/1] locking/qspinlock: Fix state-transition changes in >> comments >> >> >> On 5/6/23 02:29, Qiuxu Zhuo wrote: >>> 1. There may be concurrent locker CPUs to set the qspinlock pending bit. >>> >>> The first CPU (called pending CPU) of these CPUs sets the pending >>> bit to make the state transition (the qspinlock pending bit is set): >>> >>> 0,0,* -> 0,1,* >>> >>> The rest of these CPUs are queued to the MCS queue to make the state >>> transition (the qspinlock tail is updated): >>> >>> 0,1,* -> *,1,* >>> >>> The pending CPU waits until the locker owner goes away to make >>> the state transition (the qspinlock locked field is set to zero): >>> >>> *,1,* -> *,1,0 >>> >>> The pending CPU takes the ownership and clears the pending bit >>> to make the state transition: >>> >>> *,1,0 -> *,0,1 >>> >>> 2. The header of the MCS queue takes the ownership and calls set_locked() >>> to make the state transition: >>> >>> *,*,0 -> *,*,1 >> That is not true. The pending bit owner has priority over the MCS queue >> head. So the pending bit must be 0 before the MCS queue head can take over >> the lock. So >> >> *,0,0 -> *,0,1 > Yes, the pending bit must be 0 before the header can take the lock. > But as the statement "There may be concurrent locker CPUs to set the qspinlock pending bit " at > the beginning. So just after the header takes over the lock, there is also a possible concurrent locker CPU > to set the pending bit. That means at this time point here, the pending bit could be either 0 or 1. OK, I am looking from the point of view of the CPU doing the transition. Otherwise, for an arch with very weak memory ordering, anything is possible. i.e. *,*,* -> *,*,* if we have to consider what all the CPUs in the system can see. > >>> Fix the state-transition changes above in the code comments accordingly. >>> >>> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> >>> --- >>> kernel/locking/qspinlock.c | 10 ++++++---- >>> 1 file changed, 6 insertions(+), 4 deletions(-) >>> >>> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c >>> index ebe6b8ec7cb3..efebbf19f887 100644 >>> --- a/kernel/locking/qspinlock.c >>> +++ b/kernel/locking/qspinlock.c >>> @@ -257,7 +257,7 @@ static __always_inline u32 >> queued_fetch_set_pending_acquire(struct qspinlock *lo >>> * set_locked - Set the lock bit and own the lock >>> * @lock: Pointer to queued spinlock structure >>> * >>> - * *,*,0 -> *,0,1 >>> + * *,*,0 -> *,*,1 >> set_locked() can only be called when it is sure that the pending bit isn't set. >>> */ >>> static __always_inline void set_locked(struct qspinlock *lock) >>> { >>> @@ -348,7 +348,7 @@ void __lockfunc queued_spin_lock_slowpath(struct >> qspinlock *lock, u32 val) >>> /* >>> * trylock || pending >>> * >>> - * 0,0,* -> 0,1,* -> 0,0,1 pending, trylock >>> + * 0,0,* -> 0,1,* -> ... -> *,0,1 pending, trylock >> By the time trylock is done, there may be entries in the queue. However, I >> doubt it helps by adding "..." in between possible multiple transitions. >> > The added "..." means there could be entries in the queue before trylock done. 
> This is just for making the state transitions more complete ;-). > If you think it doesn't help, I can remove it in the next version. The transition flow does not show whether there is entry in the queue or not. It just shows the state of the lock word. > >>> */ >>> val = queued_fetch_set_pending_acquire(lock); >>> >>> @@ -358,6 +358,8 @@ void __lockfunc queued_spin_lock_slowpath(struct >> qspinlock *lock, u32 val) >>> * Undo and queue; our setting of PENDING might have made the >>> * n,0,0 -> 0,0,0 transition fail and it will now be waiting >>> * on @next to become !NULL. >>> + * >>> + * 0,1,* -> *,1,* >> There is already a "n,0,0 -> 0,0,0" above, adding a new one may just >> confuse people. >>> */ >>> if (unlikely(val & ~_Q_LOCKED_MASK)) { >>> >>> @@ -371,7 +373,7 @@ void __lockfunc queued_spin_lock_slowpath(struct >> qspinlock *lock, u32 val) >>> /* >>> * We're pending, wait for the owner to go away. >>> * >>> - * 0,1,1 -> *,1,0 >>> + * *,1,* -> *,1,0 >> This refers to the wait loop. We don't need to wait if the owner has gone. > But just before we wait for the locked field, the locked field could be > either 0 (the locker can release the lock at any time) or 1. Again, I take the viewpoint of the CPU doing the wait. It will only wait if it observes that the lock isn't free. > >>> * >>> * this wait loop must be a load-acquire such that we match the >>> * store-release that clears the locked bit and create lock >>> @@ -385,7 +387,7 @@ void __lockfunc queued_spin_lock_slowpath(struct >> qspinlock *lock, u32 val) >>> /* >>> * take ownership and clear the pending bit. >>> * >>> - * 0,1,0 -> 0,0,1 >>> + * *,1,0 -> *,0,1 >> That is the part that we can make the change in the transition diagram >> as noted. > Sorry, I'm not clear about your request. > Did you mean just make the change "*,1,0 -> *,0,1" above in the transition diagram or > all the changes above in the transition diagram? What I meant is that this particular change is correct AFAICS. Sorry for the confusion. Cheers, Longman
Hi Longman, OK, so, you are from the point of view of the CPU (the lock word value it read last time) doing state transition in the uncontended scenario: [1]: Old lock word value (this CPU read it last time) ---> New lock word value I'm from the point of view of the current lock word possible value that its state is transited to the new lock word possible value (I don't think I'm the only one from this point of view when reading the qspinlock code ;-). [2]: Current lock word possible value ---> New lock word possible value I'm fine to keep the view of [1] in the current code and get [2] as the backup. I'll send out a v2 with just two minor comments' fixes. Thanks! -Qiuxu > From: Waiman Long <longman@redhat.com> > Sent: Tuesday, May 9, 2023 10:31 AM > To: Zhuo, Qiuxu <qiuxu.zhuo@intel.com>; Peter Zijlstra > <peterz@infradead.org>; Ingo Molnar <mingo@redhat.com>; Will Deacon > <will@kernel.org> > Cc: Boqun Feng <boqun.feng@gmail.com>; linux-kernel@vger.kernel.org > Subject: Re: [PATCH 1/1] locking/qspinlock: Fix state-transition changes in > comments > > On 5/8/23 22:03, Zhuo, Qiuxu wrote: > > Hi Wainman, > > > > Thanks for your review. Please see the comments below. > > > >> From: Waiman Long <longman@redhat.com> > >> Sent: Monday, May 8, 2023 11:29 PM > >> To: Zhuo, Qiuxu <qiuxu.zhuo@intel.com>; Peter Zijlstra > >> <peterz@infradead.org>; Ingo Molnar <mingo@redhat.com>; Will Deacon > >> <will@kernel.org> > >> Cc: Boqun Feng <boqun.feng@gmail.com>; linux-kernel@vger.kernel.org > >> Subject: Re: [PATCH 1/1] locking/qspinlock: Fix state-transition > >> changes in comments > >> > >> > >> On 5/6/23 02:29, Qiuxu Zhuo wrote: > >>> 1. There may be concurrent locker CPUs to set the qspinlock pending bit. > >>> > >>> The first CPU (called pending CPU) of these CPUs sets the pending > >>> bit to make the state transition (the qspinlock pending bit is set): > >>> > >>> 0,0,* -> 0,1,* > >>> > >>> The rest of these CPUs are queued to the MCS queue to make the > state > >>> transition (the qspinlock tail is updated): > >>> > >>> 0,1,* -> *,1,* > >>> > >>> The pending CPU waits until the locker owner goes away to make > >>> the state transition (the qspinlock locked field is set to zero): > >>> > >>> *,1,* -> *,1,0 > >>> > >>> The pending CPU takes the ownership and clears the pending bit > >>> to make the state transition: > >>> > >>> *,1,0 -> *,0,1 > >>> > >>> 2. The header of the MCS queue takes the ownership and calls > set_locked() > >>> to make the state transition: > >>> > >>> *,*,0 -> *,*,1 > >> That is not true. The pending bit owner has priority over the MCS > >> queue head. So the pending bit must be 0 before the MCS queue head > >> can take over the lock. So > >> > >> *,0,0 -> *,0,1 > > Yes, the pending bit must be 0 before the header can take the lock. > > But as the statement "There may be concurrent locker CPUs to set the > > qspinlock pending bit " at the beginning. So just after the header > > takes over the lock, there is also a possible concurrent locker CPU to set the > pending bit. That means at this time point here, the pending bit could be > either 0 or 1. > > OK, I am looking from the point of view of the CPU doing the transition. > Otherwise, for an arch with very weak memory ordering, anything is possible. > i.e. *,*,* -> *,*,* if we have to consider what all the CPUs in the system can > see. > > > > > >>> Fix the state-transition changes above in the code comments accordingly. 
> >>> > >>> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> > >>> --- > >>> kernel/locking/qspinlock.c | 10 ++++++---- > >>> 1 file changed, 6 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c > >>> index ebe6b8ec7cb3..efebbf19f887 100644 > >>> --- a/kernel/locking/qspinlock.c > >>> +++ b/kernel/locking/qspinlock.c > >>> @@ -257,7 +257,7 @@ static __always_inline u32 > >> queued_fetch_set_pending_acquire(struct qspinlock *lo > >>> * set_locked - Set the lock bit and own the lock > >>> * @lock: Pointer to queued spinlock structure > >>> * > >>> - * *,*,0 -> *,0,1 > >>> + * *,*,0 -> *,*,1 > >> set_locked() can only be called when it is sure that the pending bit isn't set. > >>> */ > >>> static __always_inline void set_locked(struct qspinlock *lock) > >>> { > >>> @@ -348,7 +348,7 @@ void __lockfunc > queued_spin_lock_slowpath(struct > >> qspinlock *lock, u32 val) > >>> /* > >>> * trylock || pending > >>> * > >>> - * 0,0,* -> 0,1,* -> 0,0,1 pending, trylock > >>> + * 0,0,* -> 0,1,* -> ... -> *,0,1 pending, trylock > >> By the time trylock is done, there may be entries in the queue. > >> However, I doubt it helps by adding "..." in between possible multiple > transitions. > >> > > The added "..." means there could be entries in the queue before trylock > done. > > This is just for making the state transitions more complete ;-). > > If you think it doesn't help, I can remove it in the next version. > > The transition flow does not show whether there is entry in the queue or not. > It just shows the state of the lock word. > > > > > >>> */ > >>> val = queued_fetch_set_pending_acquire(lock); > >>> > >>> @@ -358,6 +358,8 @@ void __lockfunc > queued_spin_lock_slowpath(struct > >> qspinlock *lock, u32 val) > >>> * Undo and queue; our setting of PENDING might have made the > >>> * n,0,0 -> 0,0,0 transition fail and it will now be waiting > >>> * on @next to become !NULL. > >>> + * > >>> + * 0,1,* -> *,1,* > >> There is already a "n,0,0 -> 0,0,0" above, adding a new one may just > >> confuse people. > >>> */ > >>> if (unlikely(val & ~_Q_LOCKED_MASK)) { > >>> > >>> @@ -371,7 +373,7 @@ void __lockfunc > queued_spin_lock_slowpath(struct > >> qspinlock *lock, u32 val) > >>> /* > >>> * We're pending, wait for the owner to go away. > >>> * > >>> - * 0,1,1 -> *,1,0 > >>> + * *,1,* -> *,1,0 > >> This refers to the wait loop. We don't need to wait if the owner has gone. > > But just before we wait for the locked field, the locked field could be > > either 0 (the locker can release the lock at any time) or 1. > > Again, I take the viewpoint of the CPU doing the wait. It will only wait > if it observes that the lock isn't free. > > > > > >>> * > >>> * this wait loop must be a load-acquire such that we match the > >>> * store-release that clears the locked bit and create lock > >>> @@ -385,7 +387,7 @@ void __lockfunc > queued_spin_lock_slowpath(struct > >> qspinlock *lock, u32 val) > >>> /* > >>> * take ownership and clear the pending bit. > >>> * > >>> - * 0,1,0 -> 0,0,1 > >>> + * *,1,0 -> *,0,1 > >> That is the part that we can make the change in the transition diagram > >> as noted. > > Sorry, I'm not clear about your request. > > Did you mean just make the change "*,1,0 -> *,0,1" above in the transition > diagram or > > all the changes above in the transition diagram? > > What I meant is that this particular change is correct AFAICS. Sorry for > the confusion. > > Cheers, > Longman
On 5/9/23 02:57, Zhuo, Qiuxu wrote:
> Hi Longman,
>
> OK, so, you are from the point of view of the CPU (the lock word value it read last time)
> doing state transition in the uncontended scenario:
>
> [1]: Old lock word value (this CPU read it last time) ---> New lock word value
>
> I'm from the point of view of the current lock word possible value that its state is
> transited to the new lock word possible value (I don't think I'm the only one from
> this point of view when reading the qspinlock code ;-).
>
> [2]: Current lock word possible value ---> New lock word possible value
>
> I'm fine to keep the view of [1] in the current code and get [2] as the backup.
> I'll send out a v2 with just two minor comments' fixes.

The purpose of those transition flow charts is to help understand what the code is doing. So they should reflect the running CPU point of view. If you want to show what all the possible values can be, you have to explicitly state that to avoid the confusion.

Thanks,
Longman
> From: Waiman Long <longman@redhat.com>
> ...
> > OK, so, you are from the point of view of the CPU (the lock word value
> > it read last time) doing state transition in the uncontended scenario:
> >
> > [1]: Old lock word value (this CPU read it last time) ---> New lock word value
> >
> > I'm from the point of view of the current lock word possible value
> > that its state is transited to the new lock word possible value (I
> > don't think I'm the only one from this point of view when reading the qspinlock code ;-).
> >
> > [2]: Current lock word possible value ---> New lock word possible value
> >
> > I'm fine to keep the view of [1] in the current code and get [2] as the backup.
> > I'll send out a v2 with just two minor comments' fixes.
>
> The purpose of those transition flow charts is to help understand what the
> code is doing. So they should reflect the running CPU point of view.
> If you want to show what all the possible values can be, you have to explicitly
> state that to avoid the confusion.

Yes, looking at it from the uncontended-case view can help readers understand the qspinlock code; after that, readers will find that the code also works for the contended cases.

Thank you for the clarification of the purpose of those state-transition comments.

-Qiuxu

> Thanks,
> Longman
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index ebe6b8ec7cb3..efebbf19f887 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -257,7 +257,7 @@ static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lo
  * set_locked - Set the lock bit and own the lock
  * @lock: Pointer to queued spinlock structure
  *
- * *,*,0 -> *,0,1
+ * *,*,0 -> *,*,1
  */
 static __always_inline void set_locked(struct qspinlock *lock)
 {
@@ -348,7 +348,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	/*
 	 * trylock || pending
 	 *
-	 * 0,0,* -> 0,1,* -> 0,0,1 pending, trylock
+	 * 0,0,* -> 0,1,* -> ... -> *,0,1 pending, trylock
 	 */
 	val = queued_fetch_set_pending_acquire(lock);
 
@@ -358,6 +358,8 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * Undo and queue; our setting of PENDING might have made the
 	 * n,0,0 -> 0,0,0 transition fail and it will now be waiting
 	 * on @next to become !NULL.
+	 *
+	 * 0,1,* -> *,1,*
 	 */
 	if (unlikely(val & ~_Q_LOCKED_MASK)) {
 
@@ -371,7 +373,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	/*
 	 * We're pending, wait for the owner to go away.
 	 *
-	 * 0,1,1 -> *,1,0
+	 * *,1,* -> *,1,0
 	 *
 	 * this wait loop must be a load-acquire such that we match the
 	 * store-release that clears the locked bit and create lock
@@ -385,7 +387,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	/*
 	 * take ownership and clear the pending bit.
 	 *
-	 * 0,1,0 -> 0,0,1
+	 * *,1,0 -> *,0,1
 	 */
 	clear_pending_set_locked(lock);
 	lockevent_inc(lock_pending);
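To complement the last two hunks above, here is a similarly hedged sketch of the pending CPU's side (illustrative names, C11 atomics, not the kernel code): it spins until the locked byte is clear (*,1,* -> *,1,0 from its viewpoint) and then clears the pending bit while setting the locked bit in a single atomic update (*,1,0 -> *,0,1), leaving any tail bits untouched.

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define Q_LOCKED_VAL   0x00000001u
#define Q_PENDING_VAL  0x00000100u
#define Q_LOCKED_MASK  0x000000ffu

static _Atomic uint32_t lock_word = Q_PENDING_VAL;  /* 0,1,0: we are pending */

static void pending_cpu_acquire(void)
{
	/* Wait for the owner to go away: *,1,* -> *,1,0 from this CPU's view. */
	while (atomic_load_explicit(&lock_word, memory_order_acquire) &
	       Q_LOCKED_MASK)
		;
	/*
	 * Take ownership and clear the pending bit in one atomic update,
	 * i.e. *,1,0 -> *,0,1 (the addition wraps modulo 2^32, so it
	 * subtracts the pending value and adds the locked value).
	 */
	atomic_fetch_add_explicit(&lock_word, Q_LOCKED_VAL - Q_PENDING_VAL,
				  memory_order_relaxed);
}

int main(void)
{
	pending_cpu_acquire();
	printf("lock word: %#x\n", (unsigned)atomic_load(&lock_word));
	return 0;
}

Run single-threaded as shown, the wait loop exits immediately (no owner is modeled) and the program prints a lock word of 0x1, i.e. the *,0,1 state.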