Message ID | 23cef5ac49494b9087953f529ae5df16@AcuMS.aculab.com |
---|---|
State | New |
Series | locking/osq_lock: Optimisations to osq_lock code |
Commit Message
David Laight
Dec. 29, 2023, 8:58 p.m. UTC
The vcpu_is_preempted() test stops osq_lock() spinning if a virtual
cpu is no longer running.
Although patched out for bare-metal builds, the code still needs the cpu number.
Reading this from 'prev->cpu' is pretty much guaranteed to cause a cache miss
when osq_unlock() is waking up the next cpu.

Instead, save 'prev->cpu' in 'node->prev_cpu' and use that value.
Update it in the osq_lock() 'unqueue' path when 'node->prev' is changed.

This is simpler than checking for 'node->prev' changing and caching
'prev->cpu'.
Signed-off-by: David Laight <david.laight@aculab.com>
---
kernel/locking/osq_lock.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
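In sketch form (a simplified reading of the diff at the bottom of this page, not additional posted code), the spin condition stops chasing 'node->prev' and reads a field in the spinner's own node instead:

	/* Before: node_cpu(node->prev) dereferences prev, whose node sits
	 * in another CPU's per-cpu area; osq_unlock() on that CPU is about
	 * to write the same cache line, so the load is a near-certain miss. */
	if (smp_cond_load_relaxed(&node->locked, VAL || need_resched() ||
				  vcpu_is_preempted(node_cpu(node->prev))))
		return true;

	/* After: the previous CPU's number was copied into node->prev_cpu
	 * when this CPU queued (and is updated on unqueue), so the poll
	 * only touches cache lines this CPU already owns. */
	if (smp_cond_load_relaxed(&node->locked, VAL || need_resched() ||
				  vcpu_is_preempted(node->prev_cpu)))
		return true;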
Comments
On 12/29/23 15:58, David Laight wrote:
> The vcpu_is_preempted() test stops osq_lock() spinning if a virtual
> cpu is no longer running.
> Although patched out for bare-metal builds, the code still needs the cpu number.
> Reading this from 'prev->cpu' is pretty much guaranteed to cause a cache miss
> when osq_unlock() is waking up the next cpu.
>
> Instead, save 'prev->cpu' in 'node->prev_cpu' and use that value.
> Update it in the osq_lock() 'unqueue' path when 'node->prev' is changed.
>
> This is simpler than checking for 'node->prev' changing and caching
> 'prev->cpu'.
>
> Signed-off-by: David Laight <david.laight@aculab.com>
> [...]

Reviewed-by: Waiman Long <longman@redhat.com>
On 12/29/23 22:13, Waiman Long wrote:
>
> On 12/29/23 15:58, David Laight wrote:
>> [...]
>> @@ -148,7 +145,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>>   	 * polling, be careful.
>>   	 */
>>   	if (smp_cond_load_relaxed(&node->locked, VAL || need_resched() ||
>> -				  vcpu_is_preempted(node_cpu(node->prev))))
>> +				  vcpu_is_preempted(node->prev_cpu)))

On second thought, I believe it is more correct to use READ_ONCE() to
access "node->prev_cpu", as this field is subject to change by a
WRITE_ONCE().

Cheers,
Longman
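A minimal sketch of what Longman is suggesting here (no such follow-up patch appears in this thread):

	/* Sketch: pair the WRITE_ONCE() in the unqueue path with a
	 * READ_ONCE() at the polling site, so the compiler may not
	 * tear, fuse or cache the load of node->prev_cpu. */
	if (smp_cond_load_relaxed(&node->locked, VAL || need_resched() ||
				  vcpu_is_preempted(READ_ONCE(node->prev_cpu))))
		return true;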
From: Waiman Long
> Sent: 30 December 2023 15:57
>
> On second thought, I believe it is more correct to use READ_ONCE() to
> access "node->prev_cpu", as this field is subject to change by a
> WRITE_ONCE().

It can be done...
Aren't pretty much all the 'node' members accessed like that?
There is a sprinkling of READ_ONCE() and WRITE_ONCE(), but they are not
always used.

Maybe the structure member(s) should just be marked 'volatile' :-)
That should have exactly the same effect as the volatile cast inside
READ/WRITE_ONCE().
(I know there is a document about not using volatile...)

I've not actually checked whether the two existing WRITE_ONCE() in
'Step C' need to be ordered - and whether that is guaranteed by the
code, especially on our good old friend 'Alpha' (is that horrid cache
system still supported?).

	David
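For context on the 'volatile' remark: a simplified rendering of what the kernel's READ_ONCE()/WRITE_ONCE() boil down to (the real macros in include/asm-generic/rwonce.h add type checking and use __unqual_scalar_typeof()):

	/* Simplified: both macros reduce to accesses through a volatile-
	 * qualified pointer, which is why marking the member itself
	 * 'volatile' would have a similar effect on the compiler - but
	 * on every access, not just the ones singled out by the caller. */
	#define __READ_ONCE(x)	(*(const volatile typeof(x) *)&(x))
	#define __WRITE_ONCE(x, val)					\
	do {								\
		*(volatile typeof(x) *)&(x) = (val);			\
	} while (0)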
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index b60b0add0161..89be63627434 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -14,8 +14,9 @@
 
 struct optimistic_spin_node {
 	struct optimistic_spin_node *self, *next, *prev;
-	int locked; /* 1 if lock acquired */
-	int cpu; /* encoded CPU # + 1 value */
+	int locked;   /* 1 if lock acquired */
+	int cpu;      /* encoded CPU # + 1 value */
+	int prev_cpu; /* actual CPU # for vcpu_is_preempted() */
 };
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct optimistic_spin_node, osq_node);
@@ -29,11 +30,6 @@ static inline int encode_cpu(int cpu_nr)
 	return cpu_nr + 1;
 }
 
-static inline int node_cpu(struct optimistic_spin_node *node)
-{
-	return node->cpu - 1;
-}
-
 static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
 {
 	int cpu_nr = encoded_cpu_val - 1;
@@ -114,6 +110,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	if (old == OSQ_UNLOCKED_VAL)
 		return true;
 
+	node->prev_cpu = old - 1;
 	prev = decode_cpu(old);
 	node->prev = prev;
 	node->locked = 0;
@@ -148,7 +145,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 * polling, be careful.
 	 */
 	if (smp_cond_load_relaxed(&node->locked, VAL || need_resched() ||
-				  vcpu_is_preempted(node_cpu(node->prev))))
+				  vcpu_is_preempted(node->prev_cpu)))
 		return true;
 
 	/* unqueue */
@@ -205,6 +202,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 * it will wait in Step-A.
 	 */
 
+	WRITE_ONCE(next->prev_cpu, prev->cpu - 1);
 	WRITE_ONCE(next->prev, prev);
 	WRITE_ONCE(prev->next, next);