Message ID | 20230720163056.2564824-17-vschneid@redhat.com |
---|---|
State | New |
Headers | From: Valentin Schneider <vschneid@redhat.com> To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org, x86@kernel.org, rcu@vger.kernel.org, linux-kselftest@vger.kernel.org Cc: "Paul E . McKenney" <paulmck@kernel.org>, Steven Rostedt <rostedt@goodmis.org>, Masami Hiramatsu <mhiramat@kernel.org>, Jonathan Corbet <corbet@lwn.net>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, "H. Peter Anvin" <hpa@zytor.com>, Paolo Bonzini <pbonzini@redhat.com>, Wanpeng Li <wanpengli@tencent.com>, Vitaly Kuznetsov <vkuznets@redhat.com>, Andy Lutomirski <luto@kernel.org>, Peter Zijlstra <peterz@infradead.org>, Frederic Weisbecker <frederic@kernel.org>, Neeraj Upadhyay <quic_neeraju@quicinc.com>, Joel Fernandes <joel@joelfernandes.org>, Josh Triplett <josh@joshtriplett.org>, Boqun Feng <boqun.feng@gmail.com>, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, Lai Jiangshan <jiangshanlai@gmail.com>, Zqiang <qiang.zhang1211@gmail.com>, Andrew Morton <akpm@linux-foundation.org>, Uladzislau Rezki <urezki@gmail.com>, Christoph Hellwig <hch@infradead.org>, Lorenzo Stoakes <lstoakes@gmail.com>, Josh Poimboeuf <jpoimboe@kernel.org>, Jason Baron <jbaron@akamai.com>, Kees Cook <keescook@chromium.org>, Sami Tolvanen <samitolvanen@google.com>, Ard Biesheuvel <ardb@kernel.org>, Nicholas Piggin <npiggin@gmail.com>, Juerg Haefliger <juerg.haefliger@canonical.com>, Nicolas Saenz Julienne <nsaenz@kernel.org>, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>, Nadav Amit <namit@vmware.com>, Dan Carpenter <error27@gmail.com>, Chuang Wang <nashuiliang@gmail.com>, Yang Jihong <yangjihong1@huawei.com>, Petr Mladek <pmladek@suse.com>, "Jason A. Donenfeld" <Jason@zx2c4.com>, Song Liu <song@kernel.org>, Julian Pidancet <julian.pidancet@oracle.com>, Tom Lendacky <thomas.lendacky@amd.com>, Dionna Glaze <dionnaglaze@google.com>, Thomas Weißschuh <linux@weissschuh.net>, Juri Lelli <juri.lelli@redhat.com>, Daniel Bristot de Oliveira <bristot@redhat.com>, Marcelo Tosatti <mtosatti@redhat.com>, Yair Podemsky <ypodemsk@redhat.com> Subject: [RFC PATCH v2 16/20] rcu: Make RCU dynticks counter size configurable Date: Thu, 20 Jul 2023 17:30:52 +0100 Message-Id: <20230720163056.2564824-17-vschneid@redhat.com> In-Reply-To: <20230720163056.2564824-1-vschneid@redhat.com> References: <20230720163056.2564824-1-vschneid@redhat.com> |
Series | context_tracking,x86: Defer some IPIs until a user->kernel transition |
Commit Message
Valentin Schneider
July 20, 2023, 4:30 p.m. UTC
CONTEXT_TRACKING_WORK reduces the size of the dynticks counter to free up
some bits for work deferral. Paul suggested making the actual counter size
configurable for rcutorture to poke at, so do that.
Make it only configurable under RCU_EXPERT. Previous commits have added
build-time checks that ensure a kernel with problematic dynticks counter
width can't be built.
Link: http://lore.kernel.org/r/4c2cb573-168f-4806-b1d9-164e8276e66a@paulmck-laptop
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
include/linux/context_tracking.h | 3 ++-
include/linux/context_tracking_state.h | 3 +--
kernel/rcu/Kconfig | 33 ++++++++++++++++++++++++++
3 files changed, 36 insertions(+), 3 deletions(-)
Comments
On 20/07/23 17:30, Valentin Schneider wrote:
> index bdd7eadb33d8f..1ff2aab24e964 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -332,4 +332,37 @@ config RCU_DOUBLE_CHECK_CB_TIME
>  	  Say Y here if you need tighter callback-limit enforcement.
>  	  Say N here if you are unsure.
>
> +config RCU_DYNTICKS_RANGE_BEGIN
> +	int
> +	depends on !RCU_EXPERT
> +	default 31 if !CONTEXT_TRACKING_WORK

You'll note that this should be 30 really, because the lower *2* bits are
taken by the context state (CONTEXT_GUEST has a value of 3).

This highlights the fragile part of this: the Kconfig values are hardcoded,
but they depend on CT_STATE_SIZE, CONTEXT_MASK and CONTEXT_WORK_MAX. The
static_assert() will at least capture any misconfiguration, but having that
enforced by the actual Kconfig ranges would be less awkward.

Do we currently have a way of e.g. making a Kconfig file depend on and use
values generated by a C header?
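The "30 rather than 31" arithmetic above can be checked with a small standalone userspace sketch. It is not kernel code: CT_STATE_SIZE is assumed to be 32 (a 32-bit state word), CONTEXT_MAX is assumed to be 4 because CONTEXT_GUEST is 3, and bits_needed(), ct_state_bits() and max_dynticks_bits() are hypothetical helpers standing in for the kernel's bits_per() and the layout macros.

```c
#include <assert.h>

/* Assumption: the context tracking state word is 32 bits wide. */
#define CT_STATE_SIZE 32

/* Assumption: CONTEXT_GUEST == 3 is the highest state, so CONTEXT_MAX is 4. */
#define CONTEXT_MAX 4

/* Userspace stand-in for bits_per(): bits needed to represent the value n. */
static inline int bits_needed(unsigned int n)
{
	int bits = 0;

	do {
		bits++;
		n >>= 1;
	} while (n);

	return bits;
}

/* Bits taken by the context state at the bottom of the word. */
static inline int ct_state_bits(void)
{
	return bits_needed(CONTEXT_MAX - 1);
}

/*
 * Widest dynticks counter that still fits once the state bits and
 * work_bits deferred-work bits have been reserved below it.
 */
static inline int max_dynticks_bits(int work_bits)
{
	return CT_STATE_SIZE - ct_state_bits() - work_bits;
}
```

Under these assumptions max_dynticks_bits(0) evaluates to 30, and a 16-bit counter leaves 14 bits between the state and the counter for deferred work.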
On Fri, Jul 21, 2023 at 09:17:53AM +0100, Valentin Schneider wrote:
> On 20/07/23 17:30, Valentin Schneider wrote:
> > index bdd7eadb33d8f..1ff2aab24e964 100644
> > --- a/kernel/rcu/Kconfig
> > +++ b/kernel/rcu/Kconfig
> > @@ -332,4 +332,37 @@ config RCU_DOUBLE_CHECK_CB_TIME
> >  	  Say Y here if you need tighter callback-limit enforcement.
> >  	  Say N here if you are unsure.
> >
> > +config RCU_DYNTICKS_RANGE_BEGIN
> > +	int
> > +	depends on !RCU_EXPERT
> > +	default 31 if !CONTEXT_TRACKING_WORK
>
> You'll note that this should be 30 really, because the lower *2* bits are
> taken by the context state (CONTEXT_GUEST has a value of 3).
>
> This highlights the fragile part of this: the Kconfig values are hardcoded,
> but they depend on CT_STATE_SIZE, CONTEXT_MASK and CONTEXT_WORK_MAX. The
> static_assert() will at least capture any misconfiguration, but having that
> enforced by the actual Kconfig ranges would be less awkward.
>
> Do we currently have a way of e.g. making a Kconfig file depend on and use
> values generated by a C header?

Why not just have something like a boolean RCU_DYNTICKS_TORTURE Kconfig
option and let the C code work out what the number of bits should be?

I suppose that there might be a failure whose frequency depended on
the number of bits, which might be an argument for keeping something
like RCU_DYNTICKS_RANGE_BEGIN for fault isolation. But still using
RCU_DYNTICKS_TORTURE for normal testing.

Thoughts?

							Thanx, Paul
On 21/07/23 07:10, Paul E. McKenney wrote:
> On Fri, Jul 21, 2023 at 09:17:53AM +0100, Valentin Schneider wrote:
>> On 20/07/23 17:30, Valentin Schneider wrote:
>> > index bdd7eadb33d8f..1ff2aab24e964 100644
>> > --- a/kernel/rcu/Kconfig
>> > +++ b/kernel/rcu/Kconfig
>> > @@ -332,4 +332,37 @@ config RCU_DOUBLE_CHECK_CB_TIME
>> >  	  Say Y here if you need tighter callback-limit enforcement.
>> >  	  Say N here if you are unsure.
>> >
>> > +config RCU_DYNTICKS_RANGE_BEGIN
>> > +	int
>> > +	depends on !RCU_EXPERT
>> > +	default 31 if !CONTEXT_TRACKING_WORK
>>
>> You'll note that this should be 30 really, because the lower *2* bits are
>> taken by the context state (CONTEXT_GUEST has a value of 3).
>>
>> This highlights the fragile part of this: the Kconfig values are hardcoded,
>> but they depend on CT_STATE_SIZE, CONTEXT_MASK and CONTEXT_WORK_MAX. The
>> static_assert() will at least capture any misconfiguration, but having that
>> enforced by the actual Kconfig ranges would be less awkward.
>>
>> Do we currently have a way of e.g. making a Kconfig file depend on and use
>> values generated by a C header?
>
> Why not just have something like a boolean RCU_DYNTICKS_TORTURE Kconfig
> option and let the C code work out what the number of bits should be?
>
> I suppose that there might be a failure whose frequency depended on
> the number of bits, which might be an argument for keeping something
> like RCU_DYNTICKS_RANGE_BEGIN for fault isolation. But still using
> RCU_DYNTICKS_TORTURE for normal testing.
>
> Thoughts?
>

AFAICT if we run tests with the minimum possible width, then intermediate
values shouldn't have much value.

Your RCU_DYNTICKS_TORTURE suggestion sounds like a saner option than what I
came up with, as we can let the context tracking code figure out the widths
itself and not expose any of that to Kconfig.

>							Thanx, Paul
On Fri, Jul 21, 2023 at 04:08:10PM +0100, Valentin Schneider wrote:
> On 21/07/23 07:10, Paul E. McKenney wrote:
> > On Fri, Jul 21, 2023 at 09:17:53AM +0100, Valentin Schneider wrote:
> >> On 20/07/23 17:30, Valentin Schneider wrote:
> >> > index bdd7eadb33d8f..1ff2aab24e964 100644
> >> > --- a/kernel/rcu/Kconfig
> >> > +++ b/kernel/rcu/Kconfig
> >> > @@ -332,4 +332,37 @@ config RCU_DOUBLE_CHECK_CB_TIME
> >> >  	  Say Y here if you need tighter callback-limit enforcement.
> >> >  	  Say N here if you are unsure.
> >> >
> >> > +config RCU_DYNTICKS_RANGE_BEGIN
> >> > +	int
> >> > +	depends on !RCU_EXPERT
> >> > +	default 31 if !CONTEXT_TRACKING_WORK
> >>
> >> You'll note that this should be 30 really, because the lower *2* bits are
> >> taken by the context state (CONTEXT_GUEST has a value of 3).
> >>
> >> This highlights the fragile part of this: the Kconfig values are hardcoded,
> >> but they depend on CT_STATE_SIZE, CONTEXT_MASK and CONTEXT_WORK_MAX. The
> >> static_assert() will at least capture any misconfiguration, but having that
> >> enforced by the actual Kconfig ranges would be less awkward.
> >>
> >> Do we currently have a way of e.g. making a Kconfig file depend on and use
> >> values generated by a C header?
> >
> > Why not just have something like a boolean RCU_DYNTICKS_TORTURE Kconfig
> > option and let the C code work out what the number of bits should be?
> >
> > I suppose that there might be a failure whose frequency depended on
> > the number of bits, which might be an argument for keeping something
> > like RCU_DYNTICKS_RANGE_BEGIN for fault isolation. But still using
> > RCU_DYNTICKS_TORTURE for normal testing.
> >
> > Thoughts?
> >
>
> AFAICT if we run tests with the minimum possible width, then intermediate
> values shouldn't have much value.
>
> Your RCU_DYNTICKS_TORTURE suggestion sounds like a saner option than what I
> came up with, as we can let the context tracking code figure out the widths
> itself and not expose any of that to Kconfig.

Agreed. If a need for variable numbers of bits ever does arise, we can
worry about it at that time. And then we would have more information on
what a variable-bit facility should look like.

							Thanx, Paul
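As a rough illustration of the direction agreed on above, the width could be derived in C from a single boolean rather than from hardcoded Kconfig integers. The sketch below is purely hypothetical and is not taken from any posted patch: the torture flag stands in for a possible CONFIG_RCU_DYNTICKS_TORTURE, DYNTICKS_MIN_BITS reuses the value 2 from this patch's RCU_EXPERT range minimum, and CT_STATE_SIZE and CT_STATE_BITS are assumed values.

```c
#include <assert.h>
#include <stdbool.h>

#define CT_STATE_SIZE     32 /* assumed width of the state word */
#define CT_STATE_BITS      2 /* CONTEXT_GUEST == 3 needs two bits */
#define DYNTICKS_MIN_BITS  2 /* hypothetical minimum counter width */

/*
 * Hypothetical helper: choose the dynticks counter width. With the
 * torture knob set, use the minimum width so that overflows are as
 * frequent as possible; otherwise use every bit left above the state
 * bits and the work_bits deferred-work bits.
 */
static inline int dynticks_bits(bool torture, int work_bits)
{
	if (torture)
		return DYNTICKS_MIN_BITS;

	return CT_STATE_SIZE - CT_STATE_BITS - work_bits;
}

/* Position of the counter's lowest bit, mirroring RCU_DYNTICKS_START. */
static inline int dynticks_start(bool torture, int work_bits)
{
	return CT_STATE_SIZE - dynticks_bits(torture, work_bits);
}
```

With this shape nothing about the bit layout leaks into Kconfig, which is the property both reviewers asked for; whether the real code would pick the same minimum is an open question.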
diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 8aee086d0a25f..9c0c622bc27bb 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -12,7 +12,8 @@
 
 #ifdef CONFIG_CONTEXT_TRACKING_WORK
 static_assert(CONTEXT_WORK_MAX_OFFSET <= CONTEXT_WORK_END + 1 - CONTEXT_WORK_START,
-	      "Not enough bits for CONTEXT_WORK");
+	      "Not enough bits for CONTEXT_WORK, "
+	      "CONFIG_RCU_DYNTICKS_BITS might be too high");
 #endif
 
 #ifdef CONFIG_CONTEXT_TRACKING_USER
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 828fcdb801f73..292a0b7c06948 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -58,8 +58,7 @@ enum ctx_state {
 #define CONTEXT_STATE_START 0
 #define CONTEXT_STATE_END (bits_per(CONTEXT_MAX - 1) - 1)
 
-#define RCU_DYNTICKS_BITS (IS_ENABLED(CONFIG_CONTEXT_TRACKING_WORK) ? 16 : 31)
-#define RCU_DYNTICKS_START (CT_STATE_SIZE - RCU_DYNTICKS_BITS)
+#define RCU_DYNTICKS_START (CT_STATE_SIZE - CONFIG_RCU_DYNTICKS_BITS)
 #define RCU_DYNTICKS_END (CT_STATE_SIZE - 1)
 #define RCU_DYNTICKS_IDX BIT(RCU_DYNTICKS_START)
 
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index bdd7eadb33d8f..1ff2aab24e964 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -332,4 +332,37 @@ config RCU_DOUBLE_CHECK_CB_TIME
 	  Say Y here if you need tighter callback-limit enforcement.
 	  Say N here if you are unsure.
 
+config RCU_DYNTICKS_RANGE_BEGIN
+	int
+	depends on !RCU_EXPERT
+	default 31 if !CONTEXT_TRACKING_WORK
+	default 16 if CONTEXT_TRACKING_WORK
+
+config RCU_DYNTICKS_RANGE_BEGIN
+	int
+	depends on RCU_EXPERT
+	default 2
+
+config RCU_DYNTICKS_RANGE_END
+	int
+	default 31 if !CONTEXT_TRACKING_WORK
+	default 16 if CONTEXT_TRACKING_WORK
+
+config RCU_DYNTICKS_BITS_DEFAULT
+	int
+	default 31 if !CONTEXT_TRACKING_WORK
+	default 16 if CONTEXT_TRACKING_WORK
+
+config RCU_DYNTICKS_BITS
+	int "Dynticks counter width" if CONTEXT_TRACKING_WORK
+	range RCU_DYNTICKS_RANGE_BEGIN RCU_DYNTICKS_RANGE_END
+	default RCU_DYNTICKS_BITS_DEFAULT
+	help
+	  This option controls the width of the dynticks counter.
+
+	  Lower values will make overflows more frequent, which will increase
+	  the likelihood of extending grace-periods.
+
+	  Don't touch this unless you are running some tests.
+
 endmenu # "RCU Subsystem"
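The help text's remark that lower values make overflows more frequent can be illustrated with a standalone userspace sketch. It assumes a 32-bit state word with the counter in the top bits, as RCU_DYNTICKS_START and RCU_DYNTICKS_END suggest; the helper names are hypothetical, and the snapshot comparison is a simplified model of how a changed counter is noticed, not the kernel's actual RCU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CT_STATE_SIZE 32 /* assumed width of the state word */

/* Value of the counter's lowest bit; bits must be in the range 1..32. */
static inline uint32_t dynticks_idx(int bits)
{
	return UINT32_C(1) << (CT_STATE_SIZE - bits);
}

/*
 * Advance the counter n times. The counter sits in the top bits, so
 * unsigned overflow off the end of the word is the wraparound.
 */
static inline uint32_t dynticks_advance(uint32_t state, int bits, uint32_t n)
{
	return state + n * dynticks_idx(bits);
}

/* Extract the counter field from the state word. */
static inline uint32_t dynticks_read(uint32_t state, int bits)
{
	return state >> (CT_STATE_SIZE - bits);
}

/* Simplified check: did the counter move between snapshot and now? */
static inline bool dynticks_changed(uint32_t snap, uint32_t now, int bits)
{
	return dynticks_read(snap, bits) != dynticks_read(now, bits);
}
```

In this model a 2-bit counter returns to its snapshot value after only 4 increments, so an observer comparing snapshots would see no change and keep waiting, while a 16-bit counter needs 65536 increments to do the same.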