Message ID | 20221019200137.70343645@gandalf.local.home
State | New
Date | Wed, 19 Oct 2022 20:01:37 -0400
From | Steven Rostedt <rostedt@goodmis.org>
To | LKML <linux-kernel@vger.kernel.org>
Cc | Linux Trace Kernel <linux-trace-kernel@vger.kernel.org>, Masami Hiramatsu <mhiramat@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, "Paul E. McKenney" <paulmck@kernel.org>, Joel Fernandes <joel@joelfernandes.org>, Tom Zanussi <zanussi@kernel.org>
Subject | [PATCH] tracing: Add trace_trigger kernel command line option
Series | tracing: Add trace_trigger kernel command line option
Commit Message
Steven Rostedt
Oct. 20, 2022, 12:01 a.m. UTC
From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Allow triggers to be enabled at kernel boot up. For example:

  trace_trigger="sched_switch.stacktrace if prev_state == 2"

The above will enable the stacktrace trigger on top of the sched_switch event, and only trigger if its prev_state is 2 (TASK_UNINTERRUPTIBLE). Then, at boot up, a stacktrace will be triggered and recorded in the tracing ring buffer every time a sched_switch happens where the previous state is TASK_UNINTERRUPTIBLE.

As this calls into tracepoint logic during very early boot (before interrupts are enabled), a check has to be done to see whether early boot interrupts are still disabled, and if so, any call to RCU synchronization is avoided, as that would enable interrupts and cause boot-up issues.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 .../admin-guide/kernel-parameters.txt |   19 ++++++
 include/linux/tracepoint.h            |    4 ++
 kernel/trace/trace.c                  |    3 +-
 kernel/trace/trace_events.c           |   63 ++++++++++++++++++-
 kernel/tracepoint.c                   |    6 ++
 5 files changed, 92 insertions(+), 3 deletions(-)
Comments
On Wed, 19 Oct 2022 20:01:37 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> @@ -90,6 +90,10 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb)
>  #ifdef CONFIG_TRACEPOINTS
>  static inline void tracepoint_synchronize_unregister(void)
>  {
> +	/* Early updates do not need synchronization */
> +	if (early_boot_irqs_disabled)
> +		return;
> +
>  	synchronize_srcu(&tracepoint_srcu);
>  	synchronize_rcu();

I wonder if this check should just be moved to the RCU synchronization code? That is, if early_boot_irqs_disabled is set, do nothing, as there should be nothing to synchronize against.

It took me a bit to figure out why I was constantly getting:

[    1.072013] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    1.078888] ------------[ cut here ]------------
[    1.083422] Interrupts were enabled early
[    1.087399] WARNING: CPU: 0 PID: 0 at init/main.c:1061 start_kernel+0x36f/0x4e8
[    1.094656] Modules linked in:
[    1.097683] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rc1-test+ #461
[    1.104509] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
[    1.113411] RIP: 0010:start_kernel+0x36f/0x4e8
[    1.117817] Code: 48 27 5f e8 fd 7b 03 00 e8 55 72 e5 fe e8 6b 05 03 00 ff 15 2a f9 89 ff 0f ba e0 09 73 0e 48 c7 c7 98 26 31 a0 e8 01 5b df fe <0f> 0b c6 05 c7 2c af ff 00 e8 da 36 47 fe ff 15 14 f9 89 ff e8 56
[    1.136485] RSP: 0000:ffffffffa0603f30 EFLAGS: 00010286
[    1.141670] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    1.148754] RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: 0000000000000001
[    1.155840] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffffa0603dd8
[    1.162926] R10: 0000000000000003 R11: ffffffffa0734168 R12: 000000000000180a
[    1.170011] R13: 00000000010fecf0 R14: 0000000000000000 R15: 00000000d44e3018
[    1.177097] FS:  0000000000000000(0000) GS:ffff91909aa00000(0000) knlGS:0000000000000000
[    1.185131] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.190835] CR2: ffff91909edff000 CR3: 000000007760a001 CR4: 00000000000606f0
[    1.197924] Call Trace:
[    1.200345]  <TASK>

It was a bit of whack-a-mole in finding all the places that were calling synchronize RCU.

-- Steve
Hi Steven,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.1-rc1 next-20221020]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Steven-Rostedt/tracing-Add-trace_trigger-kernel-command-line-option/20221020-102230
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20221019200137.70343645%40gandalf.local.home
patch subject: [PATCH] tracing: Add trace_trigger kernel command line option
config: i386-randconfig-a001
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/476ab107174fa2ac06ef7171b093235382008c77
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Steven-Rostedt/tracing-Add-trace_trigger-kernel-command-line-option/20221020-102230
        git checkout 476ab107174fa2ac06ef7171b093235382008c77
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash kernel/trace/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   kernel/trace/trace_events.c: In function '__trace_early_add_new_event':
>> kernel/trace/trace_events.c:2864:13: warning: unused variable 'i' [-Wunused-variable]
    2864 |         int i;
         |             ^

vim +/i +2864 kernel/trace/trace_events.c

  2852	
  2853	/*
  2854	 * Just create a descriptor for early init. A descriptor is required
  2855	 * for enabling events at boot. We want to enable events before
  2856	 * the filesystem is initialized.
  2857	 */
  2858	static int
  2859	__trace_early_add_new_event(struct trace_event_call *call,
  2860				    struct trace_array *tr)
  2861	{
  2862		struct trace_event_file *file;
  2863		int ret;
> 2864		int i;
  2865	
  2866		file = trace_create_new_event(call, tr);
  2867		if (!file)
  2868			return -ENOMEM;
  2869	
  2870		ret = event_define_fields(call);
  2871		if (ret)
  2872			return ret;
  2873	
Hi Steven,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on linus/master v6.1-rc1 next-20221020]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Steven-Rostedt/tracing-Add-trace_trigger-kernel-command-line-option/20221020-102230
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20221019200137.70343645%40gandalf.local.home
patch subject: [PATCH] tracing: Add trace_trigger kernel command line option
config: s390-allmodconfig
compiler: s390-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/476ab107174fa2ac06ef7171b093235382008c77
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Steven-Rostedt/tracing-Add-trace_trigger-kernel-command-line-option/20221020-102230
        git checkout 476ab107174fa2ac06ef7171b093235382008c77
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "early_boot_irqs_disabled" [mm/kasan/kasan_test.ko] undefined!
>> ERROR: modpost: "early_boot_irqs_disabled" [mm/kfence/kfence_test.ko] undefined!
   ERROR: modpost: "devm_ioremap_resource" [drivers/dma/qcom/hdma.ko] undefined!
   ERROR: modpost: "devm_ioremap_resource" [drivers/dma/fsl-edma.ko] undefined!
   ERROR: modpost: "devm_ioremap_resource" [drivers/dma/idma64.ko] undefined!
   ERROR: modpost: "iounmap" [drivers/tty/ipwireless/ipwireless.ko] undefined!
   ERROR: modpost: "ioremap" [drivers/tty/ipwireless/ipwireless.ko] undefined!
   ERROR: modpost: "devm_platform_ioremap_resource" [drivers/char/xillybus/xillybus_of.ko] undefined!
   ERROR: modpost: "devm_memremap" [drivers/misc/open-dice.ko] undefined!
   ERROR: modpost: "devm_memunmap" [drivers/misc/open-dice.ko] undefined!
   WARNING: modpost: suppressed 20 unresolved symbol warnings because there were too many
On Wed, 19 Oct 2022 20:01:37 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
>
> Allow triggers to be enabled at kernel boot up. For example:
>
> trace_trigger="sched_switch.stacktrace if prev_state == 2"
>
> The above will enable the stacktrace trigger on top of the sched_switch
> event and only trigger if its prev_state is 2 (TASK_UNINTERRUPTIBLE). Then
> at boot up, a stacktrace will trigger and be recorded in the tracing ring
> buffer every time the sched_switch happens where the previous state is
> TASK_INTERRUPTIBLE.
>
> As this calls into tracepoint logic during very early boot (before
> interrupts are enabled), a check has to be done to see if early boot
> interrupts are still disabled, and if so, avoid any call to RCU
> synchronization, as that will enable interrupts and cause boot up issues.

Just out of curiosity, can you do this with the boot-time tracer?
(Or is that too late for your issue?)

$ cat >> stacktrace.bconf
ftrace.event.sched.sched_switch.actions = "stacktrace if prev_state == 2"
^D
$ bootconfig -a stacktrace.bconf initrd.img

And boot the kernel with "bootconfig". Then,

----------
     init-1       [000] d..3.     0.546668: <stack trace>
 => trace_event_raw_event_sched_switch
 => __traceiter_sched_switch
 => __schedule
 => schedule
 => schedule_timeout
 => wait_for_completion_killable
 => __kthread_create_on_node
 => kthread_create_on_node
 => audit_init
 => do_one_initcall
 => kernel_init_freeable
 => kernel_init
 => ret_from_fork
  kauditd-57      [007] d..3.     0.546677: <stack trace>
 => trace_event_raw_event_sched_switch
 => __traceiter_sched_switch
 => __schedule
 => schedule
 => schedule_preempt_disabled
 => kthread
 => ret_from_fork
----------

Thank you,

>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
>  .../admin-guide/kernel-parameters.txt |   19 ++++++
>  include/linux/tracepoint.h            |    4 ++
>  kernel/trace/trace.c                  |    3 +-
>  kernel/trace/trace_events.c           |   63 ++++++++++++++++++-
>  kernel/tracepoint.c                   |    6 ++
>  5 files changed, 92 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index a465d5242774..ccf91a4bf113 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6257,6 +6257,25 @@
>  			See also Documentation/trace/ftrace.rst "trace options"
>  			section.
>
> +	trace_trigger=[trigger-list]
> +			[FTRACE] Add a event trigger on specific events.
> +			Set a trigger on top of a specific event, with an optional
> +			filter.
> +
> +			The format is is "trace_trigger=<event>.<trigger>[ if <filter>],..."
> +			Where more than one trigger may be specified that are comma deliminated.
> +
> +			For example:
> +
> +			trace_trigger="sched_switch.stacktrace if prev_state == 2"
> +
> +			The above will enable the "stacktrace" trigger on the "sched_switch"
> +			event but only trigger it if the "prev_state" of the "sched_switch"
> +			event is "2" (TASK_UNINTERUPTIBLE).
> +
> +			See also "Event triggers" in Documentation/trace/events.rst
> +
> +
>  	traceoff_on_warning
>  			[FTRACE] enable this option to disable tracing when a
>  			warning is hit. This turns off "tracing_on". Tracing can
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index 4b33b95eb8be..a5c6b5772897 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -90,6 +90,10 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb)
>  #ifdef CONFIG_TRACEPOINTS
>  static inline void tracepoint_synchronize_unregister(void)
>  {
> +	/* Early updates do not need synchronization */
> +	if (early_boot_irqs_disabled)
> +		return;
> +
>  	synchronize_srcu(&tracepoint_srcu);
>  	synchronize_rcu();
>  }
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 47a44b055a1d..c03fd7037add 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -2749,7 +2749,8 @@ void trace_buffered_event_disable(void)
>  	preempt_enable();
>
>  	/* Wait for all current users to finish */
> -	synchronize_rcu();
> +	if (!early_boot_irqs_disabled)
> +		synchronize_rcu();
>
>  	for_each_tracing_cpu(cpu) {
>  		free_page((unsigned long)per_cpu(trace_buffered_event, cpu));
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 0356cae0cf74..06554939252c 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -2796,6 +2796,44 @@ trace_create_new_event(struct trace_event_call *call,
>  	return file;
>  }
>
> +#ifdef CONFIG_HIST_TRIGGERS
> +#define MAX_BOOT_TRIGGERS 32
> +
> +static struct boot_triggers {
> +	const char	*event;
> +	char		*trigger;
> +} bootup_triggers[MAX_BOOT_TRIGGERS];
> +
> +static char bootup_trigger_buf[COMMAND_LINE_SIZE];
> +static int nr_boot_triggers;
> +
> +static __init int setup_trace_triggers(char *str)
> +{
> +	char *trigger;
> +	char *buf;
> +	int i;
> +
> +	strlcpy(bootup_trigger_buf, str, COMMAND_LINE_SIZE);
> +	ring_buffer_expanded = true;
> +	disable_tracing_selftest("running event triggers");
> +
> +	buf = bootup_trigger_buf;
> +	for (i = 0; i < MAX_BOOT_TRIGGERS; i++) {
> +		trigger = strsep(&buf, ",");
> +		if (!trigger)
> +			break;
> +		bootup_triggers[i].event = strsep(&trigger, ".");
> +		bootup_triggers[i].trigger = strsep(&trigger, ".");
> +		if (!bootup_triggers[i].trigger)
> +			break;
> +	}
> +
> +	nr_boot_triggers = i;
> +	return 1;
> +}
> +__setup("trace_trigger=", setup_trace_triggers);
> +#endif
> +
>  /* Add an event to a trace directory */
>  static int
>  __trace_add_new_event(struct trace_event_call *call, struct trace_array *tr)
> @@ -2822,12 +2860,32 @@ __trace_early_add_new_event(struct trace_event_call *call,
>  			    struct trace_array *tr)
>  {
>  	struct trace_event_file *file;
> +	int ret;
> +	int i;
>
>  	file = trace_create_new_event(call, tr);
>  	if (!file)
>  		return -ENOMEM;
>
> -	return event_define_fields(call);
> +	ret = event_define_fields(call);
> +	if (ret)
> +		return ret;
> +
> +#ifdef CONFIG_HIST_TRIGGERS
> +	for (i = 0; i < nr_boot_triggers; i++) {
> +		if (strcmp(trace_event_name(call), bootup_triggers[i].event))
> +			continue;
> +		mutex_lock(&event_mutex);
> +		ret = trigger_process_regex(file, bootup_triggers[i].trigger);
> +		mutex_unlock(&event_mutex);
> +		if (ret)
> +			pr_err("Failed to register trigger '%s' on event %s\n",
> +			       bootup_triggers[i].trigger,
> +			       bootup_triggers[i].event);
> +	}
> +#endif
> +
> +	return 0;
>  }
>
>  struct ftrace_module_file_ops;
> @@ -3726,6 +3784,8 @@ static __init int event_trace_enable(void)
>  		list_add(&call->list, &ftrace_events);
>  	}
>
> +	register_trigger_cmds();
> +
>  	/*
>  	 * We need the top trace array to have a working set of trace
>  	 * points at early init, before the debug files and directories
> @@ -3740,7 +3800,6 @@ static __init int event_trace_enable(void)
>
>  	register_event_cmds();
>
> -	register_trigger_cmds();
>
>  	return 0;
>  }
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index f23144af5743..f6e4ee1e40b3 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -48,6 +48,9 @@ static void tp_rcu_get_state(enum tp_transition_sync sync)
>  {
>  	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
>
> +	if (early_boot_irqs_disabled)
> +		return;
> +
>  	/* Keep the latest get_state snapshot. */
>  	snapshot->rcu = get_state_synchronize_rcu();
>  	snapshot->srcu = start_poll_synchronize_srcu(&tracepoint_srcu);
> @@ -58,6 +61,9 @@ static void tp_rcu_cond_sync(enum tp_transition_sync sync)
>  {
>  	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
>
> +	if (early_boot_irqs_disabled)
> +		return;
> +
>  	if (!snapshot->ongoing)
>  		return;
>  	cond_synchronize_rcu(snapshot->rcu);
> --
> 2.35.1
>
Hi Steven,

On Wed, Oct 19, 2022 at 8:01 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
>
> Allow triggers to be enabled at kernel boot up. For example:
>
> trace_trigger="sched_switch.stacktrace if prev_state == 2"
>
> The above will enable the stacktrace trigger on top of the sched_switch
> event and only trigger if its prev_state is 2 (TASK_UNINTERRUPTIBLE). Then
> at boot up, a stacktrace will trigger and be recorded in the tracing ring
> buffer every time the sched_switch happens where the previous state is
> TASK_INTERRUPTIBLE.
>
> As this calls into tracepoint logic during very early boot (before
> interrupts are enabled), a check has to be done to see if early boot
> interrupts are still disabled, and if so, avoid any call to RCU
> synchronization, as that will enable interrupts and cause boot up issues.
>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---

rest_init() -> rcu_scheduler_starting() sets RCU_SCHEDULER_INIT

start_kernel() disables interrupts and does the dance you mentioned, while setting that early_ flag. start_kernel() is called from arch code.

From your crash, the trace trigger is happening before the early_ boot flag is set to true. rcu_blocking_is_gp() should return true at the point your trigger fires, because RCU_SCHEDULER_INACTIVE is still set. So I would expect your synchronize_rcu() to not do anything. There are comments in rcu_blocking_is_gp() like so:

 * During early boot, any blocking grace-period wait automatically
 * implies a grace period.

So it should be returning straight from there. I wonder where/why it's re-enabling interrupts for you. What does rcu_blocking_is_gp() return, if you can print that, at the time of the crash?
Cheers,

 - J

> [ remainder of quoted patch trimmed; same diff as quoted in full above ]
On Thu, Oct 20, 2022 at 10:43 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> Hi Steven,
>
> On Wed, Oct 19, 2022 at 8:01 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> >
> > Allow triggers to be enabled at kernel boot up. For example:
> >
> > trace_trigger="sched_switch.stacktrace if prev_state == 2"
> >
> > The above will enable the stacktrace trigger on top of the sched_switch
> > event and only trigger if its prev_state is 2 (TASK_UNINTERRUPTIBLE). Then
> > at boot up, a stacktrace will trigger and be recorded in the tracing ring
> > buffer every time the sched_switch happens where the previous state is
> > TASK_INTERRUPTIBLE.
> >
> > As this calls into tracepoint logic during very early boot (before
> > interrupts are enabled), a check has to be done to see if early boot
> > interrupts are still disabled, and if so, avoid any call to RCU
> > synchronization, as that will enable interrupts and cause boot up issues.
> >
> > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> > ---
>
> rest_init() -> rcu_scheduler_starting() sets RCU_SCHEDULER_INIT
>
> start_kernel() disables interrupts and does the dance you mentioned,
> while setting that early_ flag. start_kernel() is called from arch
> code.
>
> From your crash, the trace trigger is happening before the early_ boot
> flag is set to true.

My apologies. I meant: before early_boot_irqs_disabled is set to false.
On Thu, 20 Oct 2022 23:33:18 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> On Wed, 19 Oct 2022 20:01:37 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> >
> > Allow triggers to be enabled at kernel boot up. For example:
> >
> > trace_trigger="sched_switch.stacktrace if prev_state == 2"
> >
> > The above will enable the stacktrace trigger on top of the sched_switch
> > event and only trigger if its prev_state is 2 (TASK_UNINTERRUPTIBLE). Then
> > at boot up, a stacktrace will trigger and be recorded in the tracing ring
> > buffer every time the sched_switch happens where the previous state is
> > TASK_INTERRUPTIBLE.
> >
> > As this calls into tracepoint logic during very early boot (before
> > interrupts are enabled), a check has to be done to see if early boot
> > interrupts are still disabled, and if so, avoid any call to RCU
> > synchronization, as that will enable interrupts and cause boot up issues.
>
> Just out of curiosity, can you do this with the boot-time tracer?
> (Or is that too late for your issue?)

Yeah, I'm looking at adding triggers very early. I could even add a "traceoff" trigger too here. I'm guessing that bootconfig is done later in boot up? Especially since I needed to fix RCU calls to get this working.

> $ cat >> stacktrace.bconf
> ftrace.event.sched.sched_switch.actions = "stacktrace if prev_state == 2"
> ^D
> $ bootconfig -a stacktrace.bconf initrd.img

Not to mention, I'm doing this for Chromebooks, where it's easy to update the command line (on dev devices) but not as easy to modify the kernel.

-- Steve
On Wed, Oct 19, 2022 at 08:07:45PM -0400, Steven Rostedt wrote:
> On Wed, 19 Oct 2022 20:01:37 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > @@ -90,6 +90,10 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb)
> > #ifdef CONFIG_TRACEPOINTS
> > static inline void tracepoint_synchronize_unregister(void)
> > {
> > +	/* Early updates do not need synchronization */
> > +	if (early_boot_irqs_disabled)
> > +		return;
> > +
> > 	synchronize_srcu(&tracepoint_srcu);
> > 	synchronize_rcu();
>
> I wonder if this check should be just moved to the RCU synchronization
> code? That is, if early_boot_irqs_disabled is set, do nothing, as there
> should be nothing to synchronize against.

There already is a similar check, but it follows the lockdep check.

Does the following patch help?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index ca4b5dcec675b..356ef70d5442c 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1267,14 +1267,15 @@ static void __synchronize_srcu(struct srcu_struct *ssp, bool do_norm)
 {
 	struct rcu_synchronize rcu;
 
+	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
+		return;
+
 	RCU_LOCKDEP_WARN(lockdep_is_held(ssp) ||
 			 lock_is_held(&rcu_bh_lock_map) ||
 			 lock_is_held(&rcu_lock_map) ||
 			 lock_is_held(&rcu_sched_lock_map),
 			 "Illegal synchronize_srcu() in same-type SRCU (or in RCU) read-side critical section");
 
-	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
-		return;
 	might_sleep();
 	check_init_srcu_struct(ssp);
 	init_completion(&rcu.completion);
On Thu, 20 Oct 2022 08:59:00 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> On Wed, Oct 19, 2022 at 08:07:45PM -0400, Steven Rostedt wrote:
> > On Wed, 19 Oct 2022 20:01:37 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > @@ -90,6 +90,10 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb)
> > >  #ifdef CONFIG_TRACEPOINTS
> > >  static inline void tracepoint_synchronize_unregister(void)
> > >  {
> > > +	/* Early updates do not need synchronization */
> > > +	if (early_boot_irqs_disabled)
> > > +		return;
> > > +
> > >  	synchronize_srcu(&tracepoint_srcu);
> > >  	synchronize_rcu();
> > 
> > I wonder if this check should be just moved to the RCU synchronization
> > code? That is, if early_boot_irqs_disabled is set, do nothing, as there
> > should be nothing to synchronize against.
> 
> There already is a similar check, but it follows the lockdep check.
> 
> Does the following patch help?

Not sure if it would.

I added this:

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index c03fd7037add..79ac31a6a87b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2748,9 +2748,12 @@ void trace_buffered_event_disable(void)
 			 disable_trace_buffered_event, NULL, 1);
 	preempt_enable();
 
+	if (!irqs_disabled())
+		printk("IRQS DISABLED!! before %s:%d\n", __func__, __LINE__);
 	/* Wait for all current users to finish */
-	if (!early_boot_irqs_disabled)
-		synchronize_rcu();
+	synchronize_rcu();
+	if (!irqs_disabled())
+		printk("IRQS DISABLED!! after %s:%d\n", __func__, __LINE__);
 
 	for_each_tracing_cpu(cpu) {
 		free_page((unsigned long)per_cpu(trace_buffered_event, cpu));

Which produced this:

[    0.972867] ftrace: allocating 47021 entries in 184 pages
[    0.978285] ftrace section at ffffffffacef74c0 sorted properly
[    0.991153] ftrace: allocated 184 pages with 4 groups
[    0.996163] Dynamic Preempt: full
[    0.999354] rcu: Preemptible hierarchical RCU implementation.
[    1.005020] rcu: RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=8.
[    1.011680] Tasks-RCU CPU stall warnings timeout set to 120000 (rcu_task_stall_timeout).
[    1.019813] Trampoline variant of Tasks RCU enabled.
[    1.024831] Rude variant of Tasks RCU enabled.
[    1.029330] Tracing variant of Tasks RCU enabled.
[    1.034089] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
[    1.041699] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[    1.052607] IRQS DISABLED!! after trace_buffered_event_disable:2756
[    1.058712] NR_IRQS: 8448, nr_irqs: 488, preallocated irqs: 16
[    1.064670] NO_HZ: Full dynticks CPUs: 3.
[    1.068501] rcu: Offload RCU callbacks from CPUs: 3.
[    1.073519] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    1.080400] ------------[ cut here ]------------
[    1.084938] Interrupts were enabled early
[    1.088919] WARNING: CPU: 0 PID: 0 at init/main.c:1061 start_kernel+0x36f/0x4e8
[    1.096184] Modules linked in:
[    1.099213] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rc1-test+ #468
[    1.106047] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
[    1.114960] RIP: 0010:start_kernel+0x36f/0x4e8
[    1.119373] Code: 48 27 53 e8 fd 7b 03 00 e8 55 72 e5 fe e8 6b 05 03 00 ff 15 2a f9 89 ff 0f ba e0 09 73 0e 48 c7 c7 98 25 31 ac e8 01 5b df fe <0f> 0b c6 05 c7 2c af ff 00 e8 ba 36 47 fe ff 15 14 f9 89 ff e8 56
[    1.138057] RSP: 0000:ffffffffac603f30 EFLAGS: 00010286
[    1.143248] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    1.150343] RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: 0000000000000001
[    1.157438] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffffac603dd8
[    1.164537] R10: 0000000000000003 R11: ffffffffac734168 R12: 000000000000180a
[    1.171633] R13: 00000000010fecf0 R14: 0000000000000000 R15: 00000000d44e3018
[    1.178730] FS:  0000000000000000(0000) GS:ffff93ee1aa00000(0000) knlGS:0000000000000000
[    1.186778] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.192489] CR2: ffff93ee1edff000 CR3: 000000002e60a001 CR4: 00000000000606f0
[    1.199584] Call Trace:
[    1.202009]  <TASK>

So I'm not sure how just updating srcu will do anything to prevent this.

-- Steve
> On Oct 20, 2022, at 12:33 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> On Thu, 20 Oct 2022 08:59:00 -0700
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
>>> On Wed, Oct 19, 2022 at 08:07:45PM -0400, Steven Rostedt wrote:
>>> On Wed, 19 Oct 2022 20:01:37 -0400
>>> Steven Rostedt <rostedt@goodmis.org> wrote:
>>> 
>>>> @@ -90,6 +90,10 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb)
>>>> #ifdef CONFIG_TRACEPOINTS
>>>> static inline void tracepoint_synchronize_unregister(void)
>>>> {
>>>> +	/* Early updates do not need synchronization */
>>>> +	if (early_boot_irqs_disabled)
>>>> +		return;
>>>> +
>>>> 	synchronize_srcu(&tracepoint_srcu);
>>>> 	synchronize_rcu();
>>> 
>>> I wonder if this check should be just moved to the RCU synchronization
>>> code? That is, if early_boot_irqs_disabled is set, do nothing, as there
>>> should be nothing to synchronize against.
>> 
>> There already is a similar check, but it follows the lockdep check.
>> 
>> Does the following patch help?
> 
> Not sure if it would.
> 
> I added this:

I'm not working today, so I can't look very closely. But could you remove
the might_sleep() from rcu_gp_is_blocking() and see if it fixes your issue?
I think that function call might be incompatible with early boot.

Just a hunch.

- Joel

> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index c03fd7037add..79ac31a6a87b 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -2748,9 +2748,12 @@ void trace_buffered_event_disable(void)
> 			 disable_trace_buffered_event, NULL, 1);
> 	preempt_enable();
> 
> +	if (!irqs_disabled())
> +		printk("IRQS DISABLED!! before %s:%d\n", __func__, __LINE__);
> 	/* Wait for all current users to finish */
> -	if (!early_boot_irqs_disabled)
> -		synchronize_rcu();
> +	synchronize_rcu();
> +	if (!irqs_disabled())
> +		printk("IRQS DISABLED!! after %s:%d\n", __func__, __LINE__);
> 
> 	for_each_tracing_cpu(cpu) {
> 		free_page((unsigned long)per_cpu(trace_buffered_event, cpu));
> 
> Which produced this:
> 
> [    0.972867] ftrace: allocating 47021 entries in 184 pages
> [    0.978285] ftrace section at ffffffffacef74c0 sorted properly
> [    0.991153] ftrace: allocated 184 pages with 4 groups
> [    0.996163] Dynamic Preempt: full
> [    0.999354] rcu: Preemptible hierarchical RCU implementation.
> [    1.005020] rcu: RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=8.
> [    1.011680] Tasks-RCU CPU stall warnings timeout set to 120000 (rcu_task_stall_timeout).
> [    1.019813] Trampoline variant of Tasks RCU enabled.
> [    1.024831] Rude variant of Tasks RCU enabled.
> [    1.029330] Tracing variant of Tasks RCU enabled.
> [    1.034089] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
> [    1.041699] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
> [    1.052607] IRQS DISABLED!! after trace_buffered_event_disable:2756
> [    1.058712] NR_IRQS: 8448, nr_irqs: 488, preallocated irqs: 16
> [    1.064670] NO_HZ: Full dynticks CPUs: 3.
> [    1.068501] rcu: Offload RCU callbacks from CPUs: 3.
> [    1.073519] rcu: srcu_init: Setting srcu_struct sizes based on contention.
> [    1.080400] ------------[ cut here ]------------
> [    1.084938] Interrupts were enabled early
> [    1.088919] WARNING: CPU: 0 PID: 0 at init/main.c:1061 start_kernel+0x36f/0x4e8
> [    1.096184] Modules linked in:
> [    1.099213] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rc1-test+ #468
> [    1.106047] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
> [    1.114960] RIP: 0010:start_kernel+0x36f/0x4e8
> [    1.119373] Code: 48 27 53 e8 fd 7b 03 00 e8 55 72 e5 fe e8 6b 05 03 00 ff 15 2a f9 89 ff 0f ba e0 09 73 0e 48 c7 c7 98 25 31 ac e8 01 5b df fe <0f> 0b c6 05 c7 2c af ff 00 e8 ba 36 47 fe ff 15 14 f9 89 ff e8 56
> [    1.138057] RSP: 0000:ffffffffac603f30 EFLAGS: 00010286
> [    1.143248] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [    1.150343] RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: 0000000000000001
> [    1.157438] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffffac603dd8
> [    1.164537] R10: 0000000000000003 R11: ffffffffac734168 R12: 000000000000180a
> [    1.171633] R13: 00000000010fecf0 R14: 0000000000000000 R15: 00000000d44e3018
> [    1.178730] FS:  0000000000000000(0000) GS:ffff93ee1aa00000(0000) knlGS:0000000000000000
> [    1.186778] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.192489] CR2: ffff93ee1edff000 CR3: 000000002e60a001 CR4: 00000000000606f0
> [    1.199584] Call Trace:
> [    1.202009]  <TASK>
> 
> So I'm not sure how just updating srcu will do anything to prevent this.
> 
> -- Steve
> 
>> 
>> 							Thanx, Paul
>> 
>> ------------------------------------------------------------------------
>> 
>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>> index ca4b5dcec675b..356ef70d5442c 100644
>> --- a/kernel/rcu/srcutree.c
>> +++ b/kernel/rcu/srcutree.c
>> @@ -1267,14 +1267,15 @@ static void __synchronize_srcu(struct srcu_struct *ssp, bool do_norm)
>>  {
>>  	struct rcu_synchronize rcu;
>>  
>> +	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
>> +		return;
>> +
>>  	RCU_LOCKDEP_WARN(lockdep_is_held(ssp) ||
>>  			 lock_is_held(&rcu_bh_lock_map) ||
>>  			 lock_is_held(&rcu_lock_map) ||
>>  			 lock_is_held(&rcu_sched_lock_map),
>>  			 "Illegal synchronize_srcu() in same-type SRCU (or in RCU) read-side critical section");
>>  
>> -	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
>> -		return;
>>  	might_sleep();
>>  	check_init_srcu_struct(ssp);
>>  	init_completion(&rcu.completion);
> 
On Thu, 20 Oct 2022 12:33:57 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index c03fd7037add..79ac31a6a87b 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -2748,9 +2748,12 @@ void trace_buffered_event_disable(void)
> 			 disable_trace_buffered_event, NULL, 1);
> 	preempt_enable();
> 
> +	if (!irqs_disabled())
> +		printk("IRQS DISABLED!! before %s:%d\n", __func__, __LINE__);
> 	/* Wait for all current users to finish */
> -	if (!early_boot_irqs_disabled)
> -		synchronize_rcu();
> +	synchronize_rcu();
> +	if (!irqs_disabled())
> +		printk("IRQS DISABLED!! after %s:%d\n", __func__, __LINE__);
> 
> 	for_each_tracing_cpu(cpu) {
> 		free_page((unsigned long)per_cpu(trace_buffered_event, cpu));

Continuing the above printk()s I found the culprit.

synchronize_rcu() {
	rcu_poll_gp_seq_start_unlocked() {
		struct rcu_node *rnp = rcu_get_root();

		if (rcu_init_invoked()) {
			lockdep_assert_irqs_enabled();
			raw_spin_lock_irq_rcu_node(rnp);
		}
		rcu_poll_gp_seq_start(snap);
		if (rcu_init_invoked())
			raw_spin_unlock_irq_rcu_node(rnp);  <<-- Interrupts enabled here
}

-- Steve
On Thu, Oct 20, 2022 at 12:33:57PM -0400, Steven Rostedt wrote:
> On Thu, 20 Oct 2022 08:59:00 -0700
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > On Wed, Oct 19, 2022 at 08:07:45PM -0400, Steven Rostedt wrote:
> > > On Wed, 19 Oct 2022 20:01:37 -0400
> > > Steven Rostedt <rostedt@goodmis.org> wrote:
> > > 
> > > > @@ -90,6 +90,10 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb)
> > > >  #ifdef CONFIG_TRACEPOINTS
> > > >  static inline void tracepoint_synchronize_unregister(void)
> > > >  {
> > > > +	/* Early updates do not need synchronization */
> > > > +	if (early_boot_irqs_disabled)
> > > > +		return;
> > > > +
> > > >  	synchronize_srcu(&tracepoint_srcu);
> > > >  	synchronize_rcu();
> > > 
> > > I wonder if this check should be just moved to the RCU synchronization
> > > code? That is, if early_boot_irqs_disabled is set, do nothing, as there
> > > should be nothing to synchronize against.
> > 
> > There already is a similar check, but it follows the lockdep check.
> > 
> > Does the following patch help?
> 
> Not sure if it would.
> 
> I added this:
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index c03fd7037add..79ac31a6a87b 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -2748,9 +2748,12 @@ void trace_buffered_event_disable(void)
> 			 disable_trace_buffered_event, NULL, 1);
> 	preempt_enable();
> 
> +	if (!irqs_disabled())
> +		printk("IRQS DISABLED!! before %s:%d\n", __func__, __LINE__);
> 	/* Wait for all current users to finish */
> -	if (!early_boot_irqs_disabled)
> -		synchronize_rcu();
> +	synchronize_rcu();
> +	if (!irqs_disabled())
> +		printk("IRQS DISABLED!! after %s:%d\n", __func__, __LINE__);
> 
> 	for_each_tracing_cpu(cpu) {
> 		free_page((unsigned long)per_cpu(trace_buffered_event, cpu));
> 
> Which produced this:
> 
> [    0.972867] ftrace: allocating 47021 entries in 184 pages
> [    0.978285] ftrace section at ffffffffacef74c0 sorted properly
> [    0.991153] ftrace: allocated 184 pages with 4 groups
> [    0.996163] Dynamic Preempt: full
> [    0.999354] rcu: Preemptible hierarchical RCU implementation.
> [    1.005020] rcu: RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=8.
> [    1.011680] Tasks-RCU CPU stall warnings timeout set to 120000 (rcu_task_stall_timeout).
> [    1.019813] Trampoline variant of Tasks RCU enabled.
> [    1.024831] Rude variant of Tasks RCU enabled.
> [    1.029330] Tracing variant of Tasks RCU enabled.
> [    1.034089] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
> [    1.041699] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
> [    1.052607] IRQS DISABLED!! after trace_buffered_event_disable:2756
> [    1.058712] NR_IRQS: 8448, nr_irqs: 488, preallocated irqs: 16
> [    1.064670] NO_HZ: Full dynticks CPUs: 3.
> [    1.068501] rcu: Offload RCU callbacks from CPUs: 3.
> [    1.073519] rcu: srcu_init: Setting srcu_struct sizes based on contention.
> [    1.080400] ------------[ cut here ]------------
> [    1.084938] Interrupts were enabled early
> [    1.088919] WARNING: CPU: 0 PID: 0 at init/main.c:1061 start_kernel+0x36f/0x4e8
> [    1.096184] Modules linked in:
> [    1.099213] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rc1-test+ #468
> [    1.106047] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
> [    1.114960] RIP: 0010:start_kernel+0x36f/0x4e8
> [    1.119373] Code: 48 27 53 e8 fd 7b 03 00 e8 55 72 e5 fe e8 6b 05 03 00 ff 15 2a f9 89 ff 0f ba e0 09 73 0e 48 c7 c7 98 25 31 ac e8 01 5b df fe <0f> 0b c6 05 c7 2c af ff 00 e8 ba 36 47 fe ff 15 14 f9 89 ff e8 56
> [    1.138057] RSP: 0000:ffffffffac603f30 EFLAGS: 00010286
> [    1.143248] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [    1.150343] RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: 0000000000000001
> [    1.157438] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffffac603dd8
> [    1.164537] R10: 0000000000000003 R11: ffffffffac734168 R12: 000000000000180a
> [    1.171633] R13: 00000000010fecf0 R14: 0000000000000000 R15: 00000000d44e3018
> [    1.178730] FS:  0000000000000000(0000) GS:ffff93ee1aa00000(0000) knlGS:0000000000000000
> [    1.186778] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.192489] CR2: ffff93ee1edff000 CR3: 000000002e60a001 CR4: 00000000000606f0
> [    1.199584] Call Trace:
> [    1.202009]  <TASK>
> 
> So I'm not sure how just updating srcu will do anything to prevent this.

Apologies, SRCU on the brain due to a different email thread...  :-/

							Thanx, Paul

> -- Steve
> 
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index ca4b5dcec675b..356ef70d5442c 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1267,14 +1267,15 @@ static void __synchronize_srcu(struct srcu_struct *ssp, bool do_norm)
> >  {
> >  	struct rcu_synchronize rcu;
> >  
> > +	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
> > +		return;
> > +
> >  	RCU_LOCKDEP_WARN(lockdep_is_held(ssp) ||
> >  			 lock_is_held(&rcu_bh_lock_map) ||
> >  			 lock_is_held(&rcu_lock_map) ||
> >  			 lock_is_held(&rcu_sched_lock_map),
> >  			 "Illegal synchronize_srcu() in same-type SRCU (or in RCU) read-side critical section");
> >  
> > -	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
> > -		return;
> >  	might_sleep();
> >  	check_init_srcu_struct(ssp);
> >  	init_completion(&rcu.completion);
> 
On Thu, Oct 20, 2022 at 01:05:28PM -0400, Steven Rostedt wrote:
> On Thu, 20 Oct 2022 12:33:57 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index c03fd7037add..79ac31a6a87b 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> > @@ -2748,9 +2748,12 @@ void trace_buffered_event_disable(void)
> > 			 disable_trace_buffered_event, NULL, 1);
> > 	preempt_enable();
> > 
> > +	if (!irqs_disabled())
> > +		printk("IRQS DISABLED!! before %s:%d\n", __func__, __LINE__);
> > 	/* Wait for all current users to finish */
> > -	if (!early_boot_irqs_disabled)
> > -		synchronize_rcu();
> > +	synchronize_rcu();
> > +	if (!irqs_disabled())
> > +		printk("IRQS DISABLED!! after %s:%d\n", __func__, __LINE__);
> > 
> > 	for_each_tracing_cpu(cpu) {
> > 		free_page((unsigned long)per_cpu(trace_buffered_event, cpu));
> 
> Continuing the above printk()s I found the culprit.
> 
> synchronize_rcu() {
> 	rcu_poll_gp_seq_start_unlocked() {
> 		struct rcu_node *rnp = rcu_get_root();
> 
> 		if (rcu_init_invoked()) {
> 			lockdep_assert_irqs_enabled();
> 			raw_spin_lock_irq_rcu_node(rnp);
> 		}
> 		rcu_poll_gp_seq_start(snap);
> 		if (rcu_init_invoked())
> 			raw_spin_unlock_irq_rcu_node(rnp);  <<-- Interrupts enabled here
> }

Thank you for digging into this!

Does the following help?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index abc615808b6e8..45ad9ef00d647 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1425,30 +1425,32 @@ static void rcu_poll_gp_seq_end(unsigned long *snap)
 // where caller does not hold the root rcu_node structure's lock.
 static void rcu_poll_gp_seq_start_unlocked(unsigned long *snap)
 {
+	unsigned long flags;
 	struct rcu_node *rnp = rcu_get_root();
 
 	if (rcu_init_invoked()) {
 		lockdep_assert_irqs_enabled();
-		raw_spin_lock_irq_rcu_node(rnp);
+		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	}
 	rcu_poll_gp_seq_start(snap);
 	if (rcu_init_invoked())
-		raw_spin_unlock_irq_rcu_node(rnp);
+		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 }
 
 // Make the polled API aware of the end of a grace period, but where
 // caller does not hold the root rcu_node structure's lock.
 static void rcu_poll_gp_seq_end_unlocked(unsigned long *snap)
 {
+	unsigned long flags;
 	struct rcu_node *rnp = rcu_get_root();
 
 	if (rcu_init_invoked()) {
 		lockdep_assert_irqs_enabled();
-		raw_spin_lock_irq_rcu_node(rnp);
+		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	}
 	rcu_poll_gp_seq_end(snap);
 	if (rcu_init_invoked())
-		raw_spin_unlock_irq_rcu_node(rnp);
+		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 }
 
 /*
On Thu, 20 Oct 2022 10:55:12 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> > synchronize_rcu() {
> > 	rcu_poll_gp_seq_start_unlocked() {
> > 		struct rcu_node *rnp = rcu_get_root();
> > 
> > 		if (rcu_init_invoked()) {
> > 			lockdep_assert_irqs_enabled();
> > 			raw_spin_lock_irq_rcu_node(rnp);
> > 		}
> > 		rcu_poll_gp_seq_start(snap);
> > 		if (rcu_init_invoked())
> > 			raw_spin_unlock_irq_rcu_node(rnp);  <<-- Interrupts enabled here
> > }
> 
> Thank you for digging into this!
> 
> Does the following help?

It does indeed!

I can remove the early_boot_irqs_disabled checks if you can add this to
this rc cycle and mark it stable.

-- Steve
On Thu, Oct 20, 2022 at 02:53:40PM -0400, Steven Rostedt wrote:
> On Thu, 20 Oct 2022 10:55:12 -0700
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > > synchronize_rcu() {
> > > 	rcu_poll_gp_seq_start_unlocked() {
> > > 		struct rcu_node *rnp = rcu_get_root();
> > > 
> > > 		if (rcu_init_invoked()) {
> > > 			lockdep_assert_irqs_enabled();
> > > 			raw_spin_lock_irq_rcu_node(rnp);
> > > 		}
> > > 		rcu_poll_gp_seq_start(snap);
> > > 		if (rcu_init_invoked())
> > > 			raw_spin_unlock_irq_rcu_node(rnp);  <<-- Interrupts enabled here
> > > }
> > 
> > Thank you for digging into this!
> > 
> > Does the following help?
> 
> It does indeed!
> 
> I can remove the early_boot_irqs_disabled checks if you can add this to
> this rc cycle and mark it stable.

My plan is to push it in after a bit of testing.  The Fixes tag should
make the right -stable stuff happen.

							Thanx, Paul

------------------------------------------------------------------------

commit 952643575669332565c77066fb42d67987680f12
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Thu Oct 20 10:58:14 2022 -0700

    rcu: Keep synchronize_rcu() from enabling irqs in early boot
    
    Making polled RCU grace periods account for expedited grace periods
    required acquiring the leaf rcu_node structure's lock during early boot,
    but after rcu_init() was called.  This lock is irq-disabled, but the
    code incorrectly assumes that irqs are always disabled when invoking
    synchronize_rcu().  The exception is early boot before the scheduler
    has started, which means that upon return from synchronize_rcu(), irqs
    will be incorrectly enabled.
    
    This commit fixes this bug by using irqsave/irqrestore locking primitives.
    
    Fixes: bf95b2bc3e42 ("rcu: Switch polled grace-period APIs to ->gp_seq_polled")
    Reported-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index abc615808b6e8..45ad9ef00d647 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1425,30 +1425,32 @@ static void rcu_poll_gp_seq_end(unsigned long *snap)
 // where caller does not hold the root rcu_node structure's lock.
 static void rcu_poll_gp_seq_start_unlocked(unsigned long *snap)
 {
+	unsigned long flags;
 	struct rcu_node *rnp = rcu_get_root();
 
 	if (rcu_init_invoked()) {
 		lockdep_assert_irqs_enabled();
-		raw_spin_lock_irq_rcu_node(rnp);
+		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	}
 	rcu_poll_gp_seq_start(snap);
 	if (rcu_init_invoked())
-		raw_spin_unlock_irq_rcu_node(rnp);
+		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 }
 
 // Make the polled API aware of the end of a grace period, but where
 // caller does not hold the root rcu_node structure's lock.
 static void rcu_poll_gp_seq_end_unlocked(unsigned long *snap)
 {
+	unsigned long flags;
 	struct rcu_node *rnp = rcu_get_root();
 
 	if (rcu_init_invoked()) {
 		lockdep_assert_irqs_enabled();
-		raw_spin_lock_irq_rcu_node(rnp);
+		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	}
 	rcu_poll_gp_seq_end(snap);
 	if (rcu_init_invoked())
-		raw_spin_unlock_irq_rcu_node(rnp);
+		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 }
 
 /*
On Thu, 20 Oct 2022 15:21:08 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> My plan is to push it in after a bit of testing.  The Fixes tag should
> make the right -stable stuff happen.

Great, thanks! I'll resend my patch without the synchronize_rcu() avoidance.

-- Steve
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a465d5242774..ccf91a4bf113 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6257,6 +6257,25 @@
 			See also Documentation/trace/ftrace.rst "trace options"
 			section.
 
+	trace_trigger=[trigger-list]
+			[FTRACE] Add an event trigger on specific events.
+			Set a trigger on top of a specific event, with an
+			optional filter.
+
+			The format is "trace_trigger=<event>.<trigger>[ if <filter>],..."
+			where more than one trigger may be specified,
+			comma delimited.
+
+			For example:
+
+			  trace_trigger="sched_switch.stacktrace if prev_state == 2"
+
+			The above will enable the "stacktrace" trigger on the
+			"sched_switch" event but only trigger it if the
+			"prev_state" of the "sched_switch" event is "2"
+			(TASK_UNINTERRUPTIBLE).
+
+			See also "Event triggers" in Documentation/trace/events.rst
+
 	traceoff_on_warning
 			[FTRACE] enable this option to disable tracing when a
 			warning is hit. This turns off "tracing_on". Tracing can
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 4b33b95eb8be..a5c6b5772897 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -90,6 +90,10 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb)
 #ifdef CONFIG_TRACEPOINTS
 static inline void tracepoint_synchronize_unregister(void)
 {
+	/* Early updates do not need synchronization */
+	if (early_boot_irqs_disabled)
+		return;
+
 	synchronize_srcu(&tracepoint_srcu);
 	synchronize_rcu();
 }
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 47a44b055a1d..c03fd7037add 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2749,7 +2749,8 @@ void trace_buffered_event_disable(void)
 	preempt_enable();
 
 	/* Wait for all current users to finish */
-	synchronize_rcu();
+	if (!early_boot_irqs_disabled)
+		synchronize_rcu();
 
 	for_each_tracing_cpu(cpu) {
 		free_page((unsigned long)per_cpu(trace_buffered_event, cpu));
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 0356cae0cf74..06554939252c 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2796,6 +2796,44 @@ trace_create_new_event(struct trace_event_call *call,
 	return file;
 }
 
+#ifdef CONFIG_HIST_TRIGGERS
+#define MAX_BOOT_TRIGGERS 32
+
+static struct boot_triggers {
+	const char	*event;
+	char		*trigger;
+} bootup_triggers[MAX_BOOT_TRIGGERS];
+
+static char bootup_trigger_buf[COMMAND_LINE_SIZE];
+static int nr_boot_triggers;
+
+static __init int setup_trace_triggers(char *str)
+{
+	char *trigger;
+	char *buf;
+	int i;
+
+	strlcpy(bootup_trigger_buf, str, COMMAND_LINE_SIZE);
+	ring_buffer_expanded = true;
+	disable_tracing_selftest("running event triggers");
+
+	buf = bootup_trigger_buf;
+	for (i = 0; i < MAX_BOOT_TRIGGERS; i++) {
+		trigger = strsep(&buf, ",");
+		if (!trigger)
+			break;
+		bootup_triggers[i].event = strsep(&trigger, ".");
+		bootup_triggers[i].trigger = strsep(&trigger, ".");
+		if (!bootup_triggers[i].trigger)
+			break;
+	}
+
+	nr_boot_triggers = i;
+	return 1;
+}
+__setup("trace_trigger=", setup_trace_triggers);
+#endif
+
 /* Add an event to a trace directory */
 static int
 __trace_add_new_event(struct trace_event_call *call, struct trace_array *tr)
@@ -2822,12 +2860,32 @@ __trace_early_add_new_event(struct trace_event_call *call,
 			    struct trace_array *tr)
 {
 	struct trace_event_file *file;
+	int ret;
+	int i;
 
 	file = trace_create_new_event(call, tr);
 	if (!file)
 		return -ENOMEM;
 
-	return event_define_fields(call);
+	ret = event_define_fields(call);
+	if (ret)
+		return ret;
+
+#ifdef CONFIG_HIST_TRIGGERS
+	for (i = 0; i < nr_boot_triggers; i++) {
+		if (strcmp(trace_event_name(call), bootup_triggers[i].event))
+			continue;
+		mutex_lock(&event_mutex);
+		ret = trigger_process_regex(file, bootup_triggers[i].trigger);
+		mutex_unlock(&event_mutex);
+		if (ret)
+			pr_err("Failed to register trigger '%s' on event %s\n",
+			       bootup_triggers[i].trigger,
+			       bootup_triggers[i].event);
+	}
+#endif
+
+	return 0;
 }
 
 struct ftrace_module_file_ops;
@@ -3726,6 +3784,8 @@ static __init int event_trace_enable(void)
 			list_add(&call->list, &ftrace_events);
 	}
 
+	register_trigger_cmds();
+
 	/*
 	 * We need the top trace array to have a working set of trace
 	 * points at early init, before the debug files and directories
@@ -3740,7 +3800,6 @@ static __init int event_trace_enable(void)
 
 	register_event_cmds();
 
-	register_trigger_cmds();
 	return 0;
 }
 
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index f23144af5743..f6e4ee1e40b3 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -48,6 +48,9 @@ static void tp_rcu_get_state(enum tp_transition_sync sync)
 {
 	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
 
+	if (early_boot_irqs_disabled)
+		return;
+
 	/* Keep the latest get_state snapshot. */
 	snapshot->rcu = get_state_synchronize_rcu();
 	snapshot->srcu = start_poll_synchronize_srcu(&tracepoint_srcu);
@@ -58,6 +61,9 @@ static void tp_rcu_cond_sync(enum tp_transition_sync sync)
 {
 	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
 
+	if (early_boot_irqs_disabled)
+		return;
+
 	if (!snapshot->ongoing)
 		return;
 	cond_synchronize_rcu(snapshot->rcu);