Message ID | 20230302195618.156940-17-john.ogness@linutronix.de |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp42318wrd; Thu, 2 Mar 2023 12:00:28 -0800 (PST) X-Google-Smtp-Source: AK7set8QA24H9icxmfNnGEcyPO9/Vito9/UezAG9yRoF2NreRb8/eW5h56/ykXVYXAod31eG55oV X-Received: by 2002:a05:6a20:3ca2:b0:cc:e0fb:a835 with SMTP id b34-20020a056a203ca200b000cce0fba835mr15186637pzj.47.1677787227813; Thu, 02 Mar 2023 12:00:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677787227; cv=none; d=google.com; s=arc-20160816; b=Wk0OOWX1CiaPigEojWuaMEjWR/BGgjkR+llI+B7sInl1XiFFeD4kPGWvyjS9QjWqam EF7YKz5baUEaQZDhbpB9Cb9F+i4C8X70+cBiLrf2y1lXbWJzTHLTn9fmjeacb2RpHIZc FMOW9zX+VwO89KLcQxjn8YyyJdLkVGJ1PgMY3SbUu0osK5HKgkL19klrQ3hKrCLkPYRG MVWoCK0nBhvqbD0apomhaSdSkaRr5DU8/zYjPfo+CeK6affgMGltkcmb8H8wMf8k0Ztf cz1QVz9VcUNyvNiprzr+aVSGCmbGW2NTkR95EKq7arSg8+bl6J6U7dDTbv5FK6UbEOIB xzOA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=5Ukjcgr8xBhy+g3OZs13uetHsjpUVZC4eHKa1M+r91I=; b=akDLbRgFGql0pe1CgICFziqxx03fjTJmuBMQ1mDXu949xraNxzBZl7GDZKM/EE/xhg MTjXgDd2m86fv6xOnREU4bc+Er+ISKDARiji7d4BxSU16FcQ5vPQvoc8pJpZCF6NElvz X4n9CSj0oZPr69wlepqrIYaEVPgZdZgqs4xcopg++MQSGuWklqvQj0Z+kkDL6KIvFRhl 47e2yC2FUgZwwvuyi0H2gyypZd0d4G7Ec1gkkEpihBvGxWnxQgjn6RbxZerIqeedJM5J QiRG3TbrZpagKSsMFC6CoBBbL51hU4SBJb0YPyAh2OHVEpHZiTrdYqz05w109JuVol+5 QBfg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=HRhjUOhk; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j31-20020a63231f000000b004df5fe7a161si123019pgj.660.2023.03.02.12.00.14; Thu, 02 Mar 2023 12:00:27 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=HRhjUOhk; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230216AbjCBT6W (ORCPT <rfc822;davidbtadokoro@gmail.com> + 99 others); Thu, 2 Mar 2023 14:58:22 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230025AbjCBT5w (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 2 Mar 2023 14:57:52 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AF6AA4743A for <linux-kernel@vger.kernel.org>; Thu, 2 Mar 2023 11:57:50 -0800 (PST) From: John Ogness <john.ogness@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787067; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5Ukjcgr8xBhy+g3OZs13uetHsjpUVZC4eHKa1M+r91I=; b=HRhjUOhkaz87s/S228utMAWOjOX7FX/XBqnyxw9Bux9d/cig1aDf5yJMr0p+mk1ysQcQgt kMkxq30KNB2XnDG8LUhTyGDmMi57b1J7H6XDH+n1yRivsxyfO0CK7dLFLuaj+3jHSfGPSF 5vxtaoPWz6YIUF+LBhCSe7q4DBqUoNr5UlnfeJg2iLMf4Q9/iRRROb7dlDOmKuYw7Zdty/ fSrcUARLWJ26OulLCfpUIxkxoR9xaeglM0N/Ap1d3KCD5UUVUWOP79m1/mqGCKoEpAM4EB D/vQwxci3QQT5XlVS7j7OnE1VHpyhlzMpiJ7y+qA1oon2q/ZAwEb/6sYyL8haQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787067; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5Ukjcgr8xBhy+g3OZs13uetHsjpUVZC4eHKa1M+r91I=; b=1bCHwZvCMxdX8QJcoPVDUANjvlSfXk6CGVNBnj/3eTMfWMiBWX2aIEYW056QPWZNJ1jXzM SjDkNN5cm/XiV2DA== To: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>, Steven Rostedt <rostedt@goodmis.org>, Thomas Gleixner <tglx@linutronix.de>, linux-kernel@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>, "Guilherme G. Piccoli" <gpiccoli@igalia.com>, Luis Chamberlain <mcgrof@kernel.org>, David Gow <davidgow@google.com>, Tiezhu Yang <yangtiezhu@loongson.cn>, Daniel Vetter <daniel.vetter@ffwll.ch>, tangmeng <tangmeng@uniontech.com> Subject: [PATCH printk v1 16/18] kernel/panic: Add atomic write enforcement to warn/panic Date: Thu, 2 Mar 2023 21:02:16 +0106 Message-Id: <20230302195618.156940-17-john.ogness@linutronix.de> In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de> References: <20230302195618.156940-1-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,INVALID_DATE_TZ_ABSURD, RCVD_IN_DNSWL_MED,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759287420435331126?= X-GMAIL-MSGID: =?utf-8?q?1759287420435331126?= |
Series |
threaded/atomic console support
|
|
Commit Message
John Ogness
March 2, 2023, 7:56 p.m. UTC
From: Thomas Gleixner <tglx@linutronix.de> Invoke the atomic write enforcement functions for warn/panic to ensure that the information gets out to the consoles. For the panic case, add explicit intermediate atomic flush calls to ensure immediate flushing at important points. Otherwise the atomic flushing only occurs when dropping out of the elevated priority, which for panic may never happen. It is important to note that if there are any legacy consoles registered, they will be attempting to directly print from the printk-caller context, which may jeopardize the reliability of the atomic consoles. Optimally there should be no legacy consoles registered. Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> --- kernel/panic.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
Comments
On Thu 2023-03-02 21:02:16, John Ogness wrote: > From: Thomas Gleixner <tglx@linutronix.de> > > Invoke the atomic write enforcement functions for warn/panic to > ensure that the information gets out to the consoles. > > For the panic case, add explicit intermediate atomic flush calls to > ensure immediate flushing at important points. Otherwise the atomic > flushing only occurs when dropping out of the elevated priority, > which for panic may never happen. > > It is important to note that if there are any legacy consoles > registered, they will be attempting to directly print from the > printk-caller context, which may jeopardize the reliability of the > atomic consoles. Optimally there should be no legacy consoles > registered. > > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -329,6 +332,8 @@ void panic(const char *fmt, ...) > if (_crash_kexec_post_notifiers) > __crash_kexec(NULL); > > + cons_atomic_flush(NULL, true); Do we need to explicitly flush the messages here? cons_atomic_flush() is called also from vprintk_emit(). And there are many messages printed with the PANIC priority above. This makes an assumption that either printk() in PANIC context does not try to show the messages immediately or that this explicit console_atomic_flush() tries harder. I think that both assumptions are wrong. > + > console_unblank(); > > /* > @@ -353,6 +358,7 @@ void panic(const char *fmt, ...) > * We can't use the "normal" timers since we just panicked. > */ > pr_emerg("Rebooting in %d seconds..\n", panic_timeout); > + cons_atomic_flush(NULL, true); Same here. > > for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) { > touch_nmi_watchdog(); > @@ -371,6 +377,7 @@ void panic(const char *fmt, ...) > */ > if (panic_reboot_mode != REBOOT_UNDEFINED) > reboot_mode = panic_reboot_mode; > + cons_atomic_flush(NULL, true); And here. > emergency_restart(); > } > #ifdef __sparc__ > @@ -383,12 +390,16 @@ void panic(const char *fmt, ...) > } > #endif > #if defined(CONFIG_S390) > + cons_atomic_flush(NULL, true); And here. > disabled_wait(); > #endif > pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf); > > /* Do not scroll important messages printed above */ > suppress_printk = 1; > + > + cons_atomic_exit(CONS_PRIO_PANIC, prev_prio); On the contrary, I would explicitly call cons_atomic_flush(NULL, false) here instead of hiding it in cons_atomic_exit(). It would make it clear that this is the POINT where panic() tries harder to get the messages out on NOBKL consoles. > + > local_irq_enable(); > for (i = 0; ; i += PANIC_TIMER_STEP) { > touch_softlockup_watchdog(); > @@ -599,6 +610,10 @@ struct warn_args { > void __warn(const char *file, int line, void *caller, unsigned taint, > struct pt_regs *regs, struct warn_args *args) > { > + enum cons_prio prev_prio; > + > + prev_prio = cons_atomic_enter(CONS_PRIO_EMERGENCY); > + > disable_trace_on_warning(); > > if (file) > @@ -630,6 +645,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint, > > /* Just a warning, don't kill lockdep. */ > add_taint(taint, LOCKDEP_STILL_OK); > + > + cons_atomic_exit(CONS_PRIO_EMERGENCY, prev_prio); > } I would more this into separate patch and keep this one only for the PANIC handling. Also I think that we want to set the EMERGENCY prio also in oops_enter()? Best Regards, Petr
On 2023-04-13, Petr Mladek <pmladek@suse.com> wrote: >> --- a/kernel/panic.c >> +++ b/kernel/panic.c >> @@ -329,6 +332,8 @@ void panic(const char *fmt, ...) >> if (_crash_kexec_post_notifiers) >> __crash_kexec(NULL); >> >> + cons_atomic_flush(NULL, true); > > Do we need to explicitly flush the messages here? This is where the atomic printing actually starts (after the full dump has been inserted into the ringbuffer). > cons_atomic_flush() is called also from vprintk_emit(). And there are > many messages printed with the PANIC priority above. vprintk_emit() does not print in this case. From cons_atomic_flush(): /* * When in an elevated priority, the printk() calls are not * individually flushed. This is to allow the full output to * be dumped to the ringbuffer before starting with printing * the backlog. */ if (cpu_state->prio > NBCON_PRIO_NORMAL && printk_caller_wctxt) return; > This makes an assumption that either printk() in PANIC context > does not try to show the messages immediately or that this > explicit console_atomic_flush() tries harder. I think > that both assumptions are wrong. Both assumptions are correct, because until this point there has been no effort to print. >> @@ -353,6 +358,7 @@ void panic(const char *fmt, ...) >> * We can't use the "normal" timers since we just panicked. >> */ >> pr_emerg("Rebooting in %d seconds..\n", panic_timeout); >> + cons_atomic_flush(NULL, true); > > Same here. This flush is just to make sure the rebooting message is output. For nbcon consoles printk() calls are never synchronous except for during early boot (before kthreads are ready). The same goes for the other cons_atomic_flush() calls in this function. >> disabled_wait(); >> #endif >> pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf); >> >> /* Do not scroll important messages printed above */ >> suppress_printk = 1; >> + >> + cons_atomic_exit(CONS_PRIO_PANIC, prev_prio); > > On the contrary, I would explicitly call cons_atomic_flush(NULL, false) > here instead of hiding it in cons_atomic_exit(). It is not hiding there. That is the semantic. After entering an atomic block all printk's are only writing to the ringbuffer. On exiting the atomic block the ringbuffer is flushed via atomic printing. Exiting CONS_PRIO_PANIC has a special condition that it first tries to safely flush all consoles, then will try the unsafe variant for consoles that were not flushed. > Also I think that we want to set the EMERGENCY prio also in > oops_enter()? Agreed. John
On Thu 2023-04-13 14:19:13, John Ogness wrote: > On 2023-04-13, Petr Mladek <pmladek@suse.com> wrote: > >> --- a/kernel/panic.c > >> +++ b/kernel/panic.c > >> @@ -329,6 +332,8 @@ void panic(const char *fmt, ...) > >> if (_crash_kexec_post_notifiers) > >> __crash_kexec(NULL); > >> > >> + cons_atomic_flush(NULL, true); > > > > Do we need to explicitly flush the messages here? > > This is where the atomic printing actually starts (after the full dump > has been inserted into the ringbuffer). > > > cons_atomic_flush() is called also from vprintk_emit(). And there are > > many messages printed with the PANIC priority above. > > vprintk_emit() does not print in this case. From cons_atomic_flush(): > > /* > * When in an elevated priority, the printk() calls are not > * individually flushed. This is to allow the full output to > * be dumped to the ringbuffer before starting with printing > * the backlog. > */ > if (cpu_state->prio > NBCON_PRIO_NORMAL && printk_caller_wctxt) > return; OK, what is the motivation for this behavior, please? Does it has any advantages? > > > This makes an assumption that either printk() in PANIC context > > does not try to show the messages immediately or that this > > explicit console_atomic_flush() tries harder. I think > > that both assumptions are wrong. > > Both assumptions are correct, because until this point there has been no > effort to print. Honestly, this makes me nervous. It means that panic() messages will not reach the console unless they are explicitly flushed. First, it is error-prone because it requires calling console_atomic_flush() in all relevant code paths on the right locations. Second, it expects that panic() code could never fail between the explicit console_atomic_flush() calls. If it failed, it might be pretty useful to see the last printed message. Third, messages might get lost when there are too many. And it is realistic. For example, see panic_print_sys_info() it might add quite long reports. I would really prefer to flush atomic consoles with every printk() unless there is a really good reason not to do it. Best Regards, Petr
diff --git a/kernel/panic.c b/kernel/panic.c index da323209f583..db9834fbdf26 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -209,6 +209,7 @@ static void panic_print_sys_info(bool console_flush) */ void panic(const char *fmt, ...) { + enum cons_prio prev_prio; static char buf[1024]; va_list args; long i, i_next = 0, len; @@ -256,6 +257,8 @@ void panic(const char *fmt, ...) if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu) panic_smp_self_stop(); + prev_prio = cons_atomic_enter(CONS_PRIO_PANIC); + console_verbose(); bust_spinlocks(1); va_start(args, fmt); @@ -329,6 +332,8 @@ void panic(const char *fmt, ...) if (_crash_kexec_post_notifiers) __crash_kexec(NULL); + cons_atomic_flush(NULL, true); + console_unblank(); /* @@ -353,6 +358,7 @@ void panic(const char *fmt, ...) * We can't use the "normal" timers since we just panicked. */ pr_emerg("Rebooting in %d seconds..\n", panic_timeout); + cons_atomic_flush(NULL, true); for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) { touch_nmi_watchdog(); @@ -371,6 +377,7 @@ void panic(const char *fmt, ...) */ if (panic_reboot_mode != REBOOT_UNDEFINED) reboot_mode = panic_reboot_mode; + cons_atomic_flush(NULL, true); emergency_restart(); } #ifdef __sparc__ @@ -383,12 +390,16 @@ void panic(const char *fmt, ...) } #endif #if defined(CONFIG_S390) + cons_atomic_flush(NULL, true); disabled_wait(); #endif pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf); /* Do not scroll important messages printed above */ suppress_printk = 1; + + cons_atomic_exit(CONS_PRIO_PANIC, prev_prio); + local_irq_enable(); for (i = 0; ; i += PANIC_TIMER_STEP) { touch_softlockup_watchdog(); @@ -599,6 +610,10 @@ struct warn_args { void __warn(const char *file, int line, void *caller, unsigned taint, struct pt_regs *regs, struct warn_args *args) { + enum cons_prio prev_prio; + + prev_prio = cons_atomic_enter(CONS_PRIO_EMERGENCY); + disable_trace_on_warning(); if (file) @@ -630,6 +645,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint, /* Just a warning, don't kill lockdep. */ add_taint(taint, LOCKDEP_STILL_OK); + + cons_atomic_exit(CONS_PRIO_EMERGENCY, prev_prio); } #ifndef __WARN_FLAGS