Message ID | 20231214214201.499426-12-john.ogness@linutronix.de |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-151-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:3b04:b0:fb:cd0c:d3e with SMTP id c4csp8871538dys; Thu, 14 Dec 2023 13:45:32 -0800 (PST) X-Google-Smtp-Source: AGHT+IGm8WhBSp64G4a21pGu+PXeEuBXkW4lsiyOj7uGVCIpiiPOcpgwwZtzxHbhvhUh6jgwgC9Q X-Received: by 2002:a05:620a:495b:b0:77d:a463:e9cc with SMTP id vz27-20020a05620a495b00b0077da463e9ccmr11086842qkn.19.1702590332395; Thu, 14 Dec 2023 13:45:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702590332; cv=none; d=google.com; s=arc-20160816; b=MvQ3/x6GqhEYSwRPmxGC2ugx2M3M8PyZdiO8DZC3hEFblZu7k/s3pB25ysI70KnuxN omTcEBIjyCb5eNs20MwsGTHtfU3wPC1ucANGAWPklYmp+BIjnaWMAj6x09OlOvY8G59Z b21jjMiA08EtObLDQrC5uCC+JNWWG6xDUmc3vCStj0Ir2rrwWI0ic+1Fb6vupzy+degw Xp+y5bi4xID41HWzfU42Sn8hU7DWjUs/TAaot0wus//4n24yeh1l6ntVDckMhBz6EGfI ssTJQtjmiOEpMlY89KjaoeyBARacAiDlkd8quubvXmjy++Z9hdXjwj6hPBcaSR5ZppWc tabQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:dkim-signature:dkim-signature:from; bh=w+Rj9ow8fovhfkl47C5quy90KNsRy8XmYEJxuaPSPxc=; fh=pKnPKUsir0uEGSsor+4Zc2vgbu+g+ayvUgsdzkuXaoA=; b=dAJKP4274iuhd0bw0Axe/VL7pp3B5b/ADY1X3nh80HHfXolnS7CyGCyTgA/+9M14dY ng4vYY7fHDsmNKxDYtwq+/P5A/mLHFwgCdPHmT9yw1N1uAmtkkkn7OwZhsmQs+cXoGTQ FMGVUv+THW/nGRnx5IkTcyrQTu9OKl95zGwO9wY9fGlLUNixpXUu6HEjjVK8PdFYSb0v DsCfkFxUBtCOslKyVYHvAT6pLiYCGxJOVsX2Fl8DWPJYLbISNLuq+DKjH2vNYevOqkxO lno0anwQZTUswUyn0d3hcZYOlWGvpHCrfVMZ94xXVegs7m33bMF8bo+pX6VgJC/QVx68 lMwA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=rYKzl1gL; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=E4nJ9zge; spf=pass (google.com: domain of linux-kernel+bounces-151-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-151-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id v2-20020a05620a440200b0077f357665d5si17063261qkp.400.2023.12.14.13.45.32 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Dec 2023 13:45:32 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-151-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=rYKzl1gL; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=E4nJ9zge; spf=pass (google.com: domain of linux-kernel+bounces-151-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-151-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 34A661C21BD3 for <ouuuleilei@gmail.com>; Thu, 14 Dec 2023 21:45:32 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id A4D642C684; Thu, 14 Dec 2023 21:42:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="rYKzl1gL"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="E4nJ9zge" X-Original-To: linux-kernel@vger.kernel.org Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4572B6E2DB for <linux-kernel@vger.kernel.org>; Thu, 14 Dec 2023 21:42:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de From: John Ogness <john.ogness@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1702590128; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w+Rj9ow8fovhfkl47C5quy90KNsRy8XmYEJxuaPSPxc=; b=rYKzl1gLfyiH9IJP+TFLliwyltsoWallmmQO2rIWW8Z/fGewTSWXqFjmYrmYTF+CeN4zKV wPontEoTCZuXFuB3Ybr6bj8SPuuXyeVscircT/Ol/ro4/4dtG60znLbdixzbjBmpbJlFq3 J3TKHO8N6BO1RKK25l+KflE1Xq/GC2yT5PjTw3ZWCPgmMuFfXHba9D9gMifVHVXsifgcar /6IPOTBBy9xorvPK9ZfRGwCxypgXJ6rrybFNb9io6crG3P4hKA5Ph1junGYvzLOycp+ckA d1JYd/lzSbUDSdRQ2G9pcAEg7G9bpuDMUoSFxsEw+Kaq+kojiFKqWrbCnXIIhg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1702590128; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w+Rj9ow8fovhfkl47C5quy90KNsRy8XmYEJxuaPSPxc=; b=E4nJ9zgeLAGTpX7uf+GwQSKNUOUmHjHZFCXuM4dmVdN4NYPZT8iNldWycwr5Ytv/BL7qeg JD1ZUoqpcg9ry0DQ== To: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>, Steven Rostedt <rostedt@goodmis.org>, Thomas Gleixner <tglx@linutronix.de>, linux-kernel@vger.kernel.org Subject: [PATCH printk v3 11/14] printk: ringbuffer: Consider committed as finalized in panic Date: Thu, 14 Dec 2023 22:47:58 +0106 Message-Id: <20231214214201.499426-12-john.ogness@linutronix.de> In-Reply-To: <20231214214201.499426-1-john.ogness@linutronix.de> References: <20231214214201.499426-1-john.ogness@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1785295360232112656 X-GMAIL-MSGID: 1785295360232112656 |
Series |
fix console flushing
|
|
Commit Message
John Ogness
Dec. 14, 2023, 9:41 p.m. UTC
A descriptor in the committed state means the record does not yet
exist for the reader. However, for the panic CPU, committed
records should be handled as finalized records since they contain
message data in a consistent state and may contain additional
hints as to the cause of the panic.
Add an exception for records in the commit state to not be
considered non-existing when reading from the panic CPU.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
kernel/printk/printk_ringbuffer.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
Comments
On Thu 2023-12-14 22:47:58, John Ogness wrote: > A descriptor in the committed state means the record does not yet > exist for the reader. However, for the panic CPU, committed > records should be handled as finalized records since they contain > message data in a consistent state and may contain additional > hints as to the cause of the panic. > > Add an exception for records in the commit state to not be > considered non-existing when reading from the panic CPU. IMHO, it is important to describe effects of this change in more details. And I think that it actually does not work as expected, see below. > --- a/kernel/printk/printk_ringbuffer.c > +++ b/kernel/printk/printk_ringbuffer.c > @@ -1875,16 +1877,25 @@ static int desc_read_finalized_seq(struct prb_desc_ring *desc_ring, > > /* > * An unexpected @id (desc_miss) or @seq mismatch means the record > - * does not exist. A descriptor in the reserved or committed state > - * means the record does not yet exist for the reader. > + * does not exist. A descriptor in the reserved state means the > + * record does not yet exist for the reader. > */ > if (d_state == desc_miss || > d_state == desc_reserved || > - d_state == desc_committed || > s != seq) { > return -EINVAL; > } > > + /* > + * A descriptor in the committed state means the record does not yet > + * exist for the reader. However, for the panic CPU, committed > + * records are also handled as finalized records since they contain > + * message data in a consistent state and may contain additional > + * hints as to the cause of the panic. > + */ > + if (d_state == desc_committed && !this_cpu_in_panic()) > + return -EINVAL; If I get it correctly, this causes that panic CPU would see a non-finalized continuous line as finalized. And it would flush the existing piece to consoles. The problem is that pr_cont() would append the message into the same record. But the consoles would already wait for the next record. They would miss the appended pieces. It might be fixed by finalizing the record before we read the non-finalized piece. It is doable but it would add another lock-less scenario. I am not sure if it would work with the existing barriers or how complicated it would be to prove the correctness. Now, in practice, it would allow to flush pieces of continuous lines printed on panic CPU immediately. It would not affect messages printed by other CPUs because of a mix of reasons: 1. The current code tries hard to move the console owner to panic CPU. It allows the panic CPU to flush the messages a safe way even when other CPUs are stopped or somehow locked. It means that the consoles are ideally flushed when the panic CPU adds a message. 2. Only the last record might be in a committed state. Any older record is automatically finalized when it reaches the committed state. 3. The previous patch causes that messages from non-panic CPUs are skipped when they are not committed or finalized, see https://lore.kernel.org/all/20231214214201.499426-11-john.ogness@linutronix.de Now, this patch would really help only when the panic CPU dies in the middle of a continuous message or when the final message does not have a newline character. Honestly, I think that it is not worth the effort. It would add another complexity to the memory barriers. The real effect is not easy to understand. And the benefit is minimal from my POV. Alternative solutions: 1. Flush the non-finalized continuous line only before reaching the infinite loop in panic(). printk() gets disabled at this point anyway. It would solve only this one scenario, though. 2. Always finalize messages in vprintk_store() when in_panic(). It would have the desired effect in all panic() situations. And it would not add another complexity to the lock-less ringbuffer code. Best Regards, Petr
On 2024-02-01, Petr Mladek <pmladek@suse.com> wrote: > On Thu 2023-12-14 22:47:58, John Ogness wrote: >> A descriptor in the committed state means the record does not yet >> exist for the reader. However, for the panic CPU, committed >> records should be handled as finalized records since they contain >> message data in a consistent state and may contain additional >> hints as to the cause of the panic. >> >> Add an exception for records in the commit state to not be >> considered non-existing when reading from the panic CPU. > > IMHO, it is important to describe effects of this change in more > details. And I think that it actually does not work as expected, > see below. I reviewed my notes from our meeting in Richmond. We had agreed that this feature should not apply to the latest message. That would change the commit message to be as follows: printk: ringbuffer: Consider committed as finalized in panic A descriptor in the committed state means the record does not yet exist for the reader. However, for the panic CPU, committed records should be handled as finalized records since they contain message data in a consistent state and may contain additional hints as to the cause of the panic. The only exception is the last record. The panic CPU may be usig LOG_CONT and the individual pieces should not be printed separately. Add a special-case check for records in the commit state to not be considered non-existing when reading from the panic CPU and it is not the last record. >> --- a/kernel/printk/printk_ringbuffer.c >> +++ b/kernel/printk/printk_ringbuffer.c >> @@ -1875,16 +1877,25 @@ static int desc_read_finalized_seq(struct prb_desc_ring *desc_ring, >> >> /* >> * An unexpected @id (desc_miss) or @seq mismatch means the record >> - * does not exist. A descriptor in the reserved or committed state >> - * means the record does not yet exist for the reader. >> + * does not exist. A descriptor in the reserved state means the >> + * record does not yet exist for the reader. >> */ >> if (d_state == desc_miss || >> d_state == desc_reserved || >> - d_state == desc_committed || >> s != seq) { >> return -EINVAL; >> } >> >> + /* >> + * A descriptor in the committed state means the record does not yet >> + * exist for the reader. However, for the panic CPU, committed >> + * records are also handled as finalized records since they contain >> + * message data in a consistent state and may contain additional >> + * hints as to the cause of the panic. >> + */ >> + if (d_state == desc_committed && !this_cpu_in_panic()) >> + return -EINVAL; And this code would change to: + /* + * A descriptor in the committed state means the record does not yet + * exist for the reader. However, for the panic CPU, committed + * records are also handled as finalized records since they contain + * message data in a consistent state and may contain additional + * hints as to the cause of the panic. The only exception is the + * last record, which may still be appended by the panic CPU and so + * is not available to the panic CPU for reading. + */ + if (d_state == desc_committed && + (!this_cpu_in_panic() || id == atomic_long_read(&desc_ring->head_id))) { + return -EINVAL; + } > If I get it correctly, this causes that panic CPU would see a > non-finalized continuous line as finalized. And it would flush > the existing piece to consoles. > > The problem is that pr_cont() would append the message into > the same record. But the consoles would already wait > for the next record. They would miss the appended pieces. Exactly. That is why we said that the last message would not be available. Maybe this new version is acceptable. > Honestly, I think that it is not worth the effort. It would add > another complexity to the memory barriers. The real effect is not easy > to understand. And the benefit is minimal from my POV. I am OK with dropping this patch from the series. It is questionable how valuable a LOG_CONT piece from a non-panic CPU is anyway. And if the non-panic CPU managed to reopen the record, it would be skipped anyway. I will drop this patch unless you want to keep the new version. John
On Mon 2024-02-05 15:14:14, John Ogness wrote: > On 2024-02-01, Petr Mladek <pmladek@suse.com> wrote: > > On Thu 2023-12-14 22:47:58, John Ogness wrote: > >> A descriptor in the committed state means the record does not yet > >> exist for the reader. However, for the panic CPU, committed > >> records should be handled as finalized records since they contain > >> message data in a consistent state and may contain additional > >> hints as to the cause of the panic. > >> > >> Add an exception for records in the commit state to not be > >> considered non-existing when reading from the panic CPU. > > > > IMHO, it is important to describe effects of this change in more > > details. And I think that it actually does not work as expected, > > see below. > > I reviewed my notes from our meeting in Richmond. We had agreed that > this feature should not apply to the latest message. That would change > the commit message to be as follows: > > printk: ringbuffer: Consider committed as finalized in panic > > A descriptor in the committed state means the record does not yet > exist for the reader. However, for the panic CPU, committed > records should be handled as finalized records since they contain > message data in a consistent state and may contain additional > hints as to the cause of the panic. > > The only exception is the last record. The panic CPU may be > usig LOG_CONT and the individual pieces should not be printed > separately. This was my first understanding. But then I realized that appended pieces would not be printed at all when a committed part was printed. And it might be even worse. I have realized yesterday that an attempt to print the last record in the committed state might cause missing the trailing '\0' in the copied data. Just imagine the following race: CPU0 CPU1 prb_read() // read last record in committed state copy_data(.., info->text_len) prb_reserve_in_last() printk_sprint(&r.text_buf[0], reserve_size, ... prb_commit() // copy data_size before appending // from already appended buffer // The trailing '\0' is not in the copied part memcpy(&buf[0], data, data_size); BANG: CPU0 would try to use a string without the trailing '\0'. We should probably add a sanity check and fixup for this potential problem. > Add a special-case check for records in the commit state to not > be considered non-existing when reading from the panic CPU and > it is not the last record. > > >> --- a/kernel/printk/printk_ringbuffer.c > >> +++ b/kernel/printk/printk_ringbuffer.c > >> @@ -1875,16 +1877,25 @@ static int desc_read_finalized_seq(struct prb_desc_ring *desc_ring, > >> > >> /* > >> * An unexpected @id (desc_miss) or @seq mismatch means the record > >> - * does not exist. A descriptor in the reserved or committed state > >> - * means the record does not yet exist for the reader. > >> + * does not exist. A descriptor in the reserved state means the > >> + * record does not yet exist for the reader. > >> */ > >> if (d_state == desc_miss || > >> d_state == desc_reserved || > >> - d_state == desc_committed || > >> s != seq) { > >> return -EINVAL; > >> } > >> > >> + /* > >> + * A descriptor in the committed state means the record does not yet > >> + * exist for the reader. However, for the panic CPU, committed > >> + * records are also handled as finalized records since they contain > >> + * message data in a consistent state and may contain additional > >> + * hints as to the cause of the panic. > >> + */ > >> + if (d_state == desc_committed && !this_cpu_in_panic()) > >> + return -EINVAL; > > And this code would change to: > > + /* > + * A descriptor in the committed state means the record does not yet > + * exist for the reader. However, for the panic CPU, committed > + * records are also handled as finalized records since they contain > + * message data in a consistent state and may contain additional > + * hints as to the cause of the panic. The only exception is the > + * last record, which may still be appended by the panic CPU and so > + * is not available to the panic CPU for reading. > + */ > + if (d_state == desc_committed && > + (!this_cpu_in_panic() || id == atomic_long_read(&desc_ring->head_id))) { > + return -EINVAL; > + } This is clever. Well, it would help only with a very small race window. Older records are automatically finalized when committed. It is because they could not be reopened. See prb_commit(). > > If I get it correctly, this causes that panic CPU would see a > > non-finalized continuous line as finalized. And it would flush > > the existing piece to consoles. > > > > The problem is that pr_cont() would append the message into > > the same record. But the consoles would already wait > > for the next record. They would miss the appended pieces. > > Exactly. That is why we said that the last message would not be > available. Maybe this new version is acceptable. > > > Honestly, I think that it is not worth the effort. It would add > > another complexity to the memory barriers. The real effect is not easy > > to understand. And the benefit is minimal from my POV. > > I am OK with dropping this patch from the series. It is questionable how > valuable a LOG_CONT piece from a non-panic CPU is anyway. And if the > non-panic CPU managed to reopen the record, it would be skipped anyway. > > I will drop this patch unless you want to keep the new version. Honestly, I would drop it. It is kind of tricky code. And it would help only with a very small race window with messages from non-panic() CPUs. Especially with the 14th patch which blocks messages from other CPUs during panic() anyway. Best Regards, Petr
diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c index d6ed33683b8b..e7b808b829a0 100644 --- a/kernel/printk/printk_ringbuffer.c +++ b/kernel/printk/printk_ringbuffer.c @@ -1857,6 +1857,8 @@ static bool copy_data(struct prb_data_ring *data_ring, * descriptor. However, it also verifies that the record is finalized and has * the sequence number @seq. On success, 0 is returned. * + * For the panic CPU, committed descriptors are also considered finalized. + * * Error return values: * -EINVAL: A finalized record with sequence number @seq does not exist. * -ENOENT: A finalized record with sequence number @seq exists, but its data @@ -1875,16 +1877,25 @@ static int desc_read_finalized_seq(struct prb_desc_ring *desc_ring, /* * An unexpected @id (desc_miss) or @seq mismatch means the record - * does not exist. A descriptor in the reserved or committed state - * means the record does not yet exist for the reader. + * does not exist. A descriptor in the reserved state means the + * record does not yet exist for the reader. */ if (d_state == desc_miss || d_state == desc_reserved || - d_state == desc_committed || s != seq) { return -EINVAL; } + /* + * A descriptor in the committed state means the record does not yet + * exist for the reader. However, for the panic CPU, committed + * records are also handled as finalized records since they contain + * message data in a consistent state and may contain additional + * hints as to the cause of the panic. + */ + if (d_state == desc_committed && !this_cpu_in_panic()) + return -EINVAL; + /* * A descriptor in the reusable state may no longer have its data * available; report it as existing but with lost data. Or the record