Message ID | 20231214214201.499426-3-john.ogness@linutronix.de |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-146-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:3b04:b0:fb:cd0c:d3e with SMTP id c4csp8870654dys; Thu, 14 Dec 2023 13:43:38 -0800 (PST) X-Google-Smtp-Source: AGHT+IHWTGK3AZB1ZWLotTPRcWhjkrQR0QxN2G/5GSiM9zg0xqMqT8om9ExpzIKcpSKocip/kQpE X-Received: by 2002:a17:902:e789:b0:1d0:6ffd:e2d5 with SMTP id cp9-20020a170902e78900b001d06ffde2d5mr11893504plb.111.1702590218717; Thu, 14 Dec 2023 13:43:38 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702590218; cv=none; d=google.com; s=arc-20160816; b=PZL9RMCQfV8OgCyYWKWCfEopH2aMLre9taeG6JFEVTtn0tABijLLADgAjUwcrALPz+ Yn5cra2YF33KuHCwHi+c/Z/Z77Eqp00Hbxn8L+ZmrB7r57pjVajX1dEsOTFYCUO543YW LAO4ZJMQIwPZ0oG9oXRwDhWy4tYJk8G6cN9EpRWgz99t86+94/2gloQ+4q4WCKY/CVug Q55L5AXnnGjxe1Kjlv1z59E0W85kfa07yswHLsAVRkoaS9yQPuE1kc7oj4ADQSb+l+JD 99SqCV6Cm9KtBavQrwTU+fqRAA7Zr09cAeXc+yNSeLkOH6Mzfuxi6FTnMXc5FlU46LsN Z05g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:dkim-signature:dkim-signature:from; bh=n7/AQ14CtAPSsdBF4bK1N6nAJCYWmYHbSVQMVRCyl58=; fh=FsZvNuK0qXzIHpKngach73/EWdkC91/zDh9w34lnxX4=; b=qERmMrusINe1IYGiJ4beM2r15O6VTx5ZFhDrK3ueA5fC0TEQm1mYz9wDzKzhksYoqc kwwRRWNjAG45ZmXDIgz6w8RoZC9/DO7gy7dSz58Tm5WzTSaA5qXZxRoJSY4zQJ5HdYUk Ui0XsbI0m6JmMRPS0tEJqNcxLUWDlX1xKa26fy2ZIktBIB3pbc1UeVzsFvtHobFwZMAl FnQYfkWhtpv497jCKHtyWaqmdV9GVmIU0P5rfgmFmNkVx/zTgBJdPYqBB3FUPPyo8psd +1+kf+B15He6KViV5V5CWRrMY2QQFvdVHf7lHMKlIe14AZgS/Q1JGxnWa/f4UA2nfkNt mfCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=ELKFiU6Q; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel+bounces-146-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-146-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. [2604:1380:40f1:3f00::1]) by mx.google.com with ESMTPS id b5-20020a170902d50500b001d0b5aa3000si451387plg.456.2023.12.14.13.43.38 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Dec 2023 13:43:38 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-146-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) client-ip=2604:1380:40f1:3f00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=ELKFiU6Q; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel+bounces-146-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-146-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id 835D1B21B3B for <ouuuleilei@gmail.com>; Thu, 14 Dec 2023 21:43:22 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 4232E6F630; Thu, 14 Dec 2023 21:42:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="ELKFiU6Q"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="N18HNVVu" X-Original-To: linux-kernel@vger.kernel.org Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 37D636A34D for <linux-kernel@vger.kernel.org>; Thu, 14 Dec 2023 21:42:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de From: John Ogness <john.ogness@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1702590125; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=n7/AQ14CtAPSsdBF4bK1N6nAJCYWmYHbSVQMVRCyl58=; b=ELKFiU6QcHpmyjj4y3/ThWrC6NQsGQrxCPGsHKykjs/CeBfb1lK742gwL4JdDM8lCJjxFZ w+ymCW6goNmRY5vxbSXxHJ/onItb2M5fnM2QrmKkat5qnFhhrwKT8jrB4n+R6Ukd6zzinb 1jdOKuWs/T04n+S6Aou6k+sjVMrAS2iMYCfHDme3aNQKFJ3O1cdTPyagsQhPbkVA0wx2gR V5HQuORKoXg6WyYNjOyoMgwv1vDu+x9W84Rq46CcnV3SrIbOOOi5+btRwkOr9jrxFKuM2r 2u4tWmuFFrnavCGbKEMk/FpbZNgfhRGBR/Q8ByC4CaMGJNea1odvqekSvqiJNg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1702590125; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=n7/AQ14CtAPSsdBF4bK1N6nAJCYWmYHbSVQMVRCyl58=; b=N18HNVVuiXztnaW/iSDFaWUXfIQT0nam9vm/nj3pVKOO4M1DD9GYxXgV0hGTwzBUbPMjDr 9e6CQjVJbikgkJCg== To: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>, Steven Rostedt <rostedt@goodmis.org>, Thomas Gleixner <tglx@linutronix.de>, linux-kernel@vger.kernel.org, Francesco Dolcini <francesco@dolcini.it>, kernel test robot <oliver.sang@intel.com>, Sebastian Andrzej Siewior <bigeasy@linutronix.de> Subject: [PATCH printk v3 02/14] printk: Adjust mapping for 32bit seq macros Date: Thu, 14 Dec 2023 22:47:49 +0106 Message-Id: <20231214214201.499426-3-john.ogness@linutronix.de> In-Reply-To: <20231214214201.499426-1-john.ogness@linutronix.de> References: <20231214214201.499426-1-john.ogness@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1785295240938266427 X-GMAIL-MSGID: 1785295240938266427 |
Series |
fix console flushing
|
|
Commit Message
John Ogness
Dec. 14, 2023, 9:41 p.m. UTC
Note: This change only applies to 32bit architectures. On 64bit architectures the macros are NOPs. __ulseq_to_u64seq() computes the upper 32 bits of the passed argument value (@ulseq). The upper bits are derived from a base value (@rb_next_seq) in a way that assumes @ulseq represents a 64bit number that is less than or equal to @rb_next_seq. Until now this mapping has been correct for all call sites. However, in a follow-up commit, values of @ulseq will be passed in that are higher than the base value. This requires a change to how the 32bit value is mapped to a 64bit sequence number. Rather than mapping @ulseq such that the base value is the end of a 32bit block, map @ulseq such that the base value is in the middle of a 32bit block. This allows supporting 31 bits before and after the base value, which is deemed acceptable for the console sequence number during runtime. Here is an example to illustrate the previous and new mappings. For a base value (@rb_next_seq) of 2 2000 0000... Before this change the range of possible return values was: 1 2000 0001 to 2 2000 0000 __ulseq_to_u64seq(1fff ffff) => 2 1fff ffff __ulseq_to_u64seq(2000 0000) => 2 2000 0000 __ulseq_to_u64seq(2000 0001) => 1 2000 0001 __ulseq_to_u64seq(9fff ffff) => 1 9fff ffff __ulseq_to_u64seq(a000 0000) => 1 a000 0000 __ulseq_to_u64seq(a000 0001) => 1 a000 0001 After this change the range of possible return values are: 1 a000 0001 to 2 a000 0000 __ulseq_to_u64seq(1fff ffff) => 2 1fff ffff __ulseq_to_u64seq(2000 0000) => 2 2000 0000 __ulseq_to_u64seq(2000 0001) => 2 2000 0001 __ulseq_to_u64seq(9fff ffff) => 2 9fff ffff __ulseq_to_u64seq(a000 0000) => 2 a000 0000 __ulseq_to_u64seq(a000 0001) => 1 a000 0001 [ john.ogness: Rewrite commit message. ] Reported-by: Francesco Dolcini <francesco@dolcini.it> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202311161555.3ee16fc9-oliver.sang@intel.com Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> --- kernel/printk/printk_ringbuffer.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On 2023-12-14 22:47:49 [+0106], John Ogness wrote: > Note: This change only applies to 32bit architectures. On 64bit > architectures the macros are NOPs. You lost my authorship. But serious matter: Is this bug already present in the current printk tree or is this code just not used and enabled later? Patch #1 from this series moves the sequence around so is this problem introduced in the previous patch and fixed here? Sebastian
On 2023-12-15, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > On 2023-12-14 22:47:49 [+0106], John Ogness wrote: >> Note: This change only applies to 32bit architectures. On 64bit >> architectures the macros are NOPs. > > You lost my authorship. OK. Thanks for speaking up. (And thanks for doing the heavy lifting researching the needed change.) > But serious matter: Is this bug already present in the current printk > tree or is this code just not used and enabled later? Patch #1 from > this series moves the sequence around so is this problem introduced in > the previous patch and fixed here? No, it is not a bug at this point. As the commit message writes: "Until now this mapping has been correct for all call sites." But if the mapping is not changed here, it will become a bug after applying patch 3 of this series. John
On 2023-12-15 11:16:44 [+0106], John Ogness wrote: > No, it is not a bug at this point. As the commit message writes: > > "Until now this mapping has been correct for all call sites." > > But if the mapping is not changed here, it will become a bug after > applying patch 3 of this series. Okay. Thank you clarifying this. > John Sebastian
On Thu 2023-12-14 22:47:49, John Ogness wrote: > Note: This change only applies to 32bit architectures. On 64bit > architectures the macros are NOPs. > > __ulseq_to_u64seq() computes the upper 32 bits of the passed > argument value (@ulseq). The upper bits are derived from a base > value (@rb_next_seq) in a way that assumes @ulseq represents a > 64bit number that is less than or equal to @rb_next_seq. > > Until now this mapping has been correct for all call sites. However, > in a follow-up commit, values of @ulseq will be passed in that are > higher than the base value. This requires a change to how the 32bit > value is mapped to a 64bit sequence number. > > Rather than mapping @ulseq such that the base value is the end of a > 32bit block, map @ulseq such that the base value is in the middle of > a 32bit block. This allows supporting 31 bits before and after the > base value, which is deemed acceptable for the console sequence > number during runtime. > > Here is an example to illustrate the previous and new mappings. > > For a base value (@rb_next_seq) of 2 2000 0000... > > Before this change the range of possible return values was: > > 1 2000 0001 to 2 2000 0000 > > __ulseq_to_u64seq(1fff ffff) => 2 1fff ffff > __ulseq_to_u64seq(2000 0000) => 2 2000 0000 > __ulseq_to_u64seq(2000 0001) => 1 2000 0001 > __ulseq_to_u64seq(9fff ffff) => 1 9fff ffff > __ulseq_to_u64seq(a000 0000) => 1 a000 0000 > __ulseq_to_u64seq(a000 0001) => 1 a000 0001 > > After this change the range of possible return values are: > 1 a000 0001 to 2 a000 0000 > > __ulseq_to_u64seq(1fff ffff) => 2 1fff ffff > __ulseq_to_u64seq(2000 0000) => 2 2000 0000 > __ulseq_to_u64seq(2000 0001) => 2 2000 0001 > __ulseq_to_u64seq(9fff ffff) => 2 9fff ffff > __ulseq_to_u64seq(a000 0000) => 2 a000 0000 > __ulseq_to_u64seq(a000 0001) => 1 a000 0001 > > [ john.ogness: Rewrite commit message. ] > > Reported-by: Francesco Dolcini <francesco@dolcini.it> > Reported-by: kernel test robot <oliver.sang@intel.com> > Closes: https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com > Reported-by: kernel test robot <oliver.sang@intel.com> > Closes: https://lore.kernel.org/oe-lkp/202311161555.3ee16fc9-oliver.sang@intel.com > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > Signed-off-by: John Ogness <john.ogness@linutronix.de> Great catch! It must have been complicated to debug this. Reviewed-by: Petr Mladek <pmladek@suse.com> Best Regards, Petr
On Fri 2024-01-12 11:28:35, Petr Mladek wrote: > On Thu 2023-12-14 22:47:49, John Ogness wrote: > > Note: This change only applies to 32bit architectures. On 64bit > > architectures the macros are NOPs. > > > > __ulseq_to_u64seq() computes the upper 32 bits of the passed > > argument value (@ulseq). The upper bits are derived from a base > > value (@rb_next_seq) in a way that assumes @ulseq represents a > > 64bit number that is less than or equal to @rb_next_seq. > > > > Until now this mapping has been correct for all call sites. However, > > in a follow-up commit, values of @ulseq will be passed in that are > > higher than the base value. This requires a change to how the 32bit > > value is mapped to a 64bit sequence number. > > > > Rather than mapping @ulseq such that the base value is the end of a > > 32bit block, map @ulseq such that the base value is in the middle of > > a 32bit block. This allows supporting 31 bits before and after the > > base value, which is deemed acceptable for the console sequence > > number during runtime. > > > > Here is an example to illustrate the previous and new mappings. > > > > For a base value (@rb_next_seq) of 2 2000 0000... > > > > Before this change the range of possible return values was: > > > > 1 2000 0001 to 2 2000 0000 > > > > __ulseq_to_u64seq(1fff ffff) => 2 1fff ffff > > __ulseq_to_u64seq(2000 0000) => 2 2000 0000 > > __ulseq_to_u64seq(2000 0001) => 1 2000 0001 > > __ulseq_to_u64seq(9fff ffff) => 1 9fff ffff > > __ulseq_to_u64seq(a000 0000) => 1 a000 0000 > > __ulseq_to_u64seq(a000 0001) => 1 a000 0001 > > > > After this change the range of possible return values are: > > 1 a000 0001 to 2 a000 0000 > > > > __ulseq_to_u64seq(1fff ffff) => 2 1fff ffff > > __ulseq_to_u64seq(2000 0000) => 2 2000 0000 > > __ulseq_to_u64seq(2000 0001) => 2 2000 0001 > > __ulseq_to_u64seq(9fff ffff) => 2 9fff ffff > > __ulseq_to_u64seq(a000 0000) => 2 a000 0000 > > __ulseq_to_u64seq(a000 0001) => 1 a000 0001 > > > > [ john.ogness: Rewrite commit message. ] > > > > Reported-by: Francesco Dolcini <francesco@dolcini.it> > > Reported-by: kernel test robot <oliver.sang@intel.com> > > Closes: https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com > > Reported-by: kernel test robot <oliver.sang@intel.com> > > Closes: https://lore.kernel.org/oe-lkp/202311161555.3ee16fc9-oliver.sang@intel.com > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > Signed-off-by: John Ogness <john.ogness@linutronix.de> > > Great catch! It must have been complicated to debug this. > > Reviewed-by: Petr Mladek <pmladek@suse.com> That said, I am a bit nervous that a bug like this might cause workqueue stall and panic() the kernel. At least, this is how I read https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com/ [ 125.917650][ C0] workqueue events: flags=0x0 [ 125.917979][ C0] pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 [ 125.917984][ C0] pending: deferred_probe_timeout_work_func [ 125.917992][ C0] workqueue events_power_efficient: flags=0x82 [ 125.919409][ C0] pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256 refcnt=2 [ 125.919412][ C0] pending: do_cache_clean [ 125.919419][ C0] workqueue mm_percpu_wq: flags=0x8 [ 125.920619][ C0] pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 [ 125.920622][ C0] pending: vmstat_update [ 125.920632][ C0] Showing backtraces of running workers in stalled CPU-bound worker pools: [ 129.568547][ C1] rcu: INFO: rcu_sched self-detected stall on CPU [ 129.568553][ C1] rcu: 1-....: (26250 ticks this GP) idle=3c24/1/0x40000000 softirq=5858/5934 fqs=13122 [ 129.568558][ C1] rcu: (t=26251 jiffies g=301 q=598 ncpus=2) [ 129.568562][ C1] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W N 6.6.0-00134-g768c33033e41 #1 [ 129.568565][ C1] EIP: desc_read (kernel/printk/printk_ringbuffer.c:439) [ 129.568573][ C1] Code: fb 83 c4 08 5b 5e 5f 5d c3 8d 74 26 00 0f 0b eb d2 3e 8d 74 26 00 55 89 e5 57 89 cf 56 53 89 c3 83 ec 08 8b 08 b8 ff ff ff ff <89> 55 f0 d3 e0 f7 d0 21 d0 6b f0 58 83 f9 1f 0f 87 a7 e6 83 00 03 [ 129.568575][ C1] EAX: ffffffff EBX: c1d5cd60 ECX: 0000000f EDX: 3fff8001 [ 129.568576][ C1] ESI: 00000001 EDI: c320fd78 EBP: c320fd28 ESP: c320fd14 [ 129.568578][ C1] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00000286 [ 129.568582][ C1] CR0: 80050033 CR2: b7d77b93 CR3: 022b4000 CR4: 00040690 [ 129.568583][ C1] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 129.568584][ C1] DR6: fffe0ff0 DR7: 00000400 [ 129.568585][ C1] Call Trace: [ 129.568589][ C1] ? show_regs (arch/x86/kernel/dumpstack.c:479) [ 129.568596][ C1] ? dump_cpu_task (kernel/sched/core.c:11572) [ 129.568602][ C1] ? rcu_dump_cpu_stacks (include/linux/cpumask.h:143 include/linux/cpumask.h:150 include/linux/cpumask.h:212 kernel/rcu/tree_stall.h:367) [ 129.568606][ C1] ? vprintk (kernel/printk/printk_safe.c:46) [ 129.568611][ C1] ? print_cpu_stall (kernel/rcu/tree_stall.h:692) [ 129.568616][ C1] ? check_cpu_stall (kernel/rcu/tree_stall.h:775) [ 129.568620][ C1] ? rcu_sched_clock_irq (kernel/rcu/tree.c:3869 kernel/rcu/tree.c:2240) [ 129.568623][ C1] ? update_process_times (arch/x86/include/asm/preempt.h:27 kernel/time/timer.c:2080) [ 129.568628][ C1] ? tick_sched_timer (kernel/time/tick-sched.c:1497) [ 129.568632][ C1] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1688 kernel/time/hrtimer.c:1752) [ 129.568636][ C1] ? cpumask_weight (arch/x86/kernel/alternative.c:1766) [ 129.568640][ C1] ? hrtimer_interrupt (kernel/time/hrtimer.c:1817) [ 129.568645][ C1] ? sysvec_call_function_single (arch/x86/kernel/apic/apic.c:1074) [ 129.568650][ C1] ? __sysvec_apic_timer_interrupt (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:444 include/linux/jump_label.h:260 include/linux/jump_label.h:270 arch/x86/include/asm/trace/irq_vectors.h:41 arch/x86/kernel/apic/apic.c:1081) [ 129.568653][ C1] ? sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074) [ 129.568656][ C1] ? handle_exception (arch/x86/entry/entry_32.S:1049) [ 129.568662][ C1] ? sysvec_call_function_single (arch/x86/kernel/apic/apic.c:1074) [ 129.568664][ C1] ? desc_read (kernel/printk/printk_ringbuffer.c:439) [ 129.568668][ C1] ? sysvec_call_function_single (arch/x86/kernel/apic/apic.c:1074) [ 129.568670][ C1] ? desc_read (kernel/printk/printk_ringbuffer.c:439) [ 129.568675][ C1] desc_read_finalized_seq (kernel/printk/printk_ringbuffer.c:1907) [ 129.568680][ C1] prb_next_reserve_seq (kernel/printk/printk_ringbuffer.c:2084) [ 129.568682][ C1] ? nbcon_get_cpu_emergency_nesting (kernel/printk/nbcon.c:984) [ 129.568689][ C1] nbcon_atomic_flush_all (kernel/printk/nbcon.c:1133) [ 129.568692][ C1] ? nbcon_atomic_flush_all (kernel/printk/nbcon.c:1133) [ 129.568695][ C1] printk_trigger_flush (arch/x86/include/asm/preempt.h:85 kernel/printk/printk.c:3993) [ 129.568698][ C1] nbcon_cpu_emergency_exit (kernel/printk/nbcon.c:1193) [ 129.568701][ C1] ? refcount_warn_saturate (lib/refcount.c:28 (discriminator 3)) [ 129.568706][ C1] __warn (kernel/panic.c:692) [ 129.568711][ C1] ? refcount_warn_saturate (lib/refcount.c:28 (discriminator 3)) [ 129.568714][ C1] report_bug (lib/bug.c:201 lib/bug.c:219) [ 129.568720][ C1] ? exc_overflow (arch/x86/kernel/traps.c:250) [ 129.568722][ C1] handle_bug (arch/x86/kernel/traps.c:216) [ 129.568724][ C1] exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) [ 129.568727][ C1] handle_exception (arch/x86/entry/entry_32.S:1049) [ 129.568729][ C1] EIP: refcount_warn_saturate (lib/refcount.c:28 (discriminator 3)) [ 129.568731][ C1] Code: 79 18 c2 01 68 a8 b9 c3 c1 e8 03 7b cf ff 0f 0b 58 c9 c3 8d b6 00 00 00 00 c6 05 86 79 18 c2 01 68 50 b9 c3 c1 e8 e7 7a cf ff <0f> 0b 59 c9 c3 66 90 89 c2 8b 00 3d 00 00 00 c0 74 12 83 f8 01 74 It looks like it caused some loop and refcout overlow or so. But I might be wrong. I would like to better understand this and check if we could prevent it somehow. Best Regards, Petr
On 2024-01-12 19:14:44 [+0100], Petr Mladek wrote: > > That said, I am a bit nervous that a bug like this might cause > workqueue stall and panic() the kernel. > At least, this is how I read > https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com/ well, workqueue stalls and RCU as well because the CPU spins. That is a natural consequence because the CPU makes no progress (at boot). The panic _might_ be due to panic_on_error or so. There is no scheduler, nothing so one CPU is blocked and the world ends… > It looks like it caused some loop and refcout overlow or so. > But I might be wrong. > > I would like to better understand this and check if we could prevent > it somehow. Based on memory: the problem is that the sign extension bug (the fixed bug) returned the wrong or too low sequence number. So the printk code tried again to obtain a new sequence number. And got the wrong again. And this is what looped during boot. I'm not sure if this sort of lockup can happen now after the bug is fixed. I can issue a NMI backtrace on all CPUs (32) without the sync (so they all can printk immediately and not one after the other) and it prints and continues… > Best Regards, > Petr Sebastian
On 2024-01-12, Petr Mladek <pmladek@suse.com> wrote: >> > Reported-by: Francesco Dolcini <francesco@dolcini.it> >> > Reported-by: kernel test robot <oliver.sang@intel.com> >> > Closes: https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com >> > Reported-by: kernel test robot <oliver.sang@intel.com> >> > Closes: https://lore.kernel.org/oe-lkp/202311161555.3ee16fc9-oliver.sang@intel.com >> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> >> > Signed-off-by: John Ogness <john.ogness@linutronix.de> >> >> Great catch! It must have been complicated to debug this. >> >> Reviewed-by: Petr Mladek <pmladek@suse.com> > > That said, I am a bit nervous that a bug like this might cause > workqueue stall and panic() the kernel. > > At least, this is how I read > https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com/ Note that the bug is reported for code that mainline will never have. This patch "fixes" the problem before it is a problem. From the perspective of mainline, the problem never existed. Perhaps it is inappropriate to list these with the Closes tag. > It looks like it caused some loop and refcout overlow or so. > But I might be wrong. > > I would like to better understand this and check if we could prevent > it somehow. The code in question can be found here: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/kernel/printk/printk_ringbuffer.c?id=768c33033e41ad195a9304ebb183afb730d5ae98#n2031 The URL above points to prb_next_reserve_seq(), which is in an infinite loop. desc_last_finalized_seq() is returning a huge garbage value due to the 32bit sequence value 1 being incorrectly mapped to 0xffffffff00000001 (which is 18446744069414584321). When desc_read_finalized_seq() is called (line 2082), it obviously returns -EINVAL because there is no record with this huge sequence number. The code interrupts -EINVAL to mean that the record has been overwritten, and so it tries again with an updated last finalized sequence number => infinite loop. With this patch applied, the 32bit value 1 is correctly mapped to sequence number 1 and all is OK. The problem was that desc_last_finalized_seq() returned a sequence number that was higher than the highest record. That was the bug. As long as desc_last_finalized_seq() always returns a sequence number that exists or used to exist, the retry loop is fine. And since the tail record is always in the finalized state, there will always be a finalized record available. John
On Mon 2024-01-15 11:58:39, John Ogness wrote: > On 2024-01-12, Petr Mladek <pmladek@suse.com> wrote: > >> > Reported-by: Francesco Dolcini <francesco@dolcini.it> > >> > Reported-by: kernel test robot <oliver.sang@intel.com> > >> > Closes: https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com > >> > Reported-by: kernel test robot <oliver.sang@intel.com> > >> > Closes: https://lore.kernel.org/oe-lkp/202311161555.3ee16fc9-oliver.sang@intel.com > >> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > >> > Signed-off-by: John Ogness <john.ogness@linutronix.de> > >> > >> Great catch! It must have been complicated to debug this. > >> > >> Reviewed-by: Petr Mladek <pmladek@suse.com> > > > > That said, I am a bit nervous that a bug like this might cause > > workqueue stall and panic() the kernel. > > > > At least, this is how I read > > https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com/ > > Note that the bug is reported for code that mainline will never > have. This patch "fixes" the problem before it is a problem. From the > perspective of mainline, the problem never existed. Perhaps it is > inappropriate to list these with the Closes tag. I think that it is better to keep the reference. It might help archaeologists to understand the problem ;-) > > It looks like it caused some loop and refcout overlow or so. > > But I might be wrong. > > > > I would like to better understand this and check if we could prevent > > it somehow. > > The code in question can be found here: > > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/kernel/printk/printk_ringbuffer.c?id=768c33033e41ad195a9304ebb183afb730d5ae98#n2031 > > The URL above points to prb_next_reserve_seq(), which is in an infinite > loop. > > desc_last_finalized_seq() is returning a huge garbage value due to the > 32bit sequence value 1 being incorrectly mapped to 0xffffffff00000001 > (which is 18446744069414584321). When desc_read_finalized_seq() is > called (line 2082), it obviously returns -EINVAL because there is no > record with this huge sequence number. > > The code interrupts -EINVAL to mean that the record has been > overwritten, and so it tries again with an updated last finalized > sequence number => infinite loop. > > With this patch applied, the 32bit value 1 is correctly mapped to > sequence number 1 and all is OK. > > The problem was that desc_last_finalized_seq() returned a sequence > number that was higher than the highest record. That was the bug. As > long as desc_last_finalized_seq() always returns a sequence number that > exists or used to exist, the retry loop is fine. And since the tail > record is always in the finalized state, there will always be a > finalized record available. Thank you both for explanation. You are right that this was kind of special. You know, the code around reading the messages is getting more and more complex. Also the u32 -> u64 transition is not 100% safe. So some sanity checks might be useful. That said, I do not see any obvious trivial one. Just the following came to my mind. prb_first_seq() is reliable and we could do: void next_seq_sanity_check(struct printk_ringbuffer *rb, u64 next_seq) { struct prb_desc_ring *desc_ring = &rb->desc_ring; u64 first_seq = prb_first_seq(rb); u64 desc_count = DESC_COUNT(&rb->desc_ring)). WARN_ONCE(seq > first_seq + DESC_COUNT(desc_ring)); } Well, I am not sure if it is worth it. Also the WARN() won't be printed on consoles when the reading is broken. Best Regards, Petr
On 2024-01-15, Petr Mladek <pmladek@suse.com> wrote: > You know, the code around reading the messages is getting more > and more complex. Also the u32 -> u64 transition is not 100% safe. > So some sanity checks might be useful. This complexity only exists for 32bit systems, but yeah, it is still important. > That said, I do not see any obvious trivial one. Just the following > came to my mind. prb_first_seq() is reliable and we could do: > > void next_seq_sanity_check(struct printk_ringbuffer *rb, u64 next_seq) > { > struct prb_desc_ring *desc_ring = &rb->desc_ring; > u64 first_seq = prb_first_seq(rb); > u64 desc_count = DESC_COUNT(&rb->desc_ring)). > > WARN_ONCE(seq > first_seq + DESC_COUNT(desc_ring)); > } @seq is allowed to be 2^31 before or after @first_seq. The check would look more like this: WARN_ONCE((rb_first_seq < 0xffffffff80000000 && seq > rb_first_seq + 0x80000000) || (rb_first_seq > 0x80000000 && seq < rb_first_seq - 0x80000000)); > Well, I am not sure if it is worth it. Also the WARN() won't be > printed on consoles when the reading is broken. Broken printing is irrelevant. There are plenty of debug methods to get at the ringbuffer. I am OK with adding this sanity check (again, only for 32bit, within __ulseq_to_u64seq()). John
diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h index b82a96dc2ea2..12f60c782e46 100644 --- a/kernel/printk/printk_ringbuffer.h +++ b/kernel/printk/printk_ringbuffer.h @@ -407,7 +407,7 @@ static inline u64 __ulseq_to_u64seq(struct printk_ringbuffer *rb, u32 ulseq) * Also the access to the ring buffer is always safe. */ rb_next_seq = prb_next_seq(rb); - seq = rb_next_seq - ((u32)rb_next_seq - ulseq); + seq = rb_next_seq - (s32)((u32)rb_next_seq - ulseq); return seq; }