Message ID | 20221026185954.991547-1-pierre.gondois@arm.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp436780wru; Wed, 26 Oct 2022 12:03:02 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4Em4YPv55nwlaJbiCTLnjQhV+uo682+lKozsBWz/wtVcHbKML6a18fXZIIifnQQXUCUgZ4 X-Received: by 2002:a63:8943:0:b0:46f:3a91:3618 with SMTP id v64-20020a638943000000b0046f3a913618mr6317869pgd.16.1666810982200; Wed, 26 Oct 2022 12:03:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666810982; cv=none; d=google.com; s=arc-20160816; b=mnic3t3mzRmTimXk3GUf4rLKKoo+x6UcjS/cX0k0DG9JqWeJZqxacJ3twPsex4wVi3 4PCUapDuP5tGpimOK9jdaxVzZVAMUGMakjURTHU1ilR0BSGhkFEMVKloJOW9rrXO4QQf +e7L309ooz/xkZJwCaiSxEu3fWU9d28VmLytJ0fabJIIFOP19xXoxrEeuW9hJrnR6XiL f6HxjsXXa8tO6q6vksgbnOkX1HtOppVKaFbHz1PZzNG1Th5AZPRiPEYSLfVjxXeSepD9 hLSePAWcVOydX1KDpIH6RsNrelMblHN5DmBSbsmqvBOd0qRGdwJYivX8AHC2rgwO/S0T Tx2w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=9cOkhfL+bV+e8m2etM0HOfk8nawWJUaGuXm8XWIejFc=; b=IXFfXFgdxpK2vuvhs7LM1FaDhpXbVoYYiFkIL60iEngnqIxOcT755j+c6siVH1L/q4 g8VFdFB2Pzu6Le7kXe5DQg+2sGtE9Q9ajiMUGfLuBq7fP8rwjtsrF/57lbtjKjMHgJ14 ccpADwB1pz9g8HvhhGTTd4rnDkBrBP/Z04pwgb0NQEJbZdnuGFTnefs7i4RA/gc226fh gmImBTlo1UdqSQxhZTVTPS+mW2OsfzeBqAxAzgjApx9C/n1JZcTU0JtRNUS/+lq91RfG tzaCvkiQlFSLDzvVB//cn8EQd+QZpYTL+i5jgbFUPiJF5XOXPJTUrk4NwRwj1qXu9KZj YtxQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e15-20020a656bcf000000b0043966935b33si7214252pgw.166.2022.10.26.12.02.48; Wed, 26 Oct 2022 12:03:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234583AbiJZTCX (ORCPT <rfc822;pwkd43@gmail.com> + 99 others); Wed, 26 Oct 2022 15:02:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36490 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234678AbiJZTA7 (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 26 Oct 2022 15:00:59 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 9C43226AEF for <linux-kernel@vger.kernel.org>; Wed, 26 Oct 2022 12:00:05 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 118E223A; Wed, 26 Oct 2022 12:00:11 -0700 (PDT) Received: from pierre123.arm.com (unknown [10.57.2.137]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 2F3073F792; Wed, 26 Oct 2022 12:00:02 -0700 (PDT) From: Pierre Gondois <pierre.gondois@arm.com> To: linux-kernel@vger.kernel.org Cc: Rob.Herring@arm.com, sudeep.holla@arm.com, Pierre Gondois <pierre.gondois@arm.com>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, "Rafael J. Wysocki" <rafael@kernel.org> Subject: [PATCH] cacheinfo: Decrement refcount in cache_setup_of_node() Date: Wed, 26 Oct 2022 20:59:54 +0200 Message-Id: <20221026185954.991547-1-pierre.gondois@arm.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747777992641489448?= X-GMAIL-MSGID: =?utf-8?q?1747777992641489448?= |
Series |
cacheinfo: Decrement refcount in cache_setup_of_node()
|
|
Commit Message
Pierre Gondois
Oct. 26, 2022, 6:59 p.m. UTC
Refcounts to DT nodes are only incremented in the function
and never decremented. Decrease the refcounts when necessary.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
---
drivers/base/cacheinfo.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
Comments
On Wed, Oct 26, 2022 at 08:59:54PM +0200, Pierre Gondois wrote: > Refcounts to DT nodes are only incremented in the function > and never decremented. Decrease the refcounts when necessary. > Thanks for fixing this. I had observed this once before but I wasn't sure on of_node_get usage in general. Since we assign and use device_node pointers for comparison later, should be keep the reference or drop it. I still don't know and keep getting such questions whenever I touch anything around device_node reference. IIUC as long as we don't use that you access the node itself, we are safe to drop the reference. That said, this change looks good to me. Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> -- Regards, Sudeep
Hi Pierre, On Wed, Oct 26, 2022 at 9:03 PM Pierre Gondois <pierre.gondois@arm.com> wrote: > Refcounts to DT nodes are only incremented in the function > and never decremented. Decrease the refcounts when necessary. > > Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Thanks for your patch, which is now commit 3da72e18371c41a6 ("cacheinfo: Decrement refcount in cache_setup_of_node()") in driver-core-next. I have bisected a refcount underflow during s2ram to this commit: PM: suspend entry (deep) Filesystems sync: 0.003 seconds Freezing user space processes ... (elapsed 0.009 seconds) done. OOM killer disabled. Freezing remaining freezable tasks ... (elapsed 0.004 seconds) done. sd 0:0:0:0: [sda] Synchronizing SCSI cache sd 0:0:0:0: [sda] Stopping disk Disabling non-boot CPUs ... psci: CPU1 killed (polled 0 ms) ------------[ cut here ]------------ refcount_t: addition on 0; use-after-free. WARNING: CPU: 2 PID: 22 at lib/refcount.c:25 refcount_warn_saturate+0x98/0x140 CPU: 2 PID: 22 Comm: cpuhp/2 Not tainted 6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870 Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT) pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : refcount_warn_saturate+0x98/0x140 lr : refcount_warn_saturate+0x98/0x140 sp : ffff80000a3b3710 x29: ffff80000a3b3710 x28: ffff0a00ffffff05 x27: ffff80000a3b3aa2 x26: ffff8000091c5fe0 x25: ffff800008f2ade8 x24: ffff0006ff7c1bf0 x23: ffff80000a3b3a88 x22: 0000000000000000 x21: ffff80000a3b3aa7 x20: ffff0000ffffff00 x19: ffff0006ff7c1c58 x18: 0000000000000010 x17: 0000000000000004 x16: 0000000000000000 x15: ffff0004c028ca00 x14: 00000000000001e6 x13: ffff0004c028ca00 x12: 00000000ffffffea x11: 00000000ffffefff x10: 00000000ffffefff x9 : ffff8000095cc850 x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 000000000000bff4 x5 : ffff0006ff704b60 x4 : 0000000000000000 x3 : 0000000000000027 x2 : 0000000000000023 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: refcount_warn_saturate+0x98/0x140 kobject_get+0x94/0x98 of_node_get+0x1c/0x30 of_fwnode_get+0x30/0x50 fwnode_handle_get+0x2c/0x48 fwnode_get_nth_parent+0x70/0x80 fwnode_full_name_string+0x44/0xb8 device_node_string+0x318/0x3f0 pointer+0x248/0x350 vsnprintf+0x1fc/0x720 vprintk_store+0x104/0x4c8 vprintk_emit+0x104/0x340 vprintk_default+0x34/0x40 vprintk+0xc4/0xe8 _printk+0x5c/0x80 of_node_release+0xc0/0xe0 kobject_put+0xa4/0xf0 of_node_put+0x14/0x20 free_cache_attributes.part.0+0x138/0x190 cacheinfo_cpu_pre_down+0x70/0x88 cpuhp_invoke_callback+0x1d8/0xbc0 cpuhp_thread_fun+0x1b0/0x218 smpboot_thread_fn+0x1b4/0x270 kthread+0x10c/0x118 ret_from_fork+0x10/0x20 irq event stamp: 438 hardirqs last enabled at (437): [<ffff80000827c668>] put_cpu_partial+0x1e8/0x258 hardirqs last disabled at (438): [<ffff8000080ff52c>] vprintk_store+0x394/0x4c8 softirqs last enabled at (0): [<ffff8000080820dc>] copy_process+0x6bc/0x1888 softirqs last disabled at (0): [<0000000000000000>] 0x0 ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 2 PID: 22 at lib/refcount.c:28 refcount_warn_saturate+0xec/0x140 CPU: 2 PID: 22 Comm: cpuhp/2 Tainted: G W 6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870 Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT) pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : refcount_warn_saturate+0xec/0x140 lr : refcount_warn_saturate+0xec/0x140 sp : ffff80000a3b3720 x29: ffff80000a3b3720 x28: ffff0a00ffffff05 x27: ffff80000a3b3aa2 x26: ffff8000091c5fe0 x25: ffff800008f2ade8 x24: ffff0006ff7c1bf0 x23: ffff80000a3b3a88 x22: 00000000ffffffff x21: ffff80000a3b3aba x20: ffff0000ffffff00 x19: ffff0006ff7c1c58 x18: 0000000000000010 x17: 0000000000000004 x16: 0000000000000000 x15: ffff0004c028ca00 x14: 0000000000000219 x13: ffff0004c028ca00 x12: 00000000ffffffea x11: 00000000ffffefff x10: 00000000ffffefff x9 : ffff8000095cc850 x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 000000000000bff4 x5 : ffff0006ff704b60 x4 : 0000000000000000 x3 : 0000000000000027 x2 : 0000000000000023 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: refcount_warn_saturate+0xec/0x140 kobject_put+0xe4/0xf0 of_node_put+0x14/0x20 of_fwnode_put+0x30/0x40 fwnode_handle_put.part.0+0x1c/0x28 fwnode_handle_put+0x18/0x28 fwnode_full_name_string+0x94/0xb8 device_node_string+0x318/0x3f0 pointer+0x248/0x350 vsnprintf+0x1fc/0x720 vprintk_store+0x104/0x4c8 vprintk_emit+0x104/0x340 vprintk_default+0x34/0x40 vprintk+0xc4/0xe8 _printk+0x5c/0x80 of_node_release+0xc0/0xe0 kobject_put+0xa4/0xf0 of_node_put+0x14/0x20 free_cache_attributes.part.0+0x138/0x190 cacheinfo_cpu_pre_down+0x70/0x88 cpuhp_invoke_callback+0x1d8/0xbc0 cpuhp_thread_fun+0x1b0/0x218 smpboot_thread_fn+0x1b4/0x270 kthread+0x10c/0x118 ret_from_fork+0x10/0x20 irq event stamp: 438 hardirqs last enabled at (437): [<ffff80000827c668>] put_cpu_partial+0x1e8/0x258 hardirqs last disabled at (438): [<ffff8000080ff52c>] vprintk_store+0x394/0x4c8 softirqs last enabled at (0): [<ffff8000080820dc>] copy_process+0x6bc/0x1888 softirqs last disabled at (0): [<0000000000000000>] 0x0 ---[ end trace 0000000000000000 ]--- OF: ERROR: Bad of_node_put() on /cpus/cache-controller-0 ------------[ cut here ]------------ refcount_t: saturated; leaking memory. WARNING: CPU: 2 PID: 22 at lib/refcount.c:22 refcount_warn_saturate+0x6c/0x140 CPU: 2 PID: 22 Comm: cpuhp/2 Tainted: G W 6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870 Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT) pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : refcount_warn_saturate+0x6c/0x140 lr : refcount_warn_saturate+0x6c/0x140 sp : ffff80000a3b3670 x29: ffff80000a3b3670 x28: ffff0a00ffffff05 x27: ffff80000a1f8bfa x26: ffff8000091c5fe0 x25: ffff800008f2ade8 x24: ffff0006ff7c1bf0 x23: ffff80000a1f8c14 x22: 0000000000000000 x21: ffff80000a1f8bff x20: ffff0000ffffff00 x19: ffff0006ff7c1c58 x18: 0000000000000010 x17: 0000000000000004 x16: 0000000000000000 x15: ffff0004c028ca00 x14: 000000000000024a x13: ffff0004c028ca00 x12: 00000000ffffffea x11: 00000000ffffefff x10: 00000000ffffefff x9 : ffff8000095cc850 x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 000000000000bff4 x5 : ffff0006ff704b60 x4 : 0000000000000000 x3 : 0000000000000027 x2 : 0000000000000023 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: refcount_warn_saturate+0x6c/0x140 kobject_get+0x7c/0x98 of_node_get+0x1c/0x30 of_fwnode_get+0x30/0x50 fwnode_handle_get+0x2c/0x48 fwnode_get_nth_parent+0x70/0x80 fwnode_full_name_string+0x44/0xb8 device_node_string+0x318/0x3f0 pointer+0x248/0x350 vsnprintf+0x1fc/0x720 vscnprintf+0x30/0x50 printk_sprint+0x3c/0x1a8 vprintk_store+0x414/0x4c8 vprintk_emit+0x104/0x340 vprintk_default+0x34/0x40 vprintk+0xc4/0xe8 _printk+0x5c/0x80 of_node_release+0xc0/0xe0 kobject_put+0xa4/0xf0 of_node_put+0x14/0x20 free_cache_attributes.part.0+0x138/0x190 cacheinfo_cpu_pre_down+0x70/0x88 cpuhp_invoke_callback+0x1d8/0xbc0 cpuhp_thread_fun+0x1b0/0x218 smpboot_thread_fn+0x1b4/0x270 kthread+0x10c/0x118 ret_from_fork+0x10/0x20 irq event stamp: 438 hardirqs last enabled at (437): [<ffff80000827c668>] put_cpu_partial+0x1e8/0x258 hardirqs last disabled at (438): [<ffff8000080ff52c>] vprintk_store+0x394/0x4c8 softirqs last enabled at (0): [<ffff8000080820dc>] copy_process+0x6bc/0x1888 softirqs last disabled at (0): [<0000000000000000>] 0x0 ---[ end trace 0000000000000000 ]--- CPU: 2 PID: 22 Comm: cpuhp/2 Tainted: G W 6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870 Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT) Call trace: dump_backtrace+0xcc/0xd8 show_stack+0x14/0x38 dump_stack_lvl+0x88/0xb0 dump_stack+0x14/0x2c of_node_release+0xc4/0xe0 kobject_put+0xa4/0xf0 of_node_put+0x14/0x20 free_cache_attributes.part.0+0x138/0x190 cacheinfo_cpu_pre_down+0x70/0x88 cpuhp_invoke_callback+0x1d8/0xbc0 cpuhp_thread_fun+0x1b0/0x218 smpboot_thread_fn+0x1b4/0x270 kthread+0x10c/0x118 ret_from_fork+0x10/0x20 psci: CPU2 killed (polled 0 ms) psci: CPU3 killed (polled 0 ms) psci: CPU4 killed (polled 4 ms) OF: ERROR: Bad of_node_put() on /cpus/cache-controller-1 CPU: 5 PID: 37 Comm: cpuhp/5 Tainted: G W 6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870 Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT) Call trace: dump_backtrace+0xcc/0xd8 show_stack+0x14/0x38 dump_stack_lvl+0x88/0xb0 dump_stack+0x14/0x2c of_node_release+0xc4/0xe0 kobject_put+0xa4/0xf0 of_node_put+0x14/0x20 free_cache_attributes.part.0+0x138/0x190 cacheinfo_cpu_pre_down+0x70/0x88 cpuhp_invoke_callback+0x1d8/0xbc0 cpuhp_thread_fun+0x1b0/0x218 smpboot_thread_fn+0x1b4/0x270 kthread+0x10c/0x118 ret_from_fork+0x10/0x20 psci: CPU5 killed (polled 0 ms) psci: CPU6 killed (polled 0 ms) psci: CPU7 killed (polled 0 ms) Reverting this commit fixes the issue. > --- a/drivers/base/cacheinfo.c > +++ b/drivers/base/cacheinfo.c > @@ -196,7 +196,7 @@ static void cache_of_set_props(struct cacheinfo *this_leaf, > > static int cache_setup_of_node(unsigned int cpu) > { > - struct device_node *np; > + struct device_node *np, *prev; > struct cacheinfo *this_leaf; > unsigned int index = 0; > > @@ -206,19 +206,24 @@ static int cache_setup_of_node(unsigned int cpu) > return -ENOENT; > } > > + prev = np; > + > while (index < cache_leaves(cpu)) { > this_leaf = per_cpu_cacheinfo_idx(cpu, index); > - if (this_leaf->level != 1) > + if (this_leaf->level != 1) { > np = of_find_next_cache_node(np); > - else > - np = of_node_get(np);/* cpu node itself */ > - if (!np) > - break; > + of_node_put(prev); > + prev = np; > + if (!np) > + break; > + } > cache_of_set_props(this_leaf, np); > this_leaf->fw_token = np; > index++; > } > > + of_node_put(np); > + > if (index != cache_leaves(cpu)) /* not all OF nodes populated */ > return -ENOENT; > Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Hi Geert, On 15.11.2022 14:06, Geert Uytterhoeven wrote: > On Wed, Oct 26, 2022 at 9:03 PM Pierre Gondois <pierre.gondois@arm.com> wrote: >> Refcounts to DT nodes are only incremented in the function >> and never decremented. Decrease the refcounts when necessary. >> >> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> > Thanks for your patch, which is now commit 3da72e18371c41a6 > ("cacheinfo: Decrement refcount in cache_setup_of_node()") in > driver-core-next. > > I have bisected a refcount underflow during s2ram to this commit: Similar issue can be reproduced with qemu/arm64 'virt' machine during boot: cacheinfo: Unable to detect cache hierarchy for CPU 0 ------------[ cut here ]------------ refcount_t: addition on 0; use-after-free. WARNING: CPU: 0 PID: 1 at lib/refcount.c:25 refcount_warn_saturate+0xa0/0x144 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc1+ #13076 Hardware name: linux,dummy-virt (DT) pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : refcount_warn_saturate+0xa0/0x144 lr : refcount_warn_saturate+0xa0/0x144 ... Call trace: refcount_warn_saturate+0xa0/0x144 kobject_get+0xbc/0xc0 of_node_get+0x20/0x34 of_fwnode_get+0x34/0x54 fwnode_handle_get+0x30/0x4c fwnode_get_nth_parent+0xec/0x104 fwnode_full_name_string+0x48/0xc0 device_node_string+0x278/0x530 pointer+0x3a0/0x4bc vsnprintf+0x23c/0x750 vprintk_store+0xf8/0x4a0 vprintk_emit+0x110/0x360 vprintk_default+0x38/0x44 vprintk+0xc0/0xe4 _printk+0x5c/0x84 of_node_release+0x140/0x150 kobject_put+0xa4/0x120 of_node_put+0x18/0x2c free_cache_attributes+0x13c/0x1e0 detect_cache_attributes+0x4f4/0x580 update_siblings_masks+0x28/0x300 store_cpu_topology+0x58/0x80 smp_prepare_cpus+0x38/0x120 kernel_init_freeable+0x1b8/0x350 kernel_init+0x24/0x130 ret_from_fork+0x10/0x20 irq event stamp: 74 hardirqs last enabled at (73): [<ffffb81addf20ac0>] __up_console_sem+0x6c/0x70 hardirqs last disabled at (74): [<ffffb81addf22658>] vprintk_store+0x298/0x4a0 softirqs last enabled at (10): [<ffffb81adde1046c>] _stext+0x46c/0x5d8 softirqs last disabled at (3): [<ffffb81adde166d0>] ____do_softirq+0x10/0x20 ---[ end trace 0000000000000000 ]--- I've also observed it during the manual CPU hot plug test (the log comes from OdroidC4 board): # for i in /sys/devices/system/cpu/cpu[1-8]; do echo 0 >$i/online; psci: CPU1 killed (polled 4 ms) ------------[ cut here ]------------ refcount_t: addition on 0; use-after-free. WARNING: CPU: 2 PID: 23 at lib/refcount.c:25 refcount_warn_saturate+0xa0/0x144 Modules linked in: ipv6 meson_gxl dw_hdmi_i2s_audio dwmac_generic rc_odroid axg_audio sclk_div clk_phase mdio_mux_meson_g12a meson_ir snd_soc_meson_axg_tdmout snd son_g12a_tohdmitx snd_soc_meson_codec_glue crct10dif_ce snd_soc_meson_axg_fifo meson_gxbb_wdt reset_meson_audio_arb pwm_meson meson_dw_hdmi meson_rng rtc_meson_vr _platform snd_soc_meson_axg_sound_card drm_display_helper rng_core snd_soc_meson_card_utils stmmac panfrost meson_drm drm_shmem_helper snd_soc_meson_axg_tdm_inter anvas nvmem_meson_efuse snd_soc_meson_axg_tdm_formatter display_connector CPU: 2 PID: 23 Comm: cpuhp/2 Not tainted 6.1.0-rc5-next-20221115+ #5935 Hardware name: Hardkernel ODROID-C4 (DT) pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : refcount_warn_saturate+0xa0/0x144 lr : refcount_warn_saturate+0xa0/0x144 ... Call trace: refcount_warn_saturate+0xa0/0x144 kobject_get+0xbc/0xc0 of_node_get+0x20/0x34 of_fwnode_get+0x34/0x54 fwnode_handle_get+0x30/0x4c fwnode_get_nth_parent+0xec/0x104 fwnode_full_name_string+0x48/0xc0 device_node_string+0x278/0x530 pointer+0x3a0/0x4bc vsnprintf+0x23c/0x750 vprintk_store+0xf8/0x4a0 vprintk_emit+0x10c/0x35c vprintk_default+0x38/0x44 vprintk+0xc0/0xe4 _printk+0x5c/0x84 of_node_release+0x140/0x150 kobject_put+0xa4/0x120 of_node_put+0x18/0x2c free_cache_attributes+0x13c/0x1e0 cacheinfo_cpu_pre_down+0x80/0x90 cpuhp_invoke_callback+0x16c/0x2b0 cpuhp_thread_fun+0x18c/0x250 smpboot_thread_fn+0x1f4/0x2a0 kthread+0x118/0x11c ret_from_fork+0x10/0x20 irq event stamp: 204 hardirqs last enabled at (203): [<ffff800008330260>] put_cpu_partial+0x1bc/0x1fc hardirqs last disabled at (204): [<ffff8000081247ac>] vprintk_store+0x298/0x4a0 softirqs last enabled at (16): [<ffff80000801046c>] _stext+0x46c/0x5d8 softirqs last disabled at (3): [<ffff800008016700>] ____do_softirq+0x10/0x20 ---[ end trace 0000000000000000 ]--- Best regards
On 11/15/22 20:30, Marek Szyprowski wrote: > Hi Geert, > > On 15.11.2022 14:06, Geert Uytterhoeven wrote: >> On Wed, Oct 26, 2022 at 9:03 PM Pierre Gondois <pierre.gondois@arm.com> wrote: >>> Refcounts to DT nodes are only incremented in the function >>> and never decremented. Decrease the refcounts when necessary. >>> >>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> >> Thanks for your patch, which is now commit 3da72e18371c41a6 >> ("cacheinfo: Decrement refcount in cache_setup_of_node()") in >> driver-core-next. >> >> I have bisected a refcount underflow during s2ram to this commit: > > Similar issue can be reproduced with qemu/arm64 'virt' machine during boot: > Hello Geert, Marek, Thanks for reporting the issue. The patch at: https://lore.kernel.org/all/20221115220520.2076189-1-pierre.gondois@arm.com/ should fix it, Regards, Pierre
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c index 1448a329f66d..10ddb05a6e39 100644 --- a/drivers/base/cacheinfo.c +++ b/drivers/base/cacheinfo.c @@ -196,7 +196,7 @@ static void cache_of_set_props(struct cacheinfo *this_leaf, static int cache_setup_of_node(unsigned int cpu) { - struct device_node *np; + struct device_node *np, *prev; struct cacheinfo *this_leaf; unsigned int index = 0; @@ -206,19 +206,24 @@ static int cache_setup_of_node(unsigned int cpu) return -ENOENT; } + prev = np; + while (index < cache_leaves(cpu)) { this_leaf = per_cpu_cacheinfo_idx(cpu, index); - if (this_leaf->level != 1) + if (this_leaf->level != 1) { np = of_find_next_cache_node(np); - else - np = of_node_get(np);/* cpu node itself */ - if (!np) - break; + of_node_put(prev); + prev = np; + if (!np) + break; + } cache_of_set_props(this_leaf, np); this_leaf->fw_token = np; index++; } + of_node_put(np); + if (index != cache_leaves(cpu)) /* not all OF nodes populated */ return -ENOENT;