[1/1] driver core: Avoid using fwnode in __fwnode_link_del()

Message ID 20231110170121.769221-1-herve.codina@bootlin.com
State New

Commit Message

Herve Codina Nov. 10, 2023, 5:01 p.m. UTC
A refcount issue can appear in __fwnode_link_del() due to the
pr_debug() call:
  WARNING: CPU: 0 PID: 901 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110
  Call Trace:
  <TASK>
  ? refcount_warn_saturate+0xe5/0x110
  ? __warn+0x81/0x130
  ? refcount_warn_saturate+0xe5/0x110
  ? report_bug+0x191/0x1c0
  ? srso_alias_return_thunk+0x5/0x7f
  ? prb_read_valid+0x1b/0x30
  ? handle_bug+0x3c/0x80
  ? exc_invalid_op+0x17/0x70
  ? asm_exc_invalid_op+0x1a/0x20
  ? refcount_warn_saturate+0xe5/0x110
  kobject_get+0x68/0x70
  of_node_get+0x1e/0x30
  of_fwnode_get+0x28/0x40
  fwnode_full_name_string+0x34/0x90
  fwnode_string+0xdb/0x140
  vsnprintf+0x17b/0x630
  va_format.isra.0+0x71/0x130
  vsnprintf+0x17b/0x630
  vprintk_store+0x162/0x4d0
  ? srso_alias_return_thunk+0x5/0x7f
  ? srso_alias_return_thunk+0x5/0x7f
  ? srso_alias_return_thunk+0x5/0x7f
  ? try_to_wake_up+0x9c/0x620
  ? rwsem_mark_wake+0x1b2/0x310
  vprintk_emit+0xe4/0x2b0
  _printk+0x5c/0x80
  __dynamic_pr_debug+0x131/0x160
  ? srso_alias_return_thunk+0x5/0x7f
  __fwnode_link_del+0x25/0xa0
  fwnode_links_purge+0x39/0xb0
  of_node_release+0xd9/0x180
  kobject_put+0x7b/0x190
  ...

Indeed, an of_node is being destroyed: of_node_release() is called
because the of_node refcount reached 0.
of_node_release() calls fwnode_links_purge() to purge the links, which
ends up calling __fwnode_link_del().
__fwnode_link_del() calls pr_debug() to print the fwnodes (of_nodes)
involved in the link, so this call is made while one of them is no
longer available (i.e. the one related to the of_node_release() call).

Remove the pr_debug() call to avoid using the link's fwnodes while
destroying the fwnode itself.

Fixes: ebd6823af378 ("driver core: Add debug logs when fwnode links are added/deleted")
Cc: stable@vger.kernel.org
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
---
 drivers/base/core.c | 2 --
 1 file changed, 2 deletions(-)
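
For illustration only, the sequence described in the commit message can
be modelled in user space. This is a minimal sketch, not kernel code:
struct node, node_get(), node_put(), print_full_name() and
release_with_debug() are hypothetical stand-ins for the kobject
refcounting, the %pfwf formatting and the of_node_release() path. It
assumes that taking a reference on an object whose count is already 0 is
what raises the refcount warning.

```c
/* User-space model, not kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct node {
	const char *name;
	int refcount;
	void (*release)(struct node *n);
};

static int warnings; /* number of modelled refcount warnings */

/* Stand-in for kobject_get(): a get on a dead object warns. */
static bool node_get(struct node *n)
{
	if (n->refcount == 0) {
		warnings++;
		fprintf(stderr, "WARNING: get on '%s' with refcount 0\n",
			n->name);
		return false;
	}
	n->refcount++;
	return true;
}

/* Stand-in for kobject_put(): release runs when the count hits 0. */
static void node_put(struct node *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0 && n->release)
		n->release(n);
}

/* Stand-in for the name formatting: holds a reference while printing. */
static void print_full_name(struct node *n)
{
	bool held = node_get(n);

	printf("%s Dropping the link\n", n->name);
	if (held)
		node_put(n);
}

/* Stand-in for the release path that ends in the debug print. */
static void release_with_debug(struct node *n)
{
	print_full_name(n);
}

/* Drops the last reference and reports how many warnings were raised. */
int warnings_after_last_put(void)
{
	struct node consumer = { "consumer", 1, release_with_debug };

	warnings = 0;
	node_put(&consumer);
	return warnings;
}
```

In this model, dropping the last reference runs the release callback,
and the debug print inside it then tries to take a reference on the
node that is being released, which is the step that warns.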
  

Comments

Saravana Kannan Nov. 10, 2023, 8:09 p.m. UTC | #1
On Fri, Nov 10, 2023 at 9:01 AM Herve Codina <herve.codina@bootlin.com> wrote:
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index f4b09691998e..62088c663014 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -109,8 +109,6 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
>   */
>  static void __fwnode_link_del(struct fwnode_link *link)
>  {
> -       pr_debug("%pfwf Dropping the fwnode link to %pfwf\n",
> -                link->consumer, link->supplier);

Valid issue, but a NACK for the patch.

The pr_debug has been very handy, so I don't want to delete it. Also,
the fwnode link can't get deleted before the supplier/consumer. If it
is, I need to take a closer look as I'd expect the list_del() to cause
corruption. My guess is that the %pfwf is traversing stuff that's
causing an issue. But let me take a closer look next week when I'll be
at LPC.

-Saravana

>         list_del(&link->s_hook);
>         list_del(&link->c_hook);
>         kfree(link);
> --
> 2.41.0
>
  
Herve Codina Nov. 13, 2023, 5:35 p.m. UTC | #2
Hi Saravana,

On Fri, 10 Nov 2023 12:09:02 -0800
Saravana Kannan <saravanak@google.com> wrote:

> On Fri, Nov 10, 2023 at 9:01 AM Herve Codina <herve.codina@bootlin.com> wrote:
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index f4b09691998e..62088c663014 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -109,8 +109,6 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> >   */
> >  static void __fwnode_link_del(struct fwnode_link *link)
> >  {
> > -       pr_debug("%pfwf Dropping the fwnode link to %pfwf\n",
> > -                link->consumer, link->supplier);  
> 
> Valid issue, but a NACK for the patch.
> 
> The pr_debug has been very handy, so I don't want to delete it. Also,
> the fwnode link can't get deleted before the supplier/consumer. If it
> is, I need to take a closer look as I'd expect the list_del() to cause
> corruption. My guess is that the %pfwf is traversing stuff that's
> causing an issue. But let me take a closer look next week when I'll be
> at LPC.
> 

The issue is really related to printing the full name (%pfwf) of the node
being destroyed by of_node_release() due to refcount == 0.
The issue does not appear with %pfwP.

I looked at printk(). With %pfwf, fwnode_handle_{get,put}() is called for
the current node and its parents, whereas %pfwP does not call
fwnode_handle_{get,put}() on the current node.

A fix can probably be done at the printk() level to avoid the
fwnode_handle_{get,put}() calls for the current node in the case of %pfwf.
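
As a rough illustration of that idea, here is a user-space sketch, not
the printk() code. struct fw_model, fw_model_get(), fw_model_put() and
build_path() are hypothetical names, and the node names are made up. The
ref_self flag selects whether the starting node is referenced while its
full path is built. Ancestors are always referenced, as described above
for %pfwf.

```c
/* User-space model, not kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct fw_model {
	const char *name;
	struct fw_model *parent;
	int refcount;
};

static int warnings; /* number of modelled refcount warnings */

/* Stand-in for fwnode_handle_get(): a get on a dead node warns. */
static bool fw_model_get(struct fw_model *n)
{
	if (n->refcount == 0) {
		warnings++;
		return false;
	}
	n->refcount++;
	return true;
}

/* Stand-in for fwnode_handle_put(), without any release handling. */
static void fw_model_put(struct fw_model *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

/*
 * Appends the full path of n to buf, root first. Ancestors are always
 * referenced while they are used; n itself only when ref_self is true.
 */
static void build_path(struct fw_model *n, char *buf, size_t size,
		       bool ref_self)
{
	bool held = ref_self && fw_model_get(n);
	size_t len;

	if (n->parent)
		build_path(n->parent, buf, size, true);
	len = strlen(buf);
	if (len < size)
		snprintf(buf + len, size - len, "/%s", n->name);
	if (held)
		fw_model_put(n);
}

/* Prints the path of a node whose refcount is already 0. */
int warnings_for_dying_node(bool ref_self)
{
	struct fw_model soc = { "soc", NULL, 1 };
	struct fw_model i2c = { "i2c", &soc, 1 };
	struct fw_model codec = { "codec", &i2c, 0 };
	char path[64] = "";

	warnings = 0;
	build_path(&codec, path, sizeof(path), ref_self);
	assert(strcmp(path, "/soc/i2c/codec") == 0);
	return warnings;
}
```

In this model the path comes out the same either way. Only the variant
that references the starting node warns when that node is the one being
released.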

I will write a patch along these lines instead of removing the pr_debug()
call in __fwnode_link_del().

Best regards,
Hervé
  

Patch

diff --git a/drivers/base/core.c b/drivers/base/core.c
index f4b09691998e..62088c663014 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -109,8 +109,6 @@  int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
  */
 static void __fwnode_link_del(struct fwnode_link *link)
 {
-	pr_debug("%pfwf Dropping the fwnode link to %pfwf\n",
-		 link->consumer, link->supplier);
 	list_del(&link->s_hook);
 	list_del(&link->c_hook);
 	kfree(link);