Message ID | 20231130174126.688486-2-herve.codina@bootlin.com |
---|---|
State | New |
Headers |
From: Herve Codina <herve.codina@bootlin.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, "Rafael J. Wysocki" <rafael@kernel.org>, Rob Herring <robh+dt@kernel.org>, Frank Rowand <frowand.list@gmail.com>
Cc: Lizhi Hou <lizhi.hou@amd.com>, Max Zhen <max.zhen@amd.com>, Sonal Santan <sonal.santan@amd.com>, Stefano Stabellini <stefano.stabellini@xilinx.com>, Jonathan Cameron <Jonathan.Cameron@Huawei.com>, linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, Allan Nielsen <allan.nielsen@microchip.com>, Horatiu Vultur <horatiu.vultur@microchip.com>, Steen Hegelund <steen.hegelund@microchip.com>, Thomas Petazzoni <thomas.petazzoni@bootlin.com>, Herve Codina <herve.codina@bootlin.com>
Subject: [PATCH 1/2] driver core: Introduce device_link_wait_removal()
Date: Thu, 30 Nov 2023 18:41:08 +0100
Message-ID: <20231130174126.688486-2-herve.codina@bootlin.com>
In-Reply-To: <20231130174126.688486-1-herve.codina@bootlin.com>
References: <20231130174126.688486-1-herve.codina@bootlin.com> |
Series |
Synchronize DT overlay removal with devlink removals
|
|
Commit Message
Herve Codina
Nov. 30, 2023, 5:41 p.m. UTC
The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
introduces a workqueue to release the consumer and supplier devices used
in the devlink.
In the queued job, devices are released and, in turn, when all the
references to these devices are dropped, the release function of the
device itself is called.
Nothing provides synchronisation with this workqueue to ensure that
all ongoing release operations are done so that other operations can
be started safely.
For instance, in the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()
During step 1, devices are released and the related devlinks are removed
(jobs pushed to the workqueue).
During step 2, OF nodes are destroyed but, without any
synchronisation with devlink removal jobs, of_overlay_remove() can raise
warnings related to missing of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2
Indeed, the missing of_node_put() call is going to be done, too late,
from the workqueue job execution.
Introduce device_link_wait_removal() to offer a way to synchronize
operations waiting for the end of devlink removals (i.e. end of
workqueue jobs).
Also, as a flushing operation is done on the workqueue, the workqueue
used is moved from a system-wide workqueue to a local one.
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
---
drivers/base/core.c | 26 +++++++++++++++++++++++---
include/linux/device.h | 1 +
2 files changed, 24 insertions(+), 3 deletions(-)
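The ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is not kernel code, the names `model_node`, `model_wq`, `model_queue_link_removal()`, `model_wait_removal()` and `model_overlay_remove()` are hypothetical, and the "workqueue" is a plain array whose jobs run only when it is flushed. It shows why the refcount check in step 2 can see 2 instead of 1 unless pending link removals are waited for first.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_JOBS 8

/* Hypothetical stand-in for an OF node: only the refcount matters here. */
struct model_node {
	int refcount;
};

/* Stand-in for the dedicated workqueue: queued jobs run only on flush. */
struct model_wq {
	struct model_node *jobs[MODEL_MAX_JOBS];
	size_t count;
};

/* Models devlink_dev_release(): the node put is deferred, not immediate. */
static int model_queue_link_removal(struct model_wq *wq, struct model_node *node)
{
	if (wq->count >= MODEL_MAX_JOBS)
		return -1;
	wq->jobs[wq->count++] = node;
	return 0;
}

/* Models device_link_wait_removal(): run every job queued so far. */
static void model_wait_removal(struct model_wq *wq)
{
	for (size_t i = 0; i < wq->count; i++)
		wq->jobs[i]->refcount--;	/* the deferred "of_node_put()" */
	wq->count = 0;
}

/*
 * Models the depopulate-then-remove sequence and returns the refcount that
 * the overlay removal step observes. A value of 1 is what it expects.
 */
int model_overlay_remove(int wait_for_links)
{
	struct model_node node = { .refcount = 2 };	/* owner + device link */
	struct model_wq wq = { .count = 0 };
	int ret, seen;

	/* Step 1: depopulate; the link removal job is only queued. */
	ret = model_queue_link_removal(&wq, &node);
	assert(ret == 0);
	(void)ret;

	if (wait_for_links)
		model_wait_removal(&wq);

	/* Step 2: overlay removal checks the refcount. */
	seen = node.refcount;

	/* Without the wait, the put still happens, but too late. */
	model_wait_removal(&wq);
	return seen;
}
```

In this model, `model_overlay_remove(0)` returns 2 because the deferred put has not run yet, while `model_overlay_remove(1)` returns 1, the value the overlay removal step expects.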
Comments
On Thu, Nov 30, 2023 at 9:41 AM Herve Codina <herve.codina@bootlin.com> wrote:
>
> The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
> introduces a workqueue to release the consumer and supplier devices used
> in the devlink.
> In the job queued, devices are released and in turn, when all the
> references to these devices are dropped, the release function of the
> device itself is called.
>
> Nothing is present to provide some synchronisation with this workqueue
> in order to ensure that all ongoing releasing operations are done and
> so, some other operations can be started safely.
>
> For instance, in the following sequence:
>   1) of_platform_depopulate()
>   2) of_overlay_remove()
>
> During the step 1, devices are released and related devlinks are removed
> (jobs pushed in the workqueue).
> During the step 2, OF nodes are destroyed but, without any
> synchronisation with devlink removal jobs, of_overlay_remove() can raise
> warnings related to missing of_node_put():
>   ERROR: memory leak, expected refcount 1 instead of 2
>
> Indeed, the missing of_node_put() call is going to be done, too late,
> from the workqueue job execution.
>
> Introduce device_link_wait_removal() to offer a way to synchronize
> operations waiting for the end of devlink removals (i.e. end of
> workqueue jobs).
> Also, as a flushing operation is done on the workqueue, the workqueue
> used is moved from a system-wide workqueue to a local one.

Thanks for the bug report and fix. Sorry again about the delay in
reviewing the changes.

Please add a Fixes tag for 80dd33cf72d1.

> Signed-off-by: Herve Codina <herve.codina@bootlin.com>
> ---
>  drivers/base/core.c    | 26 +++++++++++++++++++++++---
>  include/linux/device.h |  1 +
>  2 files changed, 24 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index ac026187ac6a..2e102a77758c 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -44,6 +44,7 @@ static bool fw_devlink_is_permissive(void);
>  static void __fw_devlink_link_to_consumers(struct device *dev);
>  static bool fw_devlink_drv_reg_done;
>  static bool fw_devlink_best_effort;
> +static struct workqueue_struct *fw_devlink_wq;
>
>  /**
>   * __fwnode_link_add - Create a link between two fwnode_handles.
> @@ -530,12 +531,26 @@ static void devlink_dev_release(struct device *dev)
>  	/*
>  	 * It may take a while to complete this work because of the SRCU
>  	 * synchronization in device_link_release_fn() and if the consumer or
> -	 * supplier devices get deleted when it runs, so put it into the "long"
> -	 * workqueue.
> +	 * supplier devices get deleted when it runs, so put it into the
> +	 * dedicated workqueue.
>  	 */
> -	queue_work(system_long_wq, &link->rm_work);
> +	queue_work(fw_devlink_wq, &link->rm_work);

This has nothing to do with fw_devlink. fw_devlink is just triggering
the issue in device links. You can hit this bug without fw_devlink too.
So call this device_link_wq since it's consistent with device_link_* APIs.

>  }
>
> +/**
> + * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
> + */
> +void device_link_wait_removal(void)
> +{
> +	/*
> +	 * devlink removal jobs are queued in the dedicated work queue.
> +	 * To be sure that all removal jobs are terminated, ensure that any
> +	 * scheduled work has run to completion.
> +	 */
> +	drain_workqueue(fw_devlink_wq);

Is there a reason this needs to be drain_workqueue() instead of
flush_workqueue()? Drain is a stronger guarantee than we need in this
case. All we are trying to make sure is that all the device link
removal work queued so far has completed.

> +}
> +EXPORT_SYMBOL_GPL(device_link_wait_removal);
> +
>  static struct class devlink_class = {
>  	.name = "devlink",
>  	.dev_groups = devlink_groups,
> @@ -4085,9 +4100,14 @@ int __init devices_init(void)
>  	sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
>  	if (!sysfs_dev_char_kobj)
>  		goto char_kobj_err;
> +	fw_devlink_wq = alloc_workqueue("fw_devlink_wq", 0, 0);
> +	if (!fw_devlink_wq)

Fix the name appropriately here too please.

Thanks,
Saravana

> +		goto wq_err;
>
>  	return 0;
>
> + wq_err:
> +	kobject_put(sysfs_dev_char_kobj);
>   char_kobj_err:
>  	kobject_put(sysfs_dev_block_kobj);
>   block_kobj_err:
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 2b093e62907a..c26f4b3df2bd 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -1250,6 +1250,7 @@ void device_link_del(struct device_link *link);
>  void device_link_remove(void *consumer, struct device *supplier);
>  void device_links_supplier_sync_state_pause(void);
>  void device_links_supplier_sync_state_resume(void);
> +void device_link_wait_removal(void);
>
>  /* Create alias, so I can be autoloaded. */
>  #define MODULE_ALIAS_CHARDEV(major,minor) \
> --
> 2.42.0
>
>
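The difference between the two guarantees discussed in this review can be sketched with a toy, single-threaded model. This is not kernel code and only approximates the documented semantics: `model_flush()` completes the jobs that were queued before the call, whereas `model_drain()` flushes repeatedly until the queue is empty, including jobs that running jobs queue behind themselves. All `model_*` names and the `chain` counter are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_QUEUE_CAP 16

/* Toy queue: each entry stores how many follow-up jobs that job will chain. */
struct model_queue {
	int chain[MODEL_QUEUE_CAP];
	size_t head;		/* next job to run */
	size_t tail;		/* next free slot */
	int completed;		/* number of jobs that have run */
};

/* Queue a job that will re-queue itself 'chain' more times. */
int model_queue_job(struct model_queue *q, int chain)
{
	if (q->tail >= MODEL_QUEUE_CAP)
		return -1;
	q->chain[q->tail++] = chain;
	return 0;
}

/* Run the oldest pending job; it may chain-queue one follow-up job. */
static void model_run_one(struct model_queue *q)
{
	int chain;

	assert(q->head < q->tail);
	chain = q->chain[q->head++];
	q->completed++;
	if (chain > 0)
		model_queue_job(q, chain - 1);
}

/* Flush-like: wait only for the jobs that were queued before the call. */
void model_flush(struct model_queue *q)
{
	size_t target = q->tail;

	while (q->head < target)
		model_run_one(q);
}

/* Drain-like: flush repeatedly until nothing is pending at all. */
void model_drain(struct model_queue *q)
{
	while (q->head < q->tail)
		model_flush(q);
}

size_t model_pending(const struct model_queue *q)
{
	return q->tail - q->head;
}
```

With one plain job and one job that chains twice, a flush completes the two jobs that were queued and leaves the newly chained one pending, while a drain keeps going until the queue is empty. Device link removal jobs do not re-queue themselves in the patch under review, which is consistent with the reviewer's point that the flush guarantee is sufficient.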
On Tue, 2024-02-20 at 16:31 -0800, Saravana Kannan wrote:
> On Thu, Nov 30, 2023 at 9:41 AM Herve Codina <herve.codina@bootlin.com> wrote:

[...]

> > -	queue_work(system_long_wq, &link->rm_work);
> > +	queue_work(fw_devlink_wq, &link->rm_work);
>
> This has nothing to do with fw_devlink. fw_devlink is just triggering
> the issue in device links. You can hit this bug without fw_devlink too.
> So call this device_link_wq since it's consistent with device_link_* APIs.
>

I'm not sure if I got this right in my series. I do call devlink_release_queue() to
my queue. But on the Overlay side I use fwnode_links_flush_queue() because it looked
more sensible from an OF point of view. And including (in OF code) linux/fwnode.h
instead of linux/device.h makes more sense to me.

[...]

> > +	drain_workqueue(fw_devlink_wq);
>
> Is there a reason this needs to be drain_workqueue() instead of
> flush_workqueue()? Drain is a stronger guarantee than we need in this
> case. All we are trying to make sure is that all the device link
> removal work queued so far has completed.
>

Yeah, I'm also using flush_workqueue().

[...]

> > +	fw_devlink_wq = alloc_workqueue("fw_devlink_wq", 0, 0);
> > +	if (!fw_devlink_wq)
>
> Fix the name appropriately here too please.

Hi Saravana,

Oh, was not aware of this series... Please look at my first patch. It already has a
review tag by Rafael. I think the creation of the queue makes more sense to be done
in devlink_class_init(). Moreover, Rafael complained in my first version that
erroring out because we failed to create the queue is too harsh since devlinks can
still work. So, what we do is to schedule the work if we have a queue or to call
device_link_release_fn() synchronously if we don't have the queue (note that failing
to allocate the queue is very unlikely anyways).

- Nuno Sá
On Tue, Feb 20, 2024 at 10:56 PM Nuno Sá <noname.nuno@gmail.com> wrote:
>
> On Tue, 2024-02-20 at 16:31 -0800, Saravana Kannan wrote:

[...]

> > > +	fw_devlink_wq = alloc_workqueue("fw_devlink_wq", 0, 0);
> > > +	if (!fw_devlink_wq)
> >
> > Fix the name appropriately here too please.
>
> Hi Saravana,
>
> Oh, was not aware of this series... Please look at my first patch. It already has a
> review tag by Rafael. I think the creation of the queue makes more sense to be done
> in devlink_class_init(). Moreover, Rafael complained in my first version that
> erroring out because we failed to create the queue is too harsh since devlinks can
> still work.

I think Rafael can be convinced on this one. Firstly, if we fail to
allocate so early, we have bigger problems.

> So, what we do is to schedule the work if we have a queue or to call
> device_link_release_fn() synchronously if we don't have the queue (note that failing
> to allocate the queue is very unlikely anyways).

device links don't really work when you synchronously need to delete a
link since it always uses SRCUs (it used to have a #ifndef CONFIG_SRCU
locking). That's like saying code still works when it doesn't hit a
deadlock condition.

Let's stick with Herve's patch series since he sent it first and it
has fewer things that need to be fixed. If he ignores this thread for
too long, you can send a revision of yours again and we can accept
that.

-Saravana
On Thu, 2024-02-22 at 17:08 -0800, Saravana Kannan wrote:
> On Tue, Feb 20, 2024 at 10:56 PM Nuno Sá <noname.nuno@gmail.com> wrote:
> >
> > On Tue, 2024-02-20 at 16:31 -0800, Saravana Kannan wrote:

[...]

> > Hi Saravana,
> >
> > Oh, was not aware of this series... Please look at my first patch. It
> > already has a review tag by Rafael. I think the creation of the queue
> > makes more sense to be done in devlink_class_init(). Moreover, Rafael
> > complained in my first version that erroring out because we failed to
> > create the queue is too harsh since devlinks can still work.
>
> I think Rafael can be convinced on this one. Firstly, if we fail to
> allocate so early, we have bigger problems.

That's true...

> > So, what we do is to schedule the work if we have a queue or to call
> > device_link_release_fn() synchronously if we don't have the queue (note
> > that failing to allocate the queue is very unlikely anyways).
>
> device links don't really work when you synchronously need to delete a
> link since it always uses SRCUs (it used to have a #ifndef CONFIG_SRCU
> locking). That's like saying code still works when it doesn't hit a

Hmm, can you elaborate please? Why wouldn't it work if we call it
synchronously? Sure, we'll have the synchronize_srcu() call, which might take
some time, but I'm honestly not seeing what could go wrong other than waiting.
I can also see that we can potentially hold the devlink lock for some time, but
can that lead to any deadlock? (It would actually be nice, if doable at all, to
not release the refcounts with a lock held.)

> deadlock condition.
>
> Let's stick with Herve's patch series since he sent it first and it
> has fewer things that need to be fixed. If he ignores this thread for

Not exactly true :). If you look at my reply in the other thread (my series)
you'll see that I actually sent it first (as RFC, and I spotted the issue way
back in May last year). About the stuff to fix, I'm not sure it's more. For
now, your main complaint seems to be about synchronously calling
device_link_release_fn(), and I did not have it in my v1.

But anyways, I just want a fix for this to land as quickly as possible :)

And I guess we also need Rafael to agree to erroring out if we fail to allocate
the queue, as he was against it.

- Nuno Sá
Hi,

On Thu, 22 Feb 2024 17:08:28 -0800
Saravana Kannan <saravanak@google.com> wrote:

> On Tue, Feb 20, 2024 at 10:56 PM Nuno Sá <noname.nuno@gmail.com> wrote:
> >
> > On Tue, 2024-02-20 at 16:31 -0800, Saravana Kannan wrote:
> > > On Thu, Nov 30, 2023 at 9:41 AM Herve Codina <herve.codina@bootlin.com> wrote:
> > > >
> > > > The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
> > > > introduces a workqueue to release the consumer and supplier devices used
> > > > in the devlink.
> > > > In the job queued, devices are release and in turn, when all the
> > > > references to these devices are dropped, the release function of the
> > > > device itself is called.
> > > >
> > > > Nothing is present to provide some synchronisation with this workqueue
> > > > in order to ensure that all ongoing releasing operations are done and
> > > > so, some other operations can be started safely.
> > > >
> > > > For instance, in the following sequence:
> > > >   1) of_platform_depopulate()
> > > >   2) of_overlay_remove()
> > > >
> > > > During the step 1, devices are released and related devlinks are removed
> > > > (jobs pushed in the workqueue).
> > > > During the step 2, OF nodes are destroyed but, without any
> > > > synchronisation with devlink removal jobs, of_overlay_remove() can raise
> > > > warnings related to missing of_node_put():
> > > >   ERROR: memory leak, expected refcount 1 instead of 2
> > > >
> > > > Indeed, the missing of_node_put() call is going to be done, too late,
> > > > from the workqueue job execution.
> > > >
> > > > Introduce device_link_wait_removal() to offer a way to synchronize
> > > > operations waiting for the end of devlink removals (i.e. end of
> > > > workqueue jobs).
> > > > Also, as a flushing operation is done on the workqueue, the workqueue
> > > > used is moved from a system-wide workqueue to a local one.
> > >
> > > Thanks for the bug report and fix. Sorry again about the delay in
> > > reviewing the changes.
> > >
> > > Please add Fixes tag for 80dd33cf72d1.
> > >
> > > > Signed-off-by: Herve Codina <herve.codina@bootlin.com>
> > > > ---
> > > >  drivers/base/core.c    | 26 +++++++++++++++++++++++---
> > > >  include/linux/device.h |  1 +
> > > >  2 files changed, 24 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > > index ac026187ac6a..2e102a77758c 100644
> > > > --- a/drivers/base/core.c
> > > > +++ b/drivers/base/core.c
> > > > @@ -44,6 +44,7 @@ static bool fw_devlink_is_permissive(void);
> > > >  static void __fw_devlink_link_to_consumers(struct device *dev);
> > > >  static bool fw_devlink_drv_reg_done;
> > > >  static bool fw_devlink_best_effort;
> > > > +static struct workqueue_struct *fw_devlink_wq;
> > > >
> > > >  /**
> > > >   * __fwnode_link_add - Create a link between two fwnode_handles.
> > > > @@ -530,12 +531,26 @@ static void devlink_dev_release(struct device *dev)
> > > >  	/*
> > > >  	 * It may take a while to complete this work because of the SRCU
> > > >  	 * synchronization in device_link_release_fn() and if the consumer or
> > > > -	 * supplier devices get deleted when it runs, so put it into the "long"
> > > > -	 * workqueue.
> > > > +	 * supplier devices get deleted when it runs, so put it into the
> > > > +	 * dedicated workqueue.
> > > >  	 */
> > > > -	queue_work(system_long_wq, &link->rm_work);
> > > > +	queue_work(fw_devlink_wq, &link->rm_work);
> > >
> > > This has nothing to do with fw_devlink. fw_devlink is just triggering
> > > the issue in device links. You can hit this bug without fw_devlink too.
> > > So call this device_link_wq since it's consistent with device_link_* APIs.
> > >
> >
> > I'm not sure if I got this right in my series. I do call devlink_release_queue() to
> > my queue. But on the Overlay side I use fwnode_links_flush_queue() because it looked
> > more sensible from an OF point of view. And including (in OF code) linux/fwnode.h
> > instead linux/device.h makes more sense to me.
> >
> > > >  }
> > > >
> > > > +/**
> > > > + * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
> > > > + */
> > > > +void device_link_wait_removal(void)
> > > > +{
> > > > +	/*
> > > > +	 * devlink removal jobs are queued in the dedicated work queue.
> > > > +	 * To be sure that all removal jobs are terminated, ensure that any
> > > > +	 * scheduled work has run to completion.
> > > > +	 */
> > > > +	drain_workqueue(fw_devlink_wq);
> > >
> > > Is there a reason this needs to be drain_workqueu() instead of
> > > flush_workqueue(). Drain is a stronger guarantee than we need in this
> > > case. All we are trying to make sure is that all the device link
> > > remove work queued so far have completed.
> > >
> >
> > Yeah, I'm also using flush_workqueue().
> >
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(device_link_wait_removal);
> > > > +
> > > >  static struct class devlink_class = {
> > > >  	.name = "devlink",
> > > >  	.dev_groups = devlink_groups,
> > > > @@ -4085,9 +4100,14 @@ int __init devices_init(void)
> > > >  	sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
> > > >  	if (!sysfs_dev_char_kobj)
> > > >  		goto char_kobj_err;
> > > > +	fw_devlink_wq = alloc_workqueue("fw_devlink_wq", 0, 0);
> > > > +	if (!fw_devlink_wq)
> > >
> > > Fix the name appropriately here too please.
> >
> > Hi Saravana,
> >
> > Oh, was not aware of this series... Please look at my first patch. It already has a
> > review tag by Rafael. I think the creation of the queue makes more sense to be done
> > in devlink_class_init(). Moreover, Rafael complained in my first version that
> > erroring out because we failed to create the queue is too harsh since devlinks can
> > still work.
>
> I think Rafael can be convinced on this one. Firstly, if we fail to
> allocate so early, we have bigger problems.
>
> > So, what we do is to schedule the work if we have a queue or too call
> > device_link_release_fn() synchronously if we don't have the queue (note that failing
> > to allocate the queue is very unlikely anyways).
>
> device links don't really work when you synchronously need to delete a
> link since it always uses SRCUs (it used to have a #ifndef CONFIG_SRCU
> locking). That's like saying a code still works when it doesn't hit a
> deadlock condition.
>
> Let's stick with Herve's patch series since he send it first and it
> has fewer things that need to be fixed. If he ignores this thread for
> too long, you can send a revision of yours again and we can accept
> that.

I'm not ignoring the thread :)

I hope to find some time in the near future to send a v2 of this series.

Hervé
On Fri, 2024-02-23 at 09:46 +0100, Herve Codina wrote:
> Hi,
>
> On Thu, 22 Feb 2024 17:08:28 -0800
> Saravana Kannan <saravanak@google.com> wrote:
>
> [...]
>
> > Let's stick with Herve's patch series since he send it first and it
> > has fewer things that need to be fixed. If he ignores this thread for
> > too long, you can send a revision of yours again and we can accept
> > that.
>
> I don't ignore the thread :)
>
> Hope I could take some time in the near future to send a v2 of this
> series.

Hi Herve,

Just let me know if you don't see that happening anytime soon :). I'm very
interested in having this applied fairly soon and I think the base idea for
the fix is more or less in place (for both series). So it should be minor
details now :).

- Nuno Sá
On Fri, 2024-02-23 at 10:11 +0100, Herve Codina wrote:
> Hi Saravana,
>
> On Tue, 20 Feb 2024 16:31:13 -0800
> Saravana Kannan <saravanak@google.com> wrote:
>
> ...
>
> > > +void device_link_wait_removal(void)
> > > +{
> > > +	/*
> > > +	 * devlink removal jobs are queued in the dedicated work queue.
> > > +	 * To be sure that all removal jobs are terminated, ensure that any
> > > +	 * scheduled work has run to completion.
> > > +	 */
> > > +	drain_workqueue(fw_devlink_wq);
> >
> > Is there a reason this needs to be drain_workqueu() instead of
> > flush_workqueue(). Drain is a stronger guarantee than we need in this
> > case. All we are trying to make sure is that all the device link
> > remove work queued so far have completed.
>
> I used drain_workqueue() because drain_workqueue() allows for jobs already
> present in a workqueue to re-queue a job and drain_workqueue() will wait
> also for this new job completion.
>
> I think flush_workqueue() doesn't wait for this chain queueing.
>
> In our case, my understanding was that device_link_release_fn() calls
> put_device() for the consumer and the supplier.
> If refcounts reaches zero, devlink_dev_release() can be called again
> and re-queue a job.
>

Looks sensible. The only doubt (which Saravana may know better) is that I'm
not sure put_device() on a supplier or consumer can actually lead to
devlink_dev_release(). AFAIU, a consumer or a supplier should not be a device
from the devlink class. Hence, looking at device_release(), I'm not sure it
can happen unless, for some odd reason, someone is messing with devlinks in
.remove() or .type->remove().

- Nuno Sá
On Fri, Feb 23, 2024 at 2:41 AM Nuno Sá <noname.nuno@gmail.com> wrote:
>
> On Fri, 2024-02-23 at 10:11 +0100, Herve Codina wrote:
> > Hi Saravana,
> >
> > On Tue, 20 Feb 2024 16:31:13 -0800
> > Saravana Kannan <saravanak@google.com> wrote:
> >
> > ...
> >
> > > > +void device_link_wait_removal(void)
> > > > +{
> > > > +	/*
> > > > +	 * devlink removal jobs are queued in the dedicated work queue.
> > > > +	 * To be sure that all removal jobs are terminated, ensure that any
> > > > +	 * scheduled work has run to completion.
> > > > +	 */
> > > > +	drain_workqueue(fw_devlink_wq);
> > >
> > > Is there a reason this needs to be drain_workqueu() instead of
> > > flush_workqueue(). Drain is a stronger guarantee than we need in this
> > > case. All we are trying to make sure is that all the device link
> > > remove work queued so far have completed.
> >
> > I used drain_workqueue() because drain_workqueue() allows for jobs already
> > present in a workqueue to re-queue a job and drain_workqueue() will wait
> > also for this new job completion.
> >
> > I think flush_workqueue() doesn't wait for this chain queueing.
> >
> > In our case, my understanding was that device_link_release_fn() calls
> > put_device() for the consumer and the supplier.
> > If refcounts reaches zero, devlink_dev_release() can be called again
> > and re-queue a job.
> >
>
> Looks sensible. The only doubt (that Saravana may know better) is that I'm not
> sure put_device() on a supplier or consumer can actually lead to
> devlink_dev_release(). AFAIU, a consumer or a supplier should not be a device
> from the devlink class. Hence, looking at device_release(), I'm not sure it can
> happen unless for some odd reason someone is messing with devlinks in .remove()
> or .type->remove().

The case we are trying to fix here involves a supplier or a consumer
device (say Device-A) going through device_del(). When that happens, all
the device links to/from the device are deleted by a call to
device_links_purge(), since a device link can't exist without both the
supplier and the consumer existing.

The problem you were hitting is that the device link deletion code
does the put_device(Device-A) in a workqueue. Your change is to make
sure to wait until that has completed. To do that, you only need to
wait for the device link deletion work (already queued before
returning from device_del()) to finish. You don't need to wait for
anything more.

I read up on drain_workqueue() before I made my comments. The point I
was trying to make is that there could be some unrelated device link
deletions that you don't need to wait on.

But taking a closer look[1], it looks like drain_workqueue() might
actually cause bugs because while a workqueue is being drained, if
another unrelated device link deletion is trying to queue work, that
will get ignored.

Reply to the rest of the emails in this thread here:

Nuno,

Sorry if I messed up who sent the first patch, but I did dig back to
your v1. But I could be wrong.

If devlink_dev_release() could have done the work synchronously, we'd
have done it in the first place. It's actually a bug because
devlink_dev_release() gets called in atomic context but the
put_device() on the supplier/consumer can do some sleeping work.

-Saravana

[1] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/workqueue.c#n1727
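
For readers following the flush-versus-drain argument, here is a minimal userspace sketch of the semantics as they are described in this thread. It is a single-threaded toy model, not kernel code: every `toy_*` name is hypothetical, and the real workqueue implementation is considerably more involved. The model assumes that a flush waits only for jobs queued before the call, that a drain also waits for jobs re-queued by running jobs, and that a drain refuses work queued from outside a running job.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names are kernel APIs. */
#define TOY_WQ_CAP 16

struct toy_wq;
typedef void (*toy_work_fn)(struct toy_wq *wq);

struct toy_wq {
	toy_work_fn pending[TOY_WQ_CAP]; /* FIFO ring of queued jobs */
	size_t head, tail;               /* ever-increasing ring indices */
	bool draining;                   /* set while toy_drain() runs */
	bool in_worker;                  /* set while a job is executing */
	int completed;                   /* jobs that have run */
	int rejected;                    /* queue attempts refused */
};

size_t toy_pending(const struct toy_wq *wq)
{
	return wq->tail - wq->head;
}

bool toy_queue_work(struct toy_wq *wq, toy_work_fn fn)
{
	/* While draining, accept only jobs chained from a running job. */
	if (wq->draining && !wq->in_worker) {
		wq->rejected++;
		return false;
	}
	assert(toy_pending(wq) < TOY_WQ_CAP);
	wq->pending[wq->tail++ % TOY_WQ_CAP] = fn;
	return true;
}

static void toy_run_one(struct toy_wq *wq)
{
	toy_work_fn fn = wq->pending[wq->head++ % TOY_WQ_CAP];

	wq->in_worker = true;
	fn(wq);
	wq->in_worker = false;
	wq->completed++;
}

/* Flush: run only the jobs that were queued before this call. */
void toy_flush(struct toy_wq *wq)
{
	size_t target = wq->tail;

	while (wq->head != target)
		toy_run_one(wq);
}

/* Drain: run until the queue is empty, chained jobs included. */
void toy_drain(struct toy_wq *wq)
{
	wq->draining = true;
	while (toy_pending(wq) > 0)
		toy_run_one(wq);
	wq->draining = false;
}

/* A job that does nothing, and a job that re-queues a follow-up job. */
void toy_leaf_job(struct toy_wq *wq)
{
	(void)wq;
}

void toy_chaining_job(struct toy_wq *wq)
{
	toy_queue_work(wq, toy_leaf_job);
}
```

In this model, queueing `toy_chaining_job` and calling `toy_flush()` leaves the chained job pending, while `toy_drain()` runs it too. Calling `toy_queue_work()` from outside a job while `draining` is set is refused, which is the behaviour Saravana is worried about for unrelated device link deletions.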
On Thu, 2024-02-29 at 15:26 -0800, Saravana Kannan wrote:
> On Fri, Feb 23, 2024 at 2:41 AM Nuno Sá <noname.nuno@gmail.com> wrote:
> >
> > On Fri, 2024-02-23 at 10:11 +0100, Herve Codina wrote:
> > > Hi Saravana,
> > >
> > > On Tue, 20 Feb 2024 16:31:13 -0800
> > > Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > ...
> > >
> > > > > +void device_link_wait_removal(void)
> > > > > +{
> > > > > +	/*
> > > > > +	 * devlink removal jobs are queued in the dedicated work queue.
> > > > > +	 * To be sure that all removal jobs are terminated, ensure that any
> > > > > +	 * scheduled work has run to completion.
> > > > > +	 */
> > > > > +	drain_workqueue(fw_devlink_wq);
> > > >
> > > > Is there a reason this needs to be drain_workqueu() instead of
> > > > flush_workqueue(). Drain is a stronger guarantee than we need in this
> > > > case. All we are trying to make sure is that all the device link
> > > > remove work queued so far have completed.
> > >
> > > I used drain_workqueue() because drain_workqueue() allows for jobs already
> > > present in a workqueue to re-queue a job and drain_workqueue() will wait
> > > also for this new job completion.
> > >
> > > I think flush_workqueue() doesn't wait for this chain queueing.
> > >
> > > In our case, my understanding was that device_link_release_fn() calls
> > > put_device() for the consumer and the supplier.
> > > If refcounts reaches zero, devlink_dev_release() can be called again
> > > and re-queue a job.
> > >
> >
> > Looks sensible. The only doubt (that Saravana may know better) is that I'm not
> > sure put_device() on a supplier or consumer can actually lead to
> > devlink_dev_release(). AFAIU, a consumer or a supplier should not be a device
> > from the devlink class. Hence, looking at device_release(), I'm not sure it can
> > happen unless for some odd reason someone is messing with devlinks in .remove()
> > or .type->remove().
>
> The case we are trying to fix here involves a supplier or a consumer
> device (say Device-A) being device_del(). When that happens, all the
> device links to/from the device are deleted by a call to
> device_links_purge() since a device link can't exist without both the
> supplier and consumer existing.
>
> The problem you were hitting is that the device link deletion code
> does the put_device(Device-A) in a workqueue. Your change is to make
> sure to wait until that has completed. To do that, you only need to
> wait for the device link deletion work (already queued before
> returning from device_del()) to finish. You don't need to wait for
> anything more.
>
> I read up on drain_workqueue() before I made my comments. The point I
> was trying to make is that there could be some unrelated device link
> deletions that you don't need to wait on.
>
> But taking a closer look[1], it looks like drain_workqueue() might
> actually cause bugs because while a workqueue is being drained, if
> another unrelated device link deletion is trying to queue work, that
> will get ignored.
>

Oh, even worse then... Please also take a look at the new v3 Herve sent.
Herve is already convinced about flush_workqueue(). The other sensible
discussion is about releasing the of_mutex in patch 2. I'm not convinced we
need it, but you may know better.

> Reply to rest of the emails in this thread here:
>
> Nuno,
>
> Sorry if I messed up who sent the first patch, but I did dig back to
> your v1. But I could be wrong.
>

I did send an RFC first [1] (which should also count :)). And it actually took
a lot of "pushing" with resends to get some attention on this. And if you
follow the RFC, you'll even see that I first reported the issue in May or
something (but did not really put too much effort into it at the time).

I have to admit it's a bit frustrating given how much I pushed and insisted
on fixing this (and not having my own patches in :P). But that's life, and at
the end of the day I just care about this being fixed. So, no hard feelings :).

> If devlink_dev_release() could have done the work synchronously, we'd
> have done it in the first place. It's actually a bug because
> devlink_dev_release() gets called in atomic context but the
> put_device() on the supplier/consumer can do some sleeping work.
>

Not sure I'm following the above. I may be missing something, but looking at
the code paths, it actually looks like devlink_dev_release() is always called
with the device_links_lock held. Therefore we need to be in a sleeping context
already, or we already have a problem... Looking at the git history, the
problem we had before was that we were using call_srcu(), and the SRCU
callback cannot sleep, which could happen in a device release function.

Anyway, Rafael already said he's fine with erroring out in case the queue
fails to allocate (as you said, if that happens the system is already likely
screwed). My only complaint now is about where we allocate the queue.

[1]: https://lore.kernel.org/lkml/20231127-fix-device-links-overlays-v1-1-d7438f56d025@analog.com/

- Nuno Sá

> -Saravana
>
> [1] -
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/workqueue.c#n1727
diff --git a/drivers/base/core.c b/drivers/base/core.c
index ac026187ac6a..2e102a77758c 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -44,6 +44,7 @@ static bool fw_devlink_is_permissive(void);
 static void __fw_devlink_link_to_consumers(struct device *dev);
 static bool fw_devlink_drv_reg_done;
 static bool fw_devlink_best_effort;
+static struct workqueue_struct *fw_devlink_wq;
 
 /**
  * __fwnode_link_add - Create a link between two fwnode_handles.
@@ -530,12 +531,26 @@ static void devlink_dev_release(struct device *dev)
 	/*
 	 * It may take a while to complete this work because of the SRCU
 	 * synchronization in device_link_release_fn() and if the consumer or
-	 * supplier devices get deleted when it runs, so put it into the "long"
-	 * workqueue.
+	 * supplier devices get deleted when it runs, so put it into the
+	 * dedicated workqueue.
 	 */
-	queue_work(system_long_wq, &link->rm_work);
+	queue_work(fw_devlink_wq, &link->rm_work);
 }
 
+/**
+ * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
+ */
+void device_link_wait_removal(void)
+{
+	/*
+	 * devlink removal jobs are queued in the dedicated work queue.
+	 * To be sure that all removal jobs are terminated, ensure that any
+	 * scheduled work has run to completion.
+	 */
+	drain_workqueue(fw_devlink_wq);
+}
+EXPORT_SYMBOL_GPL(device_link_wait_removal);
+
 static struct class devlink_class = {
 	.name = "devlink",
 	.dev_groups = devlink_groups,
@@ -4085,9 +4100,14 @@ int __init devices_init(void)
 	sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
 	if (!sysfs_dev_char_kobj)
 		goto char_kobj_err;
+	fw_devlink_wq = alloc_workqueue("fw_devlink_wq", 0, 0);
+	if (!fw_devlink_wq)
+		goto wq_err;
 
 	return 0;
 
+ wq_err:
+	kobject_put(sysfs_dev_char_kobj);
  char_kobj_err:
 	kobject_put(sysfs_dev_block_kobj);
  block_kobj_err:
diff --git a/include/linux/device.h b/include/linux/device.h
index 2b093e62907a..c26f4b3df2bd 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1250,6 +1250,7 @@ void device_link_del(struct device_link *link);
 void device_link_remove(void *consumer, struct device *supplier);
 void device_links_supplier_sync_state_pause(void);
 void device_links_supplier_sync_state_resume(void);
+void device_link_wait_removal(void);
 
 /* Create alias, so I can be autoloaded. */
 #define MODULE_ALIAS_CHARDEV(major,minor) \
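
The ordering problem from the commit message (of_platform_depopulate() followed by of_overlay_remove()) can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions: all `toy_*` names are hypothetical, the node starts with a refcount of 2 (one base reference plus one held through the device link), and "waiting for removal" is modelled by simply running the deferred job. It is not how the kernel implements any of this.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_node {
	int refcount; /* stands in for an OF node refcount */
};

struct toy_link_removal {
	struct toy_node *node; /* node whose reference the link still holds */
	bool pending;          /* true while the removal job is only queued */
};

void toy_node_put(struct toy_node *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

/* Step 1: removing the device only queues the link removal job. */
void toy_depopulate(struct toy_link_removal *job, struct toy_node *n)
{
	job->node = n;
	job->pending = true;
}

/* Stand-in for waiting on the removal queue: run the deferred put now. */
void toy_wait_removal(struct toy_link_removal *job)
{
	if (job->pending) {
		toy_node_put(job->node);
		job->pending = false;
	}
}

/* Step 2: overlay removal expects a refcount of exactly 1. */
bool toy_overlay_remove_ok(const struct toy_node *n)
{
	return n->refcount == 1;
}
```

With a node at refcount 2, calling `toy_depopulate()` and then checking `toy_overlay_remove_ok()` fails, which corresponds to the "expected refcount 1 instead of 2" warning. Calling `toy_wait_removal()` between the two steps makes the check pass, which is the role device_link_wait_removal() is meant to play in the patch.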