[v1,1/2] PM: runtime: Do not call __rpm_callback() from rpm_idle()

Message ID 4789678.31r3eYUQgx@kreacher
State New
Series PM: runtime: Fix rpm_idle() and relocate rpm_callback()

Commit Message

Rafael J. Wysocki Dec. 2, 2022, 2:30 p.m. UTC
  From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Calling __rpm_callback() from rpm_idle() after adding device links
support to the former is a clear mistake.

Not only it causes rpm_idle() to carry out unnecessary actions, but it
is also against the assumption regarding the stability of PM-runtime
status accross __rpm_callback() invocations, because rpm_suspend() and
rpm_resume() may run in parallel with __rpm_callback() when it is called
by rpm_idle() and the device's PM-runtime status can be updated by any
of them.

Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
  

Comments

Adrian Hunter Dec. 5, 2022, 7:45 a.m. UTC | #1
On 2/12/22 16:30, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Calling __rpm_callback() from rpm_idle() after adding device links
> support to the former is a clear mistake.
> 
> Not only it causes rpm_idle() to carry out unnecessary actions, but it
> is also against the assumption regarding the stability of PM-runtime
> status accross __rpm_callback() invocations, because rpm_suspend() and

accross -> across

> rpm_resume() may run in parallel with __rpm_callback() when it is called
> by rpm_idle() and the device's PM-runtime status can be updated by any
> of them.
> 
> Fixes: 21d5c57b3726 ("PM / runtime: Use device links")

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/power/runtime.c |   12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -484,7 +484,17 @@ static int rpm_idle(struct device *dev,
>  
>  	dev->power.idle_notification = true;
>  
> -	retval = __rpm_callback(callback, dev);
> +	if (dev->power.irq_safe)
> +		spin_unlock(&dev->power.lock);
> +	else
> +		spin_unlock_irq(&dev->power.lock);
> +
> +	retval = callback(dev);
> +
> +	if (dev->power.irq_safe)
> +		spin_lock(&dev->power.lock);
> +	else
> +		spin_lock_irq(&dev->power.lock);
>  
>  	dev->power.idle_notification = false;
>  	wake_up_all(&dev->power.wait_queue);
> 
> 
>
  
Ulf Hansson Dec. 5, 2022, 12:07 p.m. UTC | #2
On Fri, 2 Dec 2022 at 15:32, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Calling __rpm_callback() from rpm_idle() after adding device links
> support to the former is a clear mistake.
>
> Not only it causes rpm_idle() to carry out unnecessary actions, but it
> is also against the assumption regarding the stability of PM-runtime
> status accross __rpm_callback() invocations, because rpm_suspend() and
> rpm_resume() may run in parallel with __rpm_callback() when it is called
> by rpm_idle() and the device's PM-runtime status can be updated by any
> of them.

Urgh, that's a nasty bug you are fixing here. Are there perhaps any
links to error reports that would make sense to include here?

>
> Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/power/runtime.c |   12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -484,7 +484,17 @@ static int rpm_idle(struct device *dev,
>
>         dev->power.idle_notification = true;
>
> -       retval = __rpm_callback(callback, dev);

Couldn't we just extend __rpm_callback() to take another in-parameter,
rather than open-coding the below?

Note that, __rpm_callback() already uses a "bool use_links" internal
variable, that indicates whether the device links should be used or
not.

> +       if (dev->power.irq_safe)
> +               spin_unlock(&dev->power.lock);
> +       else
> +               spin_unlock_irq(&dev->power.lock);
> +
> +       retval = callback(dev);
> +
> +       if (dev->power.irq_safe)
> +               spin_lock(&dev->power.lock);
> +       else
> +               spin_lock_irq(&dev->power.lock);
>
>         dev->power.idle_notification = false;
>         wake_up_all(&dev->power.wait_queue);
>
>
>

Kind regards
Uffe
  
Rafael J. Wysocki Dec. 5, 2022, 12:13 p.m. UTC | #3
On Mon, Dec 5, 2022 at 1:08 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Fri, 2 Dec 2022 at 15:32, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Calling __rpm_callback() from rpm_idle() after adding device links
> > support to the former is a clear mistake.
> >
> > Not only it causes rpm_idle() to carry out unnecessary actions, but it
> > is also against the assumption regarding the stability of PM-runtime
> > status accross __rpm_callback() invocations, because rpm_suspend() and
> > rpm_resume() may run in parallel with __rpm_callback() when it is called
> > by rpm_idle() and the device's PM-runtime status can be updated by any
> > of them.
>
> Urgh, that's a nasty bug you are fixing here. Are there perhaps any
> links to error reports that would make sense to include here?

There is a bug report, but I have no confirmation that this fix is
sufficient to address it (even though I'm quite confident that it will
be).

> >
> > Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/power/runtime.c |   12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > Index: linux-pm/drivers/base/power/runtime.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/runtime.c
> > +++ linux-pm/drivers/base/power/runtime.c
> > @@ -484,7 +484,17 @@ static int rpm_idle(struct device *dev,
> >
> >         dev->power.idle_notification = true;
> >
> > -       retval = __rpm_callback(callback, dev);
>
> Couldn't we just extend __rpm_callback() to take another in-parameter,
> rather than open-coding the below?

I'd rather not do that.

I'd prefer rpm_callback() to be used only in rpm_suspend() and
rpm_resume() where all of the assumptions hold and rpm_idle() really
is a special case.

And there is not much open-coding here, just the locking part.

> Note that, __rpm_callback() already uses a "bool use_links" internal
> variable, that indicates whether the device links should be used or
> not.

Yes, it does, but why does that matter?

> > +       if (dev->power.irq_safe)
> > +               spin_unlock(&dev->power.lock);
> > +       else
> > +               spin_unlock_irq(&dev->power.lock);
> > +
> > +       retval = callback(dev);
> > +
> > +       if (dev->power.irq_safe)
> > +               spin_lock(&dev->power.lock);
> > +       else
> > +               spin_lock_irq(&dev->power.lock);
> >
> >         dev->power.idle_notification = false;
> >         wake_up_all(&dev->power.wait_queue);
> >
> >
> >
  
Ulf Hansson Dec. 5, 2022, 12:46 p.m. UTC | #4
On Mon, 5 Dec 2022 at 13:13, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Dec 5, 2022 at 1:08 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > On Fri, 2 Dec 2022 at 15:32, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > >
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > Calling __rpm_callback() from rpm_idle() after adding device links
> > > support to the former is a clear mistake.
> > >
> > > Not only it causes rpm_idle() to carry out unnecessary actions, but it
> > > is also against the assumption regarding the stability of PM-runtime
> > > status accross __rpm_callback() invocations, because rpm_suspend() and
> > > rpm_resume() may run in parallel with __rpm_callback() when it is called
> > > by rpm_idle() and the device's PM-runtime status can be updated by any
> > > of them.
> >
> > Urgh, that's a nasty bug you are fixing here. Are there perhaps any
> > links to error reports that would make sense to include here?
>
> There is a bug report, but I have no confirmation that this fix is
> sufficient to address it (even though I'm quite confident that it will
> be).
>
> > >
> > > Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/base/power/runtime.c |   12 +++++++++++-
> > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > Index: linux-pm/drivers/base/power/runtime.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/base/power/runtime.c
> > > +++ linux-pm/drivers/base/power/runtime.c
> > > @@ -484,7 +484,17 @@ static int rpm_idle(struct device *dev,
> > >
> > >         dev->power.idle_notification = true;
> > >
> > > -       retval = __rpm_callback(callback, dev);
> >
> > Couldn't we just extend __rpm_callback() to take another in-parameter,
> > rather than open-coding the below?
>
> I'd rather not do that.
>
> I'd prefer rpm_callback() to be used only in rpm_suspend() and
> rpm_resume() where all of the assumptions hold and rpm_idle() really
> is a special case.
>
> And there is not much open-coding here, just the locking part.

That and the actual call to the callback. Not much, but still.

>
> > Note that, __rpm_callback() already uses a "bool use_links" internal
> > variable, that indicates whether the device links should be used or
> > not.
>
> Yes, it does, but why does that matter?

It means that __rpm_callback() is already prepared to (almost) cover this case.

>
> > > +       if (dev->power.irq_safe)
> > > +               spin_unlock(&dev->power.lock);
> > > +       else
> > > +               spin_unlock_irq(&dev->power.lock);
> > > +
> > > +       retval = callback(dev);
> > > +
> > > +       if (dev->power.irq_safe)
> > > +               spin_lock(&dev->power.lock);
> > > +       else
> > > +               spin_lock_irq(&dev->power.lock);
> > >
> > >         dev->power.idle_notification = false;
> > >         wake_up_all(&dev->power.wait_queue);
> > >
> > >
> > >

Note, it's not a big deal to me, if you feel strongly that your
current approach is better, I am fine with that too.

Kind regards
Uffe
  
Rafael J. Wysocki Dec. 5, 2022, 12:51 p.m. UTC | #5
On Mon, Dec 5, 2022 at 1:47 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Mon, 5 Dec 2022 at 13:13, Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Mon, Dec 5, 2022 at 1:08 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > >
> > > On Fri, 2 Dec 2022 at 15:32, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > >
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > >
> > > > Calling __rpm_callback() from rpm_idle() after adding device links
> > > > support to the former is a clear mistake.
> > > >
> > > > Not only it causes rpm_idle() to carry out unnecessary actions, but it
> > > > is also against the assumption regarding the stability of PM-runtime
> > > > status accross __rpm_callback() invocations, because rpm_suspend() and
> > > > rpm_resume() may run in parallel with __rpm_callback() when it is called
> > > > by rpm_idle() and the device's PM-runtime status can be updated by any
> > > > of them.
> > >
> > > Urgh, that's a nasty bug you are fixing here. Are there perhaps any
> > > links to error reports that would make sense to include here?
> >
> > There is a bug report, but I have no confirmation that this fix is
> > sufficient to address it (even though I'm quite confident that it will
> > be).
> >
> > > >
> > > > Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > ---
> > > >  drivers/base/power/runtime.c |   12 +++++++++++-
> > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > >
> > > > Index: linux-pm/drivers/base/power/runtime.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/base/power/runtime.c
> > > > +++ linux-pm/drivers/base/power/runtime.c
> > > > @@ -484,7 +484,17 @@ static int rpm_idle(struct device *dev,
> > > >
> > > >         dev->power.idle_notification = true;
> > > >
> > > > -       retval = __rpm_callback(callback, dev);
> > >
> > > Couldn't we just extend __rpm_callback() to take another in-parameter,
> > > rather than open-coding the below?
> >
> > I'd rather not do that.
> >
> > I'd prefer rpm_callback() to be used only in rpm_suspend() and
> > rpm_resume() where all of the assumptions hold and rpm_idle() really
> > is a special case.
> >
> > And there is not much open-coding here, just the locking part.
>
> That and the actual call to the callback. Not much, but still.

Note that it doesn't need to check the callback pointer, though.

Moreover, IMO this code is easier to read without having to look at
__rpm_callback() and reverse engineer all of the different use cases
covered by it.

> > > Note that, __rpm_callback() already uses a "bool use_links" internal
> > > variable, that indicates whether the device links should be used or
> > > not.
> >
> > Yes, it does, but why does that matter?
>
> It means that __rpm_callback() is already prepared to (almost) cover this case.

Well, why does it have to cover all of the cases that are even somewhat related?

> >
> > > > +       if (dev->power.irq_safe)
> > > > +               spin_unlock(&dev->power.lock);
> > > > +       else
> > > > +               spin_unlock_irq(&dev->power.lock);
> > > > +
> > > > +       retval = callback(dev);
> > > > +
> > > > +       if (dev->power.irq_safe)
> > > > +               spin_lock(&dev->power.lock);
> > > > +       else
> > > > +               spin_lock_irq(&dev->power.lock);
> > > >
> > > >         dev->power.idle_notification = false;
> > > >         wake_up_all(&dev->power.wait_queue);
> > > >
> > > >
> > > >
>
> Note, it's not a big deal to me, if you feel strongly that your
> current approach is better, I am fine with that too.

OK, thanks!
  
Rafael J. Wysocki Dec. 5, 2022, 2:45 p.m. UTC | #6
On Mon, Dec 5, 2022 at 8:45 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 2/12/22 16:30, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Calling __rpm_callback() from rpm_idle() after adding device links
> > support to the former is a clear mistake.
> >
> > Not only it causes rpm_idle() to carry out unnecessary actions, but it
> > is also against the assumption regarding the stability of PM-runtime
> > status accross __rpm_callback() invocations, because rpm_suspend() and
>
> accross -> across

Fixed when applying the patch.

> > rpm_resume() may run in parallel with __rpm_callback() when it is called
> > by rpm_idle() and the device's PM-runtime status can be updated by any
> > of them.
> >
> > Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
>
> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

Thank you!

> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/power/runtime.c |   12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > Index: linux-pm/drivers/base/power/runtime.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/runtime.c
> > +++ linux-pm/drivers/base/power/runtime.c
> > @@ -484,7 +484,17 @@ static int rpm_idle(struct device *dev,
> >
> >       dev->power.idle_notification = true;
> >
> > -     retval = __rpm_callback(callback, dev);
> > +     if (dev->power.irq_safe)
> > +             spin_unlock(&dev->power.lock);
> > +     else
> > +             spin_unlock_irq(&dev->power.lock);
> > +
> > +     retval = callback(dev);
> > +
> > +     if (dev->power.irq_safe)
> > +             spin_lock(&dev->power.lock);
> > +     else
> > +             spin_lock_irq(&dev->power.lock);
> >
> >       dev->power.idle_notification = false;
> >       wake_up_all(&dev->power.wait_queue);
> >
> >
> >
>
  

Patch

Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -484,7 +484,17 @@  static int rpm_idle(struct device *dev,
 
 	dev->power.idle_notification = true;
 
-	retval = __rpm_callback(callback, dev);
+	if (dev->power.irq_safe)
+		spin_unlock(&dev->power.lock);
+	else
+		spin_unlock_irq(&dev->power.lock);
+
+	retval = callback(dev);
+
+	if (dev->power.irq_safe)
+		spin_lock(&dev->power.lock);
+	else
+		spin_lock_irq(&dev->power.lock);
 
 	dev->power.idle_notification = false;
 	wake_up_all(&dev->power.wait_queue);
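
Illustration (not part of the patch)

The hunk above drops dev->power.lock around the idle callback and retakes
it before clearing idle_notification. The following is a minimal userspace
sketch of that ordering only, under loud assumptions: toy_lock, toy_dev,
toy_idle() and toy_idle_callback() are hypothetical names invented for this
example, the "lock" is a plain flag rather than a spinlock, and the
irq_safe distinction and the wait queue wake-up are left out. It is
single-threaded and does not model the race described in the commit
message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock: only tracks whether it is held. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* single-threaded model: no recursive locking */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical, reduced model of the fields the hunk touches. */
struct toy_dev {
	struct toy_lock lock;
	bool idle_notification;
	bool lock_held_in_callback;	/* recorded by the callback */
	bool notifying_in_callback;	/* recorded by the callback */
};

/* Example idle callback: records what it observes, then reports success. */
static int toy_idle_callback(struct toy_dev *dev)
{
	dev->lock_held_in_callback = dev->lock.held;
	dev->notifying_in_callback = dev->idle_notification;
	return 0;
}

/*
 * Same shape as the patched code path: the caller holds the lock, the
 * callback is invoked directly with the lock dropped, and the lock is
 * retaken before idle_notification is cleared.
 */
static int toy_idle(struct toy_dev *dev, int (*callback)(struct toy_dev *))
{
	int retval;

	assert(dev->lock.held);	/* caller must hold the lock on entry */
	dev->idle_notification = true;

	toy_lock_release(&dev->lock);
	retval = callback(dev);
	toy_lock_acquire(&dev->lock);

	dev->idle_notification = false;
	return retval;
}
```

A caller would take the lock, invoke toy_idle(&dev, toy_idle_callback), and
could then check that the callback saw the lock released and
idle_notification set. Calling the callback directly, instead of through a
shared helper, keeps everything the idle path does visible in one place,
which is the readability argument made in comment #5.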