[V7] thermal/core/power_allocator: avoid thermal cdev can not be reset

Message ID 20240110115526.30776-1-di.shen@unisoc.com
State New
Headers
Series [V7] thermal/core/power_allocator: avoid thermal cdev can not be reset |

Commit Message

Di Shen Jan. 10, 2024, 11:55 a.m. UTC
  Commit 0952177f2a1f ("thermal/core/power_allocator: Update once
cooling devices when temp is low") adds an update flag to avoid
the thermal event is triggered when there is no need, and
thermal cdev would be updated once when temperature is low.

But when the trips are writable, and switch_on_temp is set
to be a higher value, the cooling device state may not be
reset to 0, because last_temperature is smaller than the
switch_on_temp.

For example:
First:
switch_on_temp=70 control_temp=85;
Then userspace change the trip_temp:
switch_on_temp=45 control_temp=55 cur_temp=54

Then userspace reset the trip_temp:
switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54

At this time, the cooling device state should be reset to 0.
However, because cur_temp(57) < switch_on_temp(70)
last_temp(54) < switch_on_temp(70)  ---->  update = false,
update is false, the cooling device state can not be reset.

Considering tz->passive can also be represented the temperature
status, this patch modifies the update flag with tz->passive.

When the first time the temperature drops below switch_on, the
states of cooling devices can be reset once, and the tz->passive
is updated to 0. In the next round, because tz->passive is 0,
the cdev->state would not be updated.

By using the tz->passive as the "update" flag, the issue above
can be solved, and the cooling devices can be update only once
when the temperature is low.

Fixes: 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low")
Cc: <stable@vger.kernel.org> # v5.13+
Suggested-by: Wei Wang <wvw@google.com>
Signed-off-by: Di Shen <di.shen@unisoc.com>

---
V7:
- Some formatting changes.
- Add Suggested-by tag.

V6: [6]
Compared to the previous version:
- Not change the thermal core.
- Not add new variables and function.
- Use tz->passive as "update" flag to indicate whether the cooling
  devices should be reset.

V5: [5]
- Simplify the reset ops, make it no return value and no specific
  trip ID as argument.
- Extend the commit message.

V4: [4]
- Compared to V3, handle it in thermal core instead of in governor.
- Add an ops to the governor structure, and call it when a trip
  point is changed.
- Define reset ops for power allocator.

V3: [3]
- Add fix tag.

V2: [2]
- Compared to v1, do not revert.
- Add a variable(last_switch_on_temp) in power_allocator_params
  to record the last switch_on_temp value.
- Adds a function to renew the update flag and update the
  last_switch_on_temp when thermal trips are writable.

V1: [1]
- Revert commit 0952177f2a1f.

[1] https://lore.kernel.org/all/20230309135515.1232-1-di.shen@unisoc.com/
[2] https://lore.kernel.org/all/20230315093008.17489-1-di.shen@unisoc.com/
[3] https://lore.kernel.org/all/20230320095620.7480-1-di.shen@unisoc.com/
[4] https://lore.kernel.org/all/20230619063534.12831-1-di.shen@unisoc.com/
[5] https://lore.kernel.org/all/20230710033234.28641-1-di.shen@unisoc.com/
[6] https://lore.kernel.org/all/20240109112736.32566-1-di.shen@unisoc.com/
---
---
 drivers/thermal/gov_power_allocator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Lukasz Luba Jan. 10, 2024, 1:04 p.m. UTC | #1
On 1/10/24 11:55, Di Shen wrote:
> Commit 0952177f2a1f ("thermal/core/power_allocator: Update once
> cooling devices when temp is low") adds an update flag to avoid
> the thermal event is triggered when there is no need, and
> thermal cdev would be updated once when temperature is low.
> 
> But when the trips are writable, and switch_on_temp is set
> to be a higher value, the cooling device state may not be
> reset to 0, because last_temperature is smaller than the
> switch_on_temp.
> 
> For example:
> First:
> switch_on_temp=70 control_temp=85;
> Then userspace change the trip_temp:
> switch_on_temp=45 control_temp=55 cur_temp=54
> 
> Then userspace reset the trip_temp:
> switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54
> 
> At this time, the cooling device state should be reset to 0.
> However, because cur_temp(57) < switch_on_temp(70)
> last_temp(54) < switch_on_temp(70)  ---->  update = false,
> update is false, the cooling device state can not be reset.
> 
> Considering tz->passive can also be represented the temperature
> status, this patch modifies the update flag with tz->passive.
> 
> When the first time the temperature drops below switch_on, the
> states of cooling devices can be reset once, and the tz->passive
> is updated to 0. In the next round, because tz->passive is 0,
> the cdev->state would not be updated.
> 
> By using the tz->passive as the "update" flag, the issue above
> can be solved, and the cooling devices can be update only once
> when the temperature is low.
> 
> Fixes: 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low")
> Cc: <stable@vger.kernel.org> # v5.13+
> Suggested-by: Wei Wang <wvw@google.com>
> Signed-off-by: Di Shen <di.shen@unisoc.com>
> 
> ---
> V7:
> - Some formatting changes.
> - Add Suggested-by tag.
> 
> V6: [6]
> Compared to the previous version:
> - Not change the thermal core.
> - Not add new variables and function.
> - Use tz->passive as "update" flag to indicate whether the cooling
>    devices should be reset.
> 
> V5: [5]
> - Simplify the reset ops, make it no return value and no specific
>    trip ID as argument.
> - Extend the commit message.
> 
> V4: [4]
> - Compared to V3, handle it in thermal core instead of in governor.
> - Add an ops to the governor structure, and call it when a trip
>    point is changed.
> - Define reset ops for power allocator.
> 
> V3: [3]
> - Add fix tag.
> 
> V2: [2]
> - Compared to v1, do not revert.
> - Add a variable(last_switch_on_temp) in power_allocator_params
>    to record the last switch_on_temp value.
> - Adds a function to renew the update flag and update the
>    last_switch_on_temp when thermal trips are writable.
> 
> V1: [1]
> - Revert commit 0952177f2a1f.
> 
> [1] https://lore.kernel.org/all/20230309135515.1232-1-di.shen@unisoc.com/
> [2] https://lore.kernel.org/all/20230315093008.17489-1-di.shen@unisoc.com/
> [3] https://lore.kernel.org/all/20230320095620.7480-1-di.shen@unisoc.com/
> [4] https://lore.kernel.org/all/20230619063534.12831-1-di.shen@unisoc.com/
> [5] https://lore.kernel.org/all/20230710033234.28641-1-di.shen@unisoc.com/
> [6] https://lore.kernel.org/all/20240109112736.32566-1-di.shen@unisoc.com/
> ---
> ---
>   drivers/thermal/gov_power_allocator.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
> index 7b6aa265ff6a..81e061f183ad 100644
> --- a/drivers/thermal/gov_power_allocator.c
> +++ b/drivers/thermal/gov_power_allocator.c
> @@ -762,7 +762,7 @@ static int power_allocator_throttle(struct thermal_zone_device *tz,
>   
>   	trip = params->trip_switch_on;
>   	if (trip && tz->temperature < trip->temperature) {
> -		update = tz->last_temperature >= trip->temperature;
> +		update = tz->passive;
>   		tz->passive = 0;
>   		reset_pid_controller(params);
>   		allow_maximum_power(tz, update);

Thanks for the patch, LGTM.

Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
  
Rafael J. Wysocki Jan. 10, 2024, 1:30 p.m. UTC | #2
On Wed, Jan 10, 2024 at 2:03 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
>
>
> On 1/10/24 11:55, Di Shen wrote:
> > Commit 0952177f2a1f ("thermal/core/power_allocator: Update once
> > cooling devices when temp is low") adds an update flag to avoid
> > the thermal event is triggered when there is no need, and
> > thermal cdev would be updated once when temperature is low.
> >
> > But when the trips are writable, and switch_on_temp is set
> > to be a higher value, the cooling device state may not be
> > reset to 0, because last_temperature is smaller than the
> > switch_on_temp.
> >
> > For example:
> > First:
> > switch_on_temp=70 control_temp=85;
> > Then userspace change the trip_temp:
> > switch_on_temp=45 control_temp=55 cur_temp=54
> >
> > Then userspace reset the trip_temp:
> > switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54
> >
> > At this time, the cooling device state should be reset to 0.
> > However, because cur_temp(57) < switch_on_temp(70)
> > last_temp(54) < switch_on_temp(70)  ---->  update = false,
> > update is false, the cooling device state can not be reset.
> >
> > Considering tz->passive can also be represented the temperature
> > status, this patch modifies the update flag with tz->passive.
> >
> > When the first time the temperature drops below switch_on, the
> > states of cooling devices can be reset once, and the tz->passive
> > is updated to 0. In the next round, because tz->passive is 0,
> > the cdev->state would not be updated.
> >
> > By using the tz->passive as the "update" flag, the issue above
> > can be solved, and the cooling devices can be update only once
> > when the temperature is low.
> >
> > Fixes: 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low")
> > Cc: <stable@vger.kernel.org> # v5.13+
> > Suggested-by: Wei Wang <wvw@google.com>
> > Signed-off-by: Di Shen <di.shen@unisoc.com>
> >
> > ---
> > V7:
> > - Some formatting changes.
> > - Add Suggested-by tag.
> >
> > V6: [6]
> > Compared to the previous version:
> > - Not change the thermal core.
> > - Not add new variables and function.
> > - Use tz->passive as "update" flag to indicate whether the cooling
> >    devices should be reset.
> >
> > V5: [5]
> > - Simplify the reset ops, make it no return value and no specific
> >    trip ID as argument.
> > - Extend the commit message.
> >
> > V4: [4]
> > - Compared to V3, handle it in thermal core instead of in governor.
> > - Add an ops to the governor structure, and call it when a trip
> >    point is changed.
> > - Define reset ops for power allocator.
> >
> > V3: [3]
> > - Add fix tag.
> >
> > V2: [2]
> > - Compared to v1, do not revert.
> > - Add a variable(last_switch_on_temp) in power_allocator_params
> >    to record the last switch_on_temp value.
> > - Adds a function to renew the update flag and update the
> >    last_switch_on_temp when thermal trips are writable.
> >
> > V1: [1]
> > - Revert commit 0952177f2a1f.
> >
> > [1] https://lore.kernel.org/all/20230309135515.1232-1-di.shen@unisoc.com/
> > [2] https://lore.kernel.org/all/20230315093008.17489-1-di.shen@unisoc.com/
> > [3] https://lore.kernel.org/all/20230320095620.7480-1-di.shen@unisoc.com/
> > [4] https://lore.kernel.org/all/20230619063534.12831-1-di.shen@unisoc.com/
> > [5] https://lore.kernel.org/all/20230710033234.28641-1-di.shen@unisoc.com/
> > [6] https://lore.kernel.org/all/20240109112736.32566-1-di.shen@unisoc.com/
> > ---
> > ---
> >   drivers/thermal/gov_power_allocator.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
> > index 7b6aa265ff6a..81e061f183ad 100644
> > --- a/drivers/thermal/gov_power_allocator.c
> > +++ b/drivers/thermal/gov_power_allocator.c
> > @@ -762,7 +762,7 @@ static int power_allocator_throttle(struct thermal_zone_device *tz,
> >
> >       trip = params->trip_switch_on;
> >       if (trip && tz->temperature < trip->temperature) {
> > -             update = tz->last_temperature >= trip->temperature;
> > +             update = tz->passive;
> >               tz->passive = 0;
> >               reset_pid_controller(params);
> >               allow_maximum_power(tz, update);
>
> Thanks for the patch, LGTM.
>
> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>

Applied as 6.8-rc1 material with modified subject and edited changelog.

Thanks!
  

Patch

diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index 7b6aa265ff6a..81e061f183ad 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -762,7 +762,7 @@  static int power_allocator_throttle(struct thermal_zone_device *tz,
 
 	trip = params->trip_switch_on;
 	if (trip && tz->temperature < trip->temperature) {
-		update = tz->last_temperature >= trip->temperature;
+		update = tz->passive;
 		tz->passive = 0;
 		reset_pid_controller(params);
 		allow_maximum_power(tz, update);