[V5] thermal/core/power_allocator: reset thermal governor when trip point is changed
Message ID | 20230710033234.28641-1-di.shen@unisoc.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f45:0:b0:3ea:f831:8777 with SMTP id v5csp4773241vqx; Sun, 9 Jul 2023 21:30:24 -0700 (PDT) X-Google-Smtp-Source: APBJJlGFl6o1OqINInHMcz1YmCOEHdyyT0WcdemCv9/Q+cIW32Ipj0tU65Egp5GE9XiPVhQ22sd+ X-Received: by 2002:a05:6808:14c3:b0:3a4:8b8:ce38 with SMTP id f3-20020a05680814c300b003a408b8ce38mr1555017oiw.54.1688963424307; Sun, 09 Jul 2023 21:30:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1688963424; cv=none; d=google.com; s=arc-20160816; b=sDkAljoD1O43MtLknL3F8NHFj/c4UZYBbZcMR8y+5h57iHS6puQUjmm/uvyQa0Tmxs EVZsZh/0QEr2w3qmw/2L6PwXwWANZL3iRGTvc+i4wD/8Y3AzIDl4xS93wGE9ez+AiFcZ 7h9WrpYcTtOv3OO2Z+xjivQ2szt/9SqIJvwaUHjU0M650KYq64fxFTnhc7Q6B3lQjHbr iG81E4LrtkVyQTjzXiWc454XPmjia4jIaqVU43pM2/LkNEEEDE5OEgT0LIs5EonN0IoT Ua6Mojhdkx4Eq9FOfmqfb/V3dF3dCJjQRSM9rcKPIs3JRJkMmJ8FSms53mGbzO86khRH 44Ow== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:subject:cc:to:from; bh=jTxYazJh+hLnK61EPaI71n+49ZgW3X3uEokilD9/wrE=; fh=uXPQ136T4Y2CbQNlL80bUwL/uE5WspNd5WGIm/0JykE=; b=T+Tv4YSGIPvEqYod/4YCziHmD6iWRpNTKERZS1WvZKkiBMwFQOr22xDie7v9RyWMKT A3anVVjXA62Vt7wYiBONKyaW/0r+2Rk/vJ4kPJtwpc4+93rzBE4d3HW3TqACGVBnDA6N OIgOIw1MN9J7cTYuU2XIEkv0hfjNSj7TQwkKgJa5sfiJJOSqACyBF8jGPVA2ntIpeuWu Wllk1EKp4KsIcux2OA+CWwgS6Ww9vfmmesEee7+SV8rXOdsZ3DuMzeinmSHA77vylf3R 19AoglJCrAhR9dv3hzHhooKgHEHwhpD9bpJXoAZy3C9V/2zK/sJFFy+cLTr4VS3oLZve iipg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id k30-20020a63561e000000b0055ac8f5d5b1si8449953pgb.164.2023.07.09.21.30.10; Sun, 09 Jul 2023 21:30:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229866AbjGJDd2 (ORCPT <rfc822;ybw1215001957@gmail.com> + 99 others); Sun, 9 Jul 2023 23:33:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34202 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229554AbjGJDdZ (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Sun, 9 Jul 2023 23:33:25 -0400 Received: from SHSQR01.spreadtrum.com (mx1.unisoc.com [222.66.158.135]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2B23F9; Sun, 9 Jul 2023 20:33:20 -0700 (PDT) Received: from dlp.unisoc.com ([10.29.3.86]) by SHSQR01.spreadtrum.com with ESMTP id 36A3Wd9E089360; Mon, 10 Jul 2023 11:32:39 +0800 (+08) (envelope-from Di.Shen@unisoc.com) Received: from SHDLP.spreadtrum.com (bjmbx01.spreadtrum.com [10.0.64.7]) by dlp.unisoc.com (SkyGuard) with ESMTPS id 4QzqL567v8z2LcN4n; Mon, 10 Jul 2023 11:31:41 +0800 (CST) Received: from bj10906pcu1.spreadtrum.com (10.0.73.63) by BJMBX01.spreadtrum.com (10.0.64.7) with Microsoft SMTP Server (TLS) id 15.0.1497.23; Mon, 10 Jul 2023 11:32:37 +0800 From: Di Shen <di.shen@unisoc.com> To: <lukasz.luba@arm.com>, <rafael@kernel.org>, <daniel.lezcano@linaro.org> CC: <amitk@kernel.org>, <rui.zhang@intel.com>, <linux-pm@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <xuewen.yan@unisoc.com>, <jeson.gao@unisoc.com>, <orsonzhai@gmail.com>, <zhanglyra@gmail.com>, <di.shen@unisoc.com> Subject: [PATCH V5] thermal/core/power_allocator: reset thermal governor when trip point is changed Date: Mon, 10 Jul 2023 11:32:34 +0800 Message-ID: <20230710033234.28641-1-di.shen@unisoc.com> X-Mailer: git-send-email 2.17.1 MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.0.73.63] X-ClientProxiedBy: SHCAS01.spreadtrum.com (10.0.1.201) To BJMBX01.spreadtrum.com (10.0.64.7) X-MAIL: SHSQR01.spreadtrum.com 36A3Wd9E089360 X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771006511888749844 X-GMAIL-MSGID: 1771006511888749844 |
Series |
[V5] thermal/core/power_allocator: reset thermal governor when trip point is changed
|
|
Commit Message
Di Shen
July 10, 2023, 3:32 a.m. UTC
When the thermal trip point is changed, the governor should
be reset so that the policy algorithm can be updated to adapt
to the new trip point.
1.Thermal governor is working for the passive trip point or active
trip point. Therefore, when the trip point is changed it should
reset the governor only for passic/active trip points.
2.For "power_allocator" governor reset, the parameters of pid
controller and the states of power cooling devices should be reset.
2.1
The IPA controls temperature using PID mechanism. It is a closed-
loop feedback monitoring system. It uses the gap between target
temperature and current temperature which says "error" as the
input of the PID controller:
err = desired_temperature - current_temperature
P_max =
k_p * err + k_i * err_integral + k_d * diff_err + sustainable_power
That algorithm can 'learn' from the 'observed' in the past reaction
for it's control decisions and accumulates that information in the
I(Integral) part so that it can conpensate for those 'learned'
mistakes.
Based on the above, the most important is the desired temperature
comes from the trip point. When the trip point is changed, all the
previous learned errors won't help for the IPA. So the pid parameters
should be reset for IPA governor reset.
2.2
Other wise, the cooling devices should also be reset when the trip
point is changed.
This patch adds an ops for the thermal_governor structure to reset
the governor and give the 'reset' function definition for power
allocator. The ops is called when the trip points are changed via
sysfs interface.
Signed-off-by: Di Shen <di.shen@unisoc.com>
---
drivers/thermal/gov_power_allocator.c | 9 +++++++++
drivers/thermal/thermal_trip.c | 5 +++++
include/linux/thermal.h | 3 +++
3 files changed, 17 insertions(+)
Comments
On Mon, Jul 10, 2023 at 11:38 AM Di Shen <di.shen@unisoc.com> wrote: > > When the thermal trip point is changed, the governor should > be reset so that the policy algorithm can be updated to adapt > to the new trip point. > > 1.Thermal governor is working for the passive trip point or active > trip point. Therefore, when the trip point is changed it should > reset the governor only for passic/active trip points. > > 2.For "power_allocator" governor reset, the parameters of pid > controller and the states of power cooling devices should be reset. > > 2.1 > The IPA controls temperature using PID mechanism. It is a closed- > loop feedback monitoring system. It uses the gap between target > temperature and current temperature which says "error" as the > input of the PID controller: > > err = desired_temperature - current_temperature > P_max = > k_p * err + k_i * err_integral + k_d * diff_err + sustainable_power > > That algorithm can 'learn' from the 'observed' in the past reaction > for it's control decisions and accumulates that information in the > I(Integral) part so that it can conpensate for those 'learned' > mistakes. > > Based on the above, the most important is the desired temperature > comes from the trip point. When the trip point is changed, all the > previous learned errors won't help for the IPA. So the pid parameters > should be reset for IPA governor reset. > > 2.2 > Other wise, the cooling devices should also be reset when the trip > point is changed. > > This patch adds an ops for the thermal_governor structure to reset > the governor and give the 'reset' function definition for power > allocator. The ops is called when the trip points are changed via > sysfs interface. > > Signed-off-by: Di Shen <di.shen@unisoc.com> > --- > drivers/thermal/gov_power_allocator.c | 9 +++++++++ > drivers/thermal/thermal_trip.c | 5 +++++ > include/linux/thermal.h | 3 +++ > 3 files changed, 17 insertions(+) > > diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c > index 8642f1096b91..8d17a10671e4 100644 > --- a/drivers/thermal/gov_power_allocator.c > +++ b/drivers/thermal/gov_power_allocator.c > @@ -729,10 +729,19 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip_id) > return allocate_power(tz, trip.temperature); > } > > +static void power_allocator_reset(struct thermal_zone_device *tz) > +{ > + struct power_allocator_params *params = tz->governor_data; > + > + reset_pid_controller(params); > + allow_maximum_power(tz, true); > +} > + > static struct thermal_governor thermal_gov_power_allocator = { > .name = "power_allocator", > .bind_to_tz = power_allocator_bind, > .unbind_from_tz = power_allocator_unbind, > .throttle = power_allocator_throttle, > + .reset = power_allocator_reset, > }; > THERMAL_GOVERNOR_DECLARE(thermal_gov_power_allocator); > diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c > index 907f3a4d7bc8..13bbe029c6ab 100644 > --- a/drivers/thermal/thermal_trip.c > +++ b/drivers/thermal/thermal_trip.c > @@ -173,6 +173,11 @@ int thermal_zone_set_trip(struct thermal_zone_device *tz, int trip_id, > if (tz->trips && (t.temperature != trip->temperature || t.hysteresis != trip->hysteresis)) > tz->trips[trip_id] = *trip; > > + /* Reset the governor only when the trip type is active or passive. */ > + ret = (trip->type == THERMAL_TRIP_PASSIVE || trip->type == THERMAL_TRIP_ACTIVE); > + if (ret && t.temperature != trip->temperature && tz->governor && tz->governor->reset) > + tz->governor->reset(tz); > + > thermal_notify_tz_trip_change(tz->id, trip_id, trip->type, > trip->temperature, trip->hysteresis); > > diff --git a/include/linux/thermal.h b/include/linux/thermal.h > index 87837094d549..d27d053319bf 100644 > --- a/include/linux/thermal.h > +++ b/include/linux/thermal.h > @@ -197,6 +197,8 @@ struct thermal_zone_device { > * thermal zone. > * @throttle: callback called for every trip point even if temperature is > * below the trip point temperature > + * @reset: callback called for governor reset > + * > * @governor_list: node in thermal_governor_list (in thermal_core.c) > */ > struct thermal_governor { > @@ -204,6 +206,7 @@ struct thermal_governor { > int (*bind_to_tz)(struct thermal_zone_device *tz); > void (*unbind_from_tz)(struct thermal_zone_device *tz); > int (*throttle)(struct thermal_zone_device *tz, int trip); > + void (*reset)(struct thermal_zone_device *tz); > struct list_head governor_list; > }; > > -- > 2.17.1 > Sorry, the change log was not successfully saved to the commit message. Add it as following here: --- V5: - Simplify the reset ops, make it no return value and no specific trip ID as argument. - Extend the commit message. V4: [4] - Compared to V3, handle it in thermal core instead of in governor. - Add an ops to the governor structure, and call it when a trip point is changed - Define reset ops for power allocator. V3: [3] - Add fix tag. V2: [2] - Compared to v1, do not revert. - Add a variable(last_switch_on_temp) in power_allocator_params to record the last switch_on_temp value. - Adds a function to renew the update flag and update the last_switch_on_temp when thermal trips are writable. V1: [1] - Revert commit 0952177f2a1f. [1] https://lore.kernel.org/all/20230309135515.1232-1-di.shen@unisoc.com/ [2] https://lore.kernel.org/all/20230315093008.17489-1-di.shen@unisoc.com/ [3] https://lore.kernel.org/all/20230320095620.7480-1-di.shen@unisoc.com/ [4] https://lore.kernel.org/all/20230619063534.12831-1-di.shen@unisoc.com/ ---
On 10/07/2023 05:32, Di Shen wrote: > When the thermal trip point is changed, the governor should > be reset so that the policy algorithm can be updated to adapt > to the new trip point. > > 1.Thermal governor is working for the passive trip point or active > trip point. Therefore, when the trip point is changed it should > reset the governor only for passic/active trip points. > > 2.For "power_allocator" governor reset, the parameters of pid > controller and the states of power cooling devices should be reset. > > 2.1 > The IPA controls temperature using PID mechanism. It is a closed- > loop feedback monitoring system. It uses the gap between target > temperature and current temperature which says "error" as the > input of the PID controller: > > err = desired_temperature - current_temperature > P_max = > k_p * err + k_i * err_integral + k_d * diff_err + sustainable_power > > That algorithm can 'learn' from the 'observed' in the past reaction > for it's control decisions and accumulates that information in the > I(Integral) part so that it can conpensate for those 'learned' > mistakes. > > Based on the above, the most important is the desired temperature > comes from the trip point. When the trip point is changed, all the > previous learned errors won't help for the IPA. So the pid parameters > should be reset for IPA governor reset. > > 2.2 > Other wise, the cooling devices should also be reset when the trip > point is changed. > > This patch adds an ops for the thermal_governor structure to reset > the governor and give the 'reset' function definition for power > allocator. The ops is called when the trip points are changed via > sysfs interface. > > Signed-off-by: Di Shen <di.shen@unisoc.com> > --- > drivers/thermal/gov_power_allocator.c | 9 +++++++++ > drivers/thermal/thermal_trip.c | 5 +++++ > include/linux/thermal.h | 3 +++ > 3 files changed, 17 insertions(+) > > diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c > index 8642f1096b91..8d17a10671e4 100644 > --- a/drivers/thermal/gov_power_allocator.c > +++ b/drivers/thermal/gov_power_allocator.c > @@ -729,10 +729,19 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip_id) > return allocate_power(tz, trip.temperature); > } > > +static void power_allocator_reset(struct thermal_zone_device *tz) > +{ > + struct power_allocator_params *params = tz->governor_data; > + > + reset_pid_controller(params); > + allow_maximum_power(tz, true); Do you really want to allow the maximum power? What about if the trip temperature is decreased ? You want maximum power only if the mitigation ends. > +} > + > static struct thermal_governor thermal_gov_power_allocator = { > .name = "power_allocator", > .bind_to_tz = power_allocator_bind, > .unbind_from_tz = power_allocator_unbind, > .throttle = power_allocator_throttle, > + .reset = power_allocator_reset, > }; > THERMAL_GOVERNOR_DECLARE(thermal_gov_power_allocator); > diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c > index 907f3a4d7bc8..13bbe029c6ab 100644 > --- a/drivers/thermal/thermal_trip.c > +++ b/drivers/thermal/thermal_trip.c > @@ -173,6 +173,11 @@ int thermal_zone_set_trip(struct thermal_zone_device *tz, int trip_id, > if (tz->trips && (t.temperature != trip->temperature || t.hysteresis != trip->hysteresis)) > tz->trips[trip_id] = *trip; > > + /* Reset the governor only when the trip type is active or passive. */ > + ret = (trip->type == THERMAL_TRIP_PASSIVE || trip->type == THERMAL_TRIP_ACTIVE); Actually we have the trip points: ACTIVE, PASSIVE, HOT and CRITICAL The last two ones should not be writable. Instead of this test, it would be cleaner to only make the ACTIVE and PASSIVE trip point writable when the CONFIG_THERMAL_WRITABLE_TRIPS option is set. Consequently, other trip points won't be writable and this test can go away as set_trip will be protected by a RO sysfs file property. > + if (ret && t.temperature != trip->temperature && tz->governor && tz->governor->reset) The temperature test is duplicated because it is already done in the block before. > + tz->governor->reset(tz); > + > thermal_notify_tz_trip_change(tz->id, trip_id, trip->type, > trip->temperature, trip->hysteresis); > > diff --git a/include/linux/thermal.h b/include/linux/thermal.h > index 87837094d549..d27d053319bf 100644 > --- a/include/linux/thermal.h > +++ b/include/linux/thermal.h > @@ -197,6 +197,8 @@ struct thermal_zone_device { > * thermal zone. > * @throttle: callback called for every trip point even if temperature is > * below the trip point temperature > + * @reset: callback called for governor reset > + * > * @governor_list: node in thermal_governor_list (in thermal_core.c) > */ > struct thermal_governor { > @@ -204,6 +206,7 @@ struct thermal_governor { > int (*bind_to_tz)(struct thermal_zone_device *tz); > void (*unbind_from_tz)(struct thermal_zone_device *tz); > int (*throttle)(struct thermal_zone_device *tz, int trip); > + void (*reset)(struct thermal_zone_device *tz); > struct list_head governor_list; > }; >
Hi Daniel, Thank you for your reply. On Mon, Jul 10, 2023 at 4:59 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > On 10/07/2023 05:32, Di Shen wrote: > > When the thermal trip point is changed, the governor should > > be reset so that the policy algorithm can be updated to adapt > > to the new trip point. > > > > 1.Thermal governor is working for the passive trip point or active > > trip point. Therefore, when the trip point is changed it should > > reset the governor only for passic/active trip points. > > > > 2.For "power_allocator" governor reset, the parameters of pid > > controller and the states of power cooling devices should be reset. > > > > 2.1 > > The IPA controls temperature using PID mechanism. It is a closed- > > loop feedback monitoring system. It uses the gap between target > > temperature and current temperature which says "error" as the > > input of the PID controller: > > > > err = desired_temperature - current_temperature > > P_max = > > k_p * err + k_i * err_integral + k_d * diff_err + sustainable_power > > > > That algorithm can 'learn' from the 'observed' in the past reaction > > for it's control decisions and accumulates that information in the > > I(Integral) part so that it can conpensate for those 'learned' > > mistakes. > > > > Based on the above, the most important is the desired temperature > > comes from the trip point. When the trip point is changed, all the > > previous learned errors won't help for the IPA. So the pid parameters > > should be reset for IPA governor reset. > > > > 2.2 > > Other wise, the cooling devices should also be reset when the trip > > point is changed. > > > > This patch adds an ops for the thermal_governor structure to reset > > the governor and give the 'reset' function definition for power > > allocator. The ops is called when the trip points are changed via > > sysfs interface. > > > > Signed-off-by: Di Shen <di.shen@unisoc.com> > > --- > > drivers/thermal/gov_power_allocator.c | 9 +++++++++ > > drivers/thermal/thermal_trip.c | 5 +++++ > > include/linux/thermal.h | 3 +++ > > 3 files changed, 17 insertions(+) > > > > diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c > > index 8642f1096b91..8d17a10671e4 100644 > > --- a/drivers/thermal/gov_power_allocator.c > > +++ b/drivers/thermal/gov_power_allocator.c > > @@ -729,10 +729,19 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip_id) > > return allocate_power(tz, trip.temperature); > > } > > > > +static void power_allocator_reset(struct thermal_zone_device *tz) > > +{ > > + struct power_allocator_params *params = tz->governor_data; > > + > > + reset_pid_controller(params); > > + allow_maximum_power(tz, true); > > Do you really want to allow the maximum power? What about if the trip > temperature is decreased ? > If the trip temperature is decreased, allow_maximum_power will only be executed once, and then the ipa governor will adapt to the lower trip temperature and calculate the allocated power for cooling actors again. Right? But if the trip temperature is increased, it must allow the maximum power, otherwise, the Line 723 update flag might be false(see the false example in patch-v3), the power of cooling devices would be not allowed maximum for a while unless the current temperature is higher than the new switch_on_temp. In this situation, for cpufreq cooling devices, it means the cpu max_freq is limited, which is bad for device performance. The same for other cooling devices, if not reset the cooling state, it may introduce unexpected issues. 705 static int power_allocator_throttle(struct thermal_zone_device *tz, int trip_id) 706 { 707 struct power_allocator_params *params = tz->governor_data; 708 struct thermal_trip trip; 709 int ret; 710 bool update; 711 712 //...... 721 ret = __thermal_zone_get_trip(tz, params->trip_switch_on, &trip); 722 if (!ret && (tz->temperature < trip.temperature)) { 723 update = (tz->last_temperature >= trip.temperature); 724 tz->passive = 0; 725 reset_pid_controller(params); 726 allow_maximum_power(tz, update); 727 return 0; 728 } 729 730 tz->passive = 1; 731 //....... 740 } > You want maximum power only if the mitigation ends. > Yes, you're right, I agree. It's better to do like that, actually, patch-v3 is in line with this point I think, but it is not in the thermal core. Oh, I see! How about modified like that: static void power_allocator_reset(struct thermal_zone_device *tz) { struct power_allocator_params *params = tz->governor_data; + struct thermal_trip trip; + int ret; reset_pid_controller(params); + ret = __thermal_zone_get_trip(tz, params->trip_switch_on, &trip); + if (!ret && (tz->temperature < trip.temperature)) allow_maximum_power(tz, true); } It seems clearer. > > +} > > + > > static struct thermal_governor thermal_gov_power_allocator = { > > .name = "power_allocator", > > .bind_to_tz = power_allocator_bind, > > .unbind_from_tz = power_allocator_unbind, > > .throttle = power_allocator_throttle, > > + .reset = power_allocator_reset, > > }; > > THERMAL_GOVERNOR_DECLARE(thermal_gov_power_allocator); > > diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c > > index 907f3a4d7bc8..13bbe029c6ab 100644 > > --- a/drivers/thermal/thermal_trip.c > > +++ b/drivers/thermal/thermal_trip.c > > @@ -173,6 +173,11 @@ int thermal_zone_set_trip(struct thermal_zone_device *tz, int trip_id, > > if (tz->trips && (t.temperature != trip->temperature || t.hysteresis != trip->hysteresis)) > > tz->trips[trip_id] = *trip; > > > > + /* Reset the governor only when the trip type is active or passive. */ > > + ret = (trip->type == THERMAL_TRIP_PASSIVE || trip->type == THERMAL_TRIP_ACTIVE); > > Actually we have the trip points: > > ACTIVE, PASSIVE, HOT and CRITICAL > Correct. > The last two ones should not be writable. > > Instead of this test, it would be cleaner to only make the ACTIVE and > PASSIVE trip point writable when the CONFIG_THERMAL_WRITABLE_TRIPS > option is set. Consequently, other trip points won't be writable and > this test can go away as set_trip will be protected by a RO sysfs file > property. > > > + if (ret && t.temperature != trip->temperature && tz->governor && tz->governor->reset) > > The temperature test is duplicated because it is already done in the > block before. > So here you mean two points: 1. Make the HOT and CRITICAL point RO so that it can no trip type test when gov->reset 2. not do the temperature test twice Right? For the first point, it's OK for me. But what if someone wants to power off the device earlier or later? I think it should have the access to update the critical temperature. I'm not really know about the hot trip temperature. So I'm not sure the real reason to make them RO when the trip points are writable. I can't make it RO just because of the code simplification. For the second point, that's OK. I will simplify in the next patch version. Thank you, Daniel. > > + tz->governor->reset(tz); > > + > > thermal_notify_tz_trip_change(tz->id, trip_id, trip->type, > > trip->temperature, trip->hysteresis); > > > > diff --git a/include/linux/thermal.h b/include/linux/thermal.h > > index 87837094d549..d27d053319bf 100644 > > --- a/include/linux/thermal.h > > +++ b/include/linux/thermal.h > > @@ -197,6 +197,8 @@ struct thermal_zone_device { > > * thermal zone. > > * @throttle: callback called for every trip point even if temperature is > > * below the trip point temperature > > + * @reset: callback called for governor reset > > + * > > * @governor_list: node in thermal_governor_list (in thermal_core.c) > > */ > > struct thermal_governor { > > @@ -204,6 +206,7 @@ struct thermal_governor { > > int (*bind_to_tz)(struct thermal_zone_device *tz); > > void (*unbind_from_tz)(struct thermal_zone_device *tz); > > int (*throttle)(struct thermal_zone_device *tz, int trip); > > + void (*reset)(struct thermal_zone_device *tz); > > struct list_head governor_list; > > }; > > > > -- > <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs > > Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | > <http://twitter.com/#!/linaroorg> Twitter | > <http://www.linaro.org/linaro-blog/> Blog > Best regards, Di
Hi Di, On 10/07/2023 15:36, Di Shen wrote: > Hi Daniel, > Thank you for your reply. > On Mon, Jul 10, 2023 at 4:59 PM Daniel Lezcano > <daniel.lezcano@linaro.org> wrote: >> >> On 10/07/2023 05:32, Di Shen wrote: >>> When the thermal trip point is changed, the governor should >>> be reset so that the policy algorithm can be updated to adapt >>> to the new trip point. >>> >>> 1.Thermal governor is working for the passive trip point or active >>> trip point. Therefore, when the trip point is changed it should >>> reset the governor only for passic/active trip points. >>> >>> 2.For "power_allocator" governor reset, the parameters of pid >>> controller and the states of power cooling devices should be reset. >>> >>> 2.1 >>> The IPA controls temperature using PID mechanism. It is a closed- >>> loop feedback monitoring system. It uses the gap between target >>> temperature and current temperature which says "error" as the >>> input of the PID controller: >>> >>> err = desired_temperature - current_temperature >>> P_max = >>> k_p * err + k_i * err_integral + k_d * diff_err + sustainable_power >>> >>> That algorithm can 'learn' from the 'observed' in the past reaction >>> for it's control decisions and accumulates that information in the >>> I(Integral) part so that it can conpensate for those 'learned' >>> mistakes. >>> >>> Based on the above, the most important is the desired temperature >>> comes from the trip point. When the trip point is changed, all the >>> previous learned errors won't help for the IPA. So the pid parameters >>> should be reset for IPA governor reset. >>> >>> 2.2 >>> Other wise, the cooling devices should also be reset when the trip >>> point is changed. >>> >>> This patch adds an ops for the thermal_governor structure to reset >>> the governor and give the 'reset' function definition for power >>> allocator. The ops is called when the trip points are changed via >>> sysfs interface. >>> >>> Signed-off-by: Di Shen <di.shen@unisoc.com> >>> --- >>> drivers/thermal/gov_power_allocator.c | 9 +++++++++ >>> drivers/thermal/thermal_trip.c | 5 +++++ >>> include/linux/thermal.h | 3 +++ >>> 3 files changed, 17 insertions(+) >>> >>> diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c >>> index 8642f1096b91..8d17a10671e4 100644 >>> --- a/drivers/thermal/gov_power_allocator.c >>> +++ b/drivers/thermal/gov_power_allocator.c >>> @@ -729,10 +729,19 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip_id) >>> return allocate_power(tz, trip.temperature); >>> } >>> >>> +static void power_allocator_reset(struct thermal_zone_device *tz) >>> +{ >>> + struct power_allocator_params *params = tz->governor_data; >>> + >>> + reset_pid_controller(params); >>> + allow_maximum_power(tz, true); >> >> Do you really want to allow the maximum power? What about if the trip >> temperature is decreased ? >> > If the trip temperature is decreased, allow_maximum_power will only > be executed once, and then the ipa governor will adapt to the lower trip > temperature and calculate the allocated power for cooling actors again. > Right? Sorry for jumping in this fifth version but I'm not sure about resetting the change is the right way (and probably, changing a trip point with the power allocator is not a good idea) The platforms where the IPA is planned to be used are creating a dummy trip point where the IPA begins the acquisition without cooling devices in order to have the math building the PID schema (eg. hi3660.dtsi). What about the sustainable power vs the trip point temperature? I mean we can change the trip temperature but not the sustainable power which is directly related to the target temperature. So the resulting power computation will be wrong. The more I think about that, the more I do believe writable trip point and IPA are incompatible. What about forbid that? For instance, add a set_trip callback instead of resetting in the governor and return -EPERM from the IPA? Lukasz ? > But if the trip temperature is increased, it must allow the maximum power, I would not say allow the maximum temperature but the new power head room. But that needs an update of the sustainable power information related to the new temperature :/ > otherwise, the Line 723 update flag might be false(see the false example > in patch-v3), the power of cooling devices would be not allowed maximum > for a while unless the current temperature is higher than the new > switch_on_temp. In this situation, for cpufreq cooling devices, it means the > cpu max_freq is limited, which is bad for device performance. The same for > other cooling devices, if not reset the cooling state, it may introduce > unexpected issues. > > 705 static int power_allocator_throttle(struct thermal_zone_device > *tz, int trip_id) > 706 { > 707 struct power_allocator_params *params = tz->governor_data; > 708 struct thermal_trip trip; > 709 int ret; > 710 bool update; > 711 > 712 //...... > 721 ret = __thermal_zone_get_trip(tz, params->trip_switch_on, > &trip); > 722 if (!ret && (tz->temperature < trip.temperature)) { > 723 update = (tz->last_temperature >= > trip.temperature); > 724 tz->passive = 0; > 725 reset_pid_controller(params); > 726 allow_maximum_power(tz, update); > 727 return 0; > 728 } > 729 > 730 tz->passive = 1; > 731 //....... > 740 } > >> You want maximum power only if the mitigation ends. >> > Yes, you're right, I agree. It's better to do like that, actually, > patch-v3 is in line > with this point I think, but it is not in the thermal core. > > Oh, I see! > How about modified like that: > static void power_allocator_reset(struct thermal_zone_device *tz) > { > struct power_allocator_params *params = tz->governor_data; > + struct thermal_trip trip; > + int ret; > > reset_pid_controller(params); > + ret = __thermal_zone_get_trip(tz, params->trip_switch_on, &trip); > + if (!ret && (tz->temperature < trip.temperature)) > allow_maximum_power(tz, true); No, that would be wrong. It should be (tz->temperature - hysteresis) < trip.temperature But if we are in the hysteresis area, we should allow a bit more power which depends on the sustainable power which was not updated with the trip temperature change :/ > } > It seems clearer. > >>> +} >>> + >>> static struct thermal_governor thermal_gov_power_allocator = { >>> .name = "power_allocator", >>> .bind_to_tz = power_allocator_bind, >>> .unbind_from_tz = power_allocator_unbind, >>> .throttle = power_allocator_throttle, >>> + .reset = power_allocator_reset, >>> }; >>> THERMAL_GOVERNOR_DECLARE(thermal_gov_power_allocator); >>> diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c >>> index 907f3a4d7bc8..13bbe029c6ab 100644 >>> --- a/drivers/thermal/thermal_trip.c >>> +++ b/drivers/thermal/thermal_trip.c >>> @@ -173,6 +173,11 @@ int thermal_zone_set_trip(struct thermal_zone_device *tz, int trip_id, >>> if (tz->trips && (t.temperature != trip->temperature || t.hysteresis != trip->hysteresis)) >>> tz->trips[trip_id] = *trip; >>> >>> + /* Reset the governor only when the trip type is active or passive. */ >>> + ret = (trip->type == THERMAL_TRIP_PASSIVE || trip->type == THERMAL_TRIP_ACTIVE); >> >> Actually we have the trip points: >> >> ACTIVE, PASSIVE, HOT and CRITICAL >> > Correct. > >> The last two ones should not be writable. >> >> Instead of this test, it would be cleaner to only make the ACTIVE and >> PASSIVE trip point writable when the CONFIG_THERMAL_WRITABLE_TRIPS >> option is set. Consequently, other trip points won't be writable and >> this test can go away as set_trip will be protected by a RO sysfs file >> property. >> >>> + if (ret && t.temperature != trip->temperature && tz->governor && tz->governor->reset) >> >> The temperature test is duplicated because it is already done in the >> block before. >> > So here you mean two points: > 1. Make the HOT and CRITICAL point RO so that it can no trip type test > when gov->reset > 2. not do the temperature test twice > Right? HOT and CRITICAL can not be changed because RO. The core code can use this assumption by not checking they are changed. > For the first point, it's OK for me. But what if someone wants to > power off the device earlier > or later? I think it should have the access to update the critical > temperature. > > I'm not really > know about the hot trip temperature. So I'm not sure the real reason > to make them RO > when the trip points are writable. I can't make it RO just because of > the code simplification. That is critical firmware information and it must not be touched by the user (even root). Imagine I can set 250°C for the critical trip point while the silicon supports 120°C only? So basically, we should protect these trip points by making them RO. By fixing this, only PASSIVE and ACTIVE trip points can be changed, thus checking CRITICAL and HOT above is pointless. > For the second point, that's OK. I will simplify in the next patch > version. Thank you, Daniel. Before sending a new version, IMO we should think a bit about writable trip points in general because the feature does not look mature. >>> + tz->governor->reset(tz); >>> + >>> thermal_notify_tz_trip_change(tz->id, trip_id, trip->type, >>> trip->temperature, trip->hysteresis); >>> >>> diff --git a/include/linux/thermal.h b/include/linux/thermal.h >>> index 87837094d549..d27d053319bf 100644 >>> --- a/include/linux/thermal.h >>> +++ b/include/linux/thermal.h >>> @@ -197,6 +197,8 @@ struct thermal_zone_device { >>> * thermal zone. >>> * @throttle: callback called for every trip point even if temperature is >>> * below the trip point temperature >>> + * @reset: callback called for governor reset >>> + * >>> * @governor_list: node in thermal_governor_list (in thermal_core.c) >>> */ >>> struct thermal_governor { >>> @@ -204,6 +206,7 @@ struct thermal_governor { >>> int (*bind_to_tz)(struct thermal_zone_device *tz); >>> void (*unbind_from_tz)(struct thermal_zone_device *tz); >>> int (*throttle)(struct thermal_zone_device *tz, int trip); >>> + void (*reset)(struct thermal_zone_device *tz); >>> struct list_head governor_list; >>> }; >>> >> >> -- >> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs >> >> Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | >> <http://twitter.com/#!/linaroorg> Twitter | >> <http://www.linaro.org/linaro-blog/> Blog >> > > Best regards, > Di
Hi Daniel, On Tue, Jul 11, 2023 at 3:33 AM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > > Hi Di, > > > On 10/07/2023 15:36, Di Shen wrote: > > Hi Daniel, > > Thank you for your reply. > > On Mon, Jul 10, 2023 at 4:59 PM Daniel Lezcano > > <daniel.lezcano@linaro.org> wrote: > >> > >> On 10/07/2023 05:32, Di Shen wrote: > >>> When the thermal trip point is changed, the governor should > >>> be reset so that the policy algorithm can be updated to adapt > >>> to the new trip point. > >>> > >>> 1.Thermal governor is working for the passive trip point or active > >>> trip point. Therefore, when the trip point is changed it should > >>> reset the governor only for passic/active trip points. > >>> > >>> 2.For "power_allocator" governor reset, the parameters of pid > >>> controller and the states of power cooling devices should be reset. > >>> > >>> 2.1 > >>> The IPA controls temperature using PID mechanism. It is a closed- > >>> loop feedback monitoring system. It uses the gap between target > >>> temperature and current temperature which says "error" as the > >>> input of the PID controller: > >>> > >>> err = desired_temperature - current_temperature > >>> P_max = > >>> k_p * err + k_i * err_integral + k_d * diff_err + sustainable_power > >>> > >>> That algorithm can 'learn' from the 'observed' in the past reaction > >>> for it's control decisions and accumulates that information in the > >>> I(Integral) part so that it can conpensate for those 'learned' > >>> mistakes. > >>> > >>> Based on the above, the most important is the desired temperature > >>> comes from the trip point. When the trip point is changed, all the > >>> previous learned errors won't help for the IPA. So the pid parameters > >>> should be reset for IPA governor reset. > >>> > >>> 2.2 > >>> Other wise, the cooling devices should also be reset when the trip > >>> point is changed. > >>> > >>> This patch adds an ops for the thermal_governor structure to reset > >>> the governor and give the 'reset' function definition for power > >>> allocator. The ops is called when the trip points are changed via > >>> sysfs interface. > >>> > >>> Signed-off-by: Di Shen <di.shen@unisoc.com> > >>> --- > >>> drivers/thermal/gov_power_allocator.c | 9 +++++++++ > >>> drivers/thermal/thermal_trip.c | 5 +++++ > >>> include/linux/thermal.h | 3 +++ > >>> 3 files changed, 17 insertions(+) > >>> > >>> diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c > >>> index 8642f1096b91..8d17a10671e4 100644 > >>> --- a/drivers/thermal/gov_power_allocator.c > >>> +++ b/drivers/thermal/gov_power_allocator.c > >>> @@ -729,10 +729,19 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip_id) > >>> return allocate_power(tz, trip.temperature); > >>> } > >>> > >>> +static void power_allocator_reset(struct thermal_zone_device *tz) > >>> +{ > >>> + struct power_allocator_params *params = tz->governor_data; > >>> + > >>> + reset_pid_controller(params); > >>> + allow_maximum_power(tz, true); > >> > >> Do you really want to allow the maximum power? What about if the trip > >> temperature is decreased ? > >> > > If the trip temperature is decreased, allow_maximum_power will only > > be executed once, and then the ipa governor will adapt to the lower trip > > temperature and calculate the allocated power for cooling actors again. > > Right? > > Sorry for jumping in this fifth version but I'm not sure about resetting > the change is the right way (and probably, changing a trip point with > the power allocator is not a good idea) > > The platforms where the IPA is planned to be used are creating a dummy > trip point where the IPA begins the acquisition without cooling devices > in order to have the math building the PID schema (eg. hi3660.dtsi). > > What about the sustainable power vs the trip point temperature? I mean > we can change the trip temperature but not the sustainable power which > is directly related to the target temperature. So the resulting power > computation will be wrong. > I totally agree, thanks for reminding me. Sustainable power is the maximum power available at the target temperature, so it must be updated when the trip point is changed. Sorry for missing this point. How about calling get_sustainable_power() to update the sustainable_power? Furthermore, when the sustainble_power() is changed, the pid constants tzp->k_* must be estimated again. In get_sustainble_power, it checks that the sustainable_power is updated, it will call the estimate_pid_constants() to renew the tzp->k_*. > The more I think about that, the more I do believe writable trip point > and IPA are incompatible. > > What about forbid that? > > For instance, add a set_trip callback instead of resetting in the > governor and return -EPERM from the IPA? > I've seen that you have sent a patch recently which adds the callback thermal_zone_trips_update(), is that what you said set_trip callback? > Lukasz ? > > > But if the trip temperature is increased, it must allow the maximum power, > > I would not say allow the maximum temperature but the new power head > room. But that needs an update of the sustainable power information > related to the new temperature :/ > > > otherwise, the Line 723 update flag might be false(see the false example > > in patch-v3), the power of cooling devices would be not allowed maximum > > for a while unless the current temperature is higher than the new > > switch_on_temp. In this situation, for cpufreq cooling devices, it means the > > cpu max_freq is limited, which is bad for device performance. The same for > > other cooling devices, if not reset the cooling state, it may introduce > > unexpected issues. > > > > 705 static int power_allocator_throttle(struct thermal_zone_device > > *tz, int trip_id) > > 706 { > > 707 struct power_allocator_params *params = tz->governor_data; > > 708 struct thermal_trip trip; > > 709 int ret; > > 710 bool update; > > 711 > > 712 //...... > > 721 ret = __thermal_zone_get_trip(tz, params->trip_switch_on, > > &trip); > > 722 if (!ret && (tz->temperature < trip.temperature)) { > > 723 update = (tz->last_temperature >= > > trip.temperature); > > 724 tz->passive = 0; > > 725 reset_pid_controller(params); > > 726 allow_maximum_power(tz, update); > > 727 return 0; > > 728 } > > 729 > > 730 tz->passive = 1; > > 731 //....... > > 740 } > > > >> You want maximum power only if the mitigation ends. > >> > > Yes, you're right, I agree. It's better to do like that, actually, > > patch-v3 is in line > > with this point I think, but it is not in the thermal core. > > > > Oh, I see! > > How about modified like that: > > static void power_allocator_reset(struct thermal_zone_device *tz) > > { > > struct power_allocator_params *params = tz->governor_data; > > + struct thermal_trip trip; > > + int ret; > > > > reset_pid_controller(params); > > + ret = __thermal_zone_get_trip(tz, params->trip_switch_on, &trip); > > + if (!ret && (tz->temperature < trip.temperature)) > > allow_maximum_power(tz, true); > > No, that would be wrong. It should be > (tz->temperature - hysteresis) < trip.temperature > > But if we are in the hysteresis area, we should allow a bit more power > which depends on the sustainable power which was not updated with the > trip temperature change :/ > As what I said before, maybe we can use the get_sustainable_power to update sustainable_power. > > } > > It seems clearer. > > > >>> +} > >>> + > >>> static struct thermal_governor thermal_gov_power_allocator = { > >>> .name = "power_allocator", > >>> .bind_to_tz = power_allocator_bind, > >>> .unbind_from_tz = power_allocator_unbind, > >>> .throttle = power_allocator_throttle, > >>> + .reset = power_allocator_reset, > >>> }; > >>> THERMAL_GOVERNOR_DECLARE(thermal_gov_power_allocator); > >>> diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c > >>> index 907f3a4d7bc8..13bbe029c6ab 100644 > >>> --- a/drivers/thermal/thermal_trip.c > >>> +++ b/drivers/thermal/thermal_trip.c > >>> @@ -173,6 +173,11 @@ int thermal_zone_set_trip(struct thermal_zone_device *tz, int trip_id, > >>> if (tz->trips && (t.temperature != trip->temperature || t.hysteresis != trip->hysteresis)) > >>> tz->trips[trip_id] = *trip; > >>> > >>> + /* Reset the governor only when the trip type is active or passive. */ > >>> + ret = (trip->type == THERMAL_TRIP_PASSIVE || trip->type == THERMAL_TRIP_ACTIVE); > >> > >> Actually we have the trip points: > >> > >> ACTIVE, PASSIVE, HOT and CRITICAL > >> > > Correct. > > > >> The last two ones should not be writable. > >> > >> Instead of this test, it would be cleaner to only make the ACTIVE and > >> PASSIVE trip point writable when the CONFIG_THERMAL_WRITABLE_TRIPS > >> option is set. Consequently, other trip points won't be writable and > >> this test can go away as set_trip will be protected by a RO sysfs file > >> property. > >> > >>> + if (ret && t.temperature != trip->temperature && tz->governor && tz->governor->reset) > >> > >> The temperature test is duplicated because it is already done in the > >> block before. > >> > > So here you mean two points: > > 1. Make the HOT and CRITICAL point RO so that it can no trip type test > > when gov->reset > > 2. not do the temperature test twice > > Right? > > HOT and CRITICAL can not be changed because RO. The core code can use > this assumption by not checking they are changed. > > > For the first point, it's OK for me. But what if someone wants to > > power off the device earlier > > or later? I think it should have the access to update the critical > > temperature. > > > > I'm not really > > know about the hot trip temperature. So I'm not sure the real reason > > to make them RO > > when the trip points are writable. I can't make it RO just because of > > the code simplification. > > That is critical firmware information and it must not be touched by the > user (even root). Imagine I can set 250°C for the critical trip point > while the silicon supports 120°C only? > > So basically, we should protect these trip points by making them RO. > > By fixing this, only PASSIVE and ACTIVE trip points can be changed, thus > checking CRITICAL and HOT above is pointless. > Got it. The firmware information should not be modified in software, so RO is proper. > > > For the second point, that's OK. I will simplify in the next patch > > version. Thank you, Daniel. > > Before sending a new version, IMO we should think a bit about writable > trip points in general because the feature does not look mature. > Yes, a slight move in one trip point may affect the situation as a whole, especially for thermal zones which are governed by IPA. We should think carefully. And thank you for your sincere comments:) > > >>> + tz->governor->reset(tz); > >>> + > >>> thermal_notify_tz_trip_change(tz->id, trip_id, trip->type, > >>> trip->temperature, trip->hysteresis); > >>> > >>> diff --git a/include/linux/thermal.h b/include/linux/thermal.h > >>> index 87837094d549..d27d053319bf 100644 > >>> --- a/include/linux/thermal.h > >>> +++ b/include/linux/thermal.h > >>> @@ -197,6 +197,8 @@ struct thermal_zone_device { > >>> * thermal zone. > >>> * @throttle: callback called for every trip point even if temperature is > >>> * below the trip point temperature > >>> + * @reset: callback called for governor reset > >>> + * > >>> * @governor_list: node in thermal_governor_list (in thermal_core.c) > >>> */ > >>> struct thermal_governor { > >>> @@ -204,6 +206,7 @@ struct thermal_governor { > >>> int (*bind_to_tz)(struct thermal_zone_device *tz); > >>> void (*unbind_from_tz)(struct thermal_zone_device *tz); > >>> int (*throttle)(struct thermal_zone_device *tz, int trip); > >>> + void (*reset)(struct thermal_zone_device *tz); > >>> struct list_head governor_list; > >>> }; > >>> > >> > >> -- > >> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs > >> > >> Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | > >> <http://twitter.com/#!/linaroorg> Twitter | > >> <http://www.linaro.org/linaro-blog/> Blog > >> > > > > Best regards, > > Di > > -- > <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs > > Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | > <http://twitter.com/#!/linaroorg> Twitter | > <http://www.linaro.org/linaro-blog/> Blog > Best regards, Di
Hi Di, On 11/07/2023 05:40, Di Shen wrote: [ ... ] >>>>> +static void power_allocator_reset(struct thermal_zone_device *tz) >>>>> +{ >>>>> + struct power_allocator_params *params = tz->governor_data; >>>>> + >>>>> + reset_pid_controller(params); >>>>> + allow_maximum_power(tz, true); >>>> >>>> Do you really want to allow the maximum power? What about if the trip >>>> temperature is decreased ? >>>> >>> If the trip temperature is decreased, allow_maximum_power will only >>> be executed once, and then the ipa governor will adapt to the lower trip >>> temperature and calculate the allocated power for cooling actors again. >>> Right? >> >> Sorry for jumping in this fifth version but I'm not sure about resetting >> the change is the right way (and probably, changing a trip point with >> the power allocator is not a good idea) >> >> The platforms where the IPA is planned to be used are creating a dummy >> trip point where the IPA begins the acquisition without cooling devices >> in order to have the math building the PID schema (eg. hi3660.dtsi). >> >> What about the sustainable power vs the trip point temperature? I mean >> we can change the trip temperature but not the sustainable power which >> is directly related to the target temperature. So the resulting power >> computation will be wrong. >> > I totally agree, thanks for reminding me. Sustainable power is the maximum > power available at the target temperature, so it must be updated when the trip > point is changed. Sorry for missing this point. How about calling > get_sustainable_power() to update the sustainable_power? Furthermore, when > the sustainble_power() is changed, the pid constants tzp->k_* must be estimated > again. In get_sustainble_power, it checks that the sustainable_power is updated, > it will call the estimate_pid_constants() to renew the tzp->k_*. Yes and the sustainable power can be set from userspace too. So here we have to distinguish what is related to the thermal setup and the thermal usage. Actually the thermal framework should protect the information from the firmware. It is not acceptable to have an user being able to change the trip points provided by the firmware. The writable trip point should allow only temperature changes below the ones given in the firmware. >> The more I think about that, the more I do believe writable trip point >> and IPA are incompatible. >> >> What about forbid that? >> >> For instance, add a set_trip callback instead of resetting in the >> governor and return -EPERM from the IPA? >> > I've seen that you have sent a patch recently which adds the callback > thermal_zone_trips_update(), is that what you said set_trip callback? Not exactly. Instead of adding a 'reset' callback, add a 'trip_update' (or whatever the name) callback. Then pass the trip point to the callback along with the thermal zone. int ipa_trip_update(struct thermal_zone_device *tz, struct thermal_trip *trip) { // Do more IPA crazy stuff or return -EPERM } >> Lukasz ? Lukasz? what do you think?
Hi Daniel, On 7/11/23 09:23, Daniel Lezcano wrote: > > Hi Di, > > On 11/07/2023 05:40, Di Shen wrote: > > [ ... ] > >>>>>> +static void power_allocator_reset(struct thermal_zone_device *tz) >>>>>> +{ >>>>>> + struct power_allocator_params *params = tz->governor_data; >>>>>> + >>>>>> + reset_pid_controller(params); >>>>>> + allow_maximum_power(tz, true); >>>>> >>>>> Do you really want to allow the maximum power? What about if the trip >>>>> temperature is decreased ? >>>>> >>>> If the trip temperature is decreased, allow_maximum_power will only >>>> be executed once, and then the ipa governor will adapt to the lower >>>> trip >>>> temperature and calculate the allocated power for cooling actors again. >>>> Right? >>> >>> Sorry for jumping in this fifth version but I'm not sure about resetting >>> the change is the right way (and probably, changing a trip point with >>> the power allocator is not a good idea) >>> >>> The platforms where the IPA is planned to be used are creating a dummy >>> trip point where the IPA begins the acquisition without cooling devices >>> in order to have the math building the PID schema (eg. hi3660.dtsi). >>> >>> What about the sustainable power vs the trip point temperature? I mean >>> we can change the trip temperature but not the sustainable power which >>> is directly related to the target temperature. So the resulting power >>> computation will be wrong. >>> >> I totally agree, thanks for reminding me. Sustainable power is the >> maximum >> power available at the target temperature, so it must be updated when >> the trip >> point is changed. Sorry for missing this point. How about calling >> get_sustainable_power() to update the sustainable_power? Furthermore, >> when >> the sustainble_power() is changed, the pid constants tzp->k_* must be >> estimated >> again. In get_sustainble_power, it checks that the sustainable_power >> is updated, >> it will call the estimate_pid_constants() to renew the tzp->k_*. > > Yes and the sustainable power can be set from userspace too. > > So here we have to distinguish what is related to the thermal setup and > the thermal usage. > > Actually the thermal framework should protect the information from the > firmware. It is not acceptable to have an user being able to change the > trip points provided by the firmware. > > The writable trip point should allow only temperature changes below the > ones given in the firmware. > >>> The more I think about that, the more I do believe writable trip point >>> and IPA are incompatible. >>> >>> What about forbid that? >>> >>> For instance, add a set_trip callback instead of resetting in the >>> governor and return -EPERM from the IPA? >>> >> I've seen that you have sent a patch recently which adds the callback >> thermal_zone_trips_update(), is that what you said set_trip callback? > > Not exactly. > > Instead of adding a 'reset' callback, add a 'trip_update' (or whatever > the name) callback. > > Then pass the trip point to the callback along with the thermal zone. > > int ipa_trip_update(struct thermal_zone_device *tz, > struct thermal_trip *trip) > { > // Do more IPA crazy stuff or return -EPERM > } > > >>> Lukasz ? > > Lukasz? what do you think? > > My apologies for delay, I was on 2-weeks vacation. I'll catch up and respond to those questions. Regards, Lukasz
Hi Lukasz, On 17/07/2023 12:07, Lukasz Luba wrote: [ ... ] >> int ipa_trip_update(struct thermal_zone_device *tz, >> struct thermal_trip *trip) >> { >> // Do more IPA crazy stuff or return -EPERM >> } >> >> >>>> Lukasz ? >> >> Lukasz? what do you think? >> >> > > My apologies for delay, I was on 2-weeks vacation. I'll catch up and > respond to those questions. No worries, I hope you enjoyed your vacations The main thing is to know you got the info ;) Thanks
Hi Lukasz, On Mon, Jul 17, 2023 at 6:06 PM Lukasz Luba <lukasz.luba@arm.com> wrote: > > Hi Daniel, > > > On 7/11/23 09:23, Daniel Lezcano wrote: > > > > Hi Di, > > > > On 11/07/2023 05:40, Di Shen wrote: > > > > [ ... ] > > > >>>>>> +static void power_allocator_reset(struct thermal_zone_device *tz) > >>>>>> +{ > >>>>>> + struct power_allocator_params *params = tz->governor_data; > >>>>>> + > >>>>>> + reset_pid_controller(params); > >>>>>> + allow_maximum_power(tz, true); > >>>>> > >>>>> Do you really want to allow the maximum power? What about if the trip > >>>>> temperature is decreased ? > >>>>> > >>>> If the trip temperature is decreased, allow_maximum_power will only > >>>> be executed once, and then the ipa governor will adapt to the lower > >>>> trip > >>>> temperature and calculate the allocated power for cooling actors again. > >>>> Right? > >>> > >>> Sorry for jumping in this fifth version but I'm not sure about resetting > >>> the change is the right way (and probably, changing a trip point with > >>> the power allocator is not a good idea) > >>> > >>> The platforms where the IPA is planned to be used are creating a dummy > >>> trip point where the IPA begins the acquisition without cooling devices > >>> in order to have the math building the PID schema (eg. hi3660.dtsi). > >>> > >>> What about the sustainable power vs the trip point temperature? I mean > >>> we can change the trip temperature but not the sustainable power which > >>> is directly related to the target temperature. So the resulting power > >>> computation will be wrong. > >>> > >> I totally agree, thanks for reminding me. Sustainable power is the > >> maximum > >> power available at the target temperature, so it must be updated when > >> the trip > >> point is changed. Sorry for missing this point. How about calling > >> get_sustainable_power() to update the sustainable_power? Furthermore, > >> when > >> the sustainble_power() is changed, the pid constants tzp->k_* must be > >> estimated > >> again. In get_sustainble_power, it checks that the sustainable_power > >> is updated, > >> it will call the estimate_pid_constants() to renew the tzp->k_*. > > > > Yes and the sustainable power can be set from userspace too. > > > > So here we have to distinguish what is related to the thermal setup and > > the thermal usage. > > > > Actually the thermal framework should protect the information from the > > firmware. It is not acceptable to have an user being able to change the > > trip points provided by the firmware. > > > > The writable trip point should allow only temperature changes below the > > ones given in the firmware. > > > >>> The more I think about that, the more I do believe writable trip point > >>> and IPA are incompatible. > >>> > >>> What about forbid that? > >>> > >>> For instance, add a set_trip callback instead of resetting in the > >>> governor and return -EPERM from the IPA? > >>> > >> I've seen that you have sent a patch recently which adds the callback > >> thermal_zone_trips_update(), is that what you said set_trip callback? > > > > Not exactly. > > > > Instead of adding a 'reset' callback, add a 'trip_update' (or whatever > > the name) callback. > > > > Then pass the trip point to the callback along with the thermal zone. > > > > int ipa_trip_update(struct thermal_zone_device *tz, > > struct thermal_trip *trip) > > { > > // Do more IPA crazy stuff or return -EPERM > > } > > > > > >>> Lukasz ? > > > > Lukasz? what do you think? > > > > > > My apologies for delay, I was on 2-weeks vacation. I'll catch up and > respond to those questions. > > Regards, > Lukasz What do you think about the points that Daniel has mentioned? Any comments would be very appreciated, hope to hear from you, thank you. Best regards, Di
diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c index 8642f1096b91..8d17a10671e4 100644 --- a/drivers/thermal/gov_power_allocator.c +++ b/drivers/thermal/gov_power_allocator.c @@ -729,10 +729,19 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip_id) return allocate_power(tz, trip.temperature); } +static void power_allocator_reset(struct thermal_zone_device *tz) +{ + struct power_allocator_params *params = tz->governor_data; + + reset_pid_controller(params); + allow_maximum_power(tz, true); +} + static struct thermal_governor thermal_gov_power_allocator = { .name = "power_allocator", .bind_to_tz = power_allocator_bind, .unbind_from_tz = power_allocator_unbind, .throttle = power_allocator_throttle, + .reset = power_allocator_reset, }; THERMAL_GOVERNOR_DECLARE(thermal_gov_power_allocator); diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c index 907f3a4d7bc8..13bbe029c6ab 100644 --- a/drivers/thermal/thermal_trip.c +++ b/drivers/thermal/thermal_trip.c @@ -173,6 +173,11 @@ int thermal_zone_set_trip(struct thermal_zone_device *tz, int trip_id, if (tz->trips && (t.temperature != trip->temperature || t.hysteresis != trip->hysteresis)) tz->trips[trip_id] = *trip; + /* Reset the governor only when the trip type is active or passive. */ + ret = (trip->type == THERMAL_TRIP_PASSIVE || trip->type == THERMAL_TRIP_ACTIVE); + if (ret && t.temperature != trip->temperature && tz->governor && tz->governor->reset) + tz->governor->reset(tz); + thermal_notify_tz_trip_change(tz->id, trip_id, trip->type, trip->temperature, trip->hysteresis); diff --git a/include/linux/thermal.h b/include/linux/thermal.h index 87837094d549..d27d053319bf 100644 --- a/include/linux/thermal.h +++ b/include/linux/thermal.h @@ -197,6 +197,8 @@ struct thermal_zone_device { * thermal zone. * @throttle: callback called for every trip point even if temperature is * below the trip point temperature + * @reset: callback called for governor reset + * * @governor_list: node in thermal_governor_list (in thermal_core.c) */ struct thermal_governor { @@ -204,6 +206,7 @@ struct thermal_governor { int (*bind_to_tz)(struct thermal_zone_device *tz); void (*unbind_from_tz)(struct thermal_zone_device *tz); int (*throttle)(struct thermal_zone_device *tz, int trip); + void (*reset)(struct thermal_zone_device *tz); struct list_head governor_list; };