[v8,5/5] thermal: mediatek: try again if first temp read is bogus

Message ID 20221018-up-i350-thermal-bringup-v8-5-23e8fbb08837@baylibre.com
State New
Series thermal: mediatek: Add support for MT8365 SoC

Commit Message

Amjad Ouled-Ameur Jan. 25, 2023, 9:50 a.m. UTC
In mtk_thermal_bank_temperature(), return -EAGAIN instead of 0 when a
sensor read yields a bogus value, as the first read often does.

Signed-off-by: Michael Kao <michael.kao@mediatek.com>
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
---
 drivers/thermal/mtk_thermal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Daniel Lezcano Jan. 25, 2023, 10:58 a.m. UTC | #1
On 25/01/2023 10:50, Amjad Ouled-Ameur wrote:
> In mtk_thermal_bank_temperature, return -EAGAIN instead of 0
> on the first read of sensor that often are bogus values.
> 
> Signed-off-by: Michael Kao <michael.kao@mediatek.com>
> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
> Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com>
> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> ---
>   drivers/thermal/mtk_thermal.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/mtk_thermal.c b/drivers/thermal/mtk_thermal.c
> index b8e06f6c7c42..e7be450cd40a 100644
> --- a/drivers/thermal/mtk_thermal.c
> +++ b/drivers/thermal/mtk_thermal.c
> @@ -736,7 +736,7 @@ static int mtk_thermal_bank_temperature(struct mtk_thermal_bank *bank)
>   		 * not immediately shut down.
>   		 */
>   		if (temp > 200000)
> -			temp = 0;
> +			temp = -EAGAIN;

Did you try to add a delay between the bank init and the thermal zone 
device register (e.g. 1 ms)?

Maybe the HW did not have time to initialize and capture a temperature 
before thermal_zone_device_register() is called (this one calls get_temp)?

>   		if (temp > max)
>   			max = temp;
>
  
Amjad Ouled-Ameur Jan. 27, 2023, 10:50 a.m. UTC | #2
Hi Daniel,

On 1/25/23 11:58, Daniel Lezcano wrote:
> On 25/01/2023 10:50, Amjad Ouled-Ameur wrote:
>> In mtk_thermal_bank_temperature, return -EAGAIN instead of 0
>> on the first read of sensor that often are bogus values.
>>
>> Signed-off-by: Michael Kao <michael.kao@mediatek.com>
>> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
>> Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com>
>> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>> ---
>>   drivers/thermal/mtk_thermal.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/thermal/mtk_thermal.c b/drivers/thermal/mtk_thermal.c
>> index b8e06f6c7c42..e7be450cd40a 100644
>> --- a/drivers/thermal/mtk_thermal.c
>> +++ b/drivers/thermal/mtk_thermal.c
>> @@ -736,7 +736,7 @@ static int mtk_thermal_bank_temperature(struct mtk_thermal_bank *bank)
>>            * not immediately shut down.
>>            */
>>           if (temp > 200000)
>> -            temp = 0;
>> +            temp = -EAGAIN;
>
> Did you try to add a delay between the bank init and the thermal zone device register (e.g. 1 ms)?
>
> Maybe the HW did not have time to initialize and capture a temperature before thermal_zone_device_register() is called (this one calls get_temp)?

A delay of 29 ms actually fixed the issue, thanks for the suggestion. I can send a V9 with this improvement.

Is there anything else to fix perhaps ?


Regards,

Amjad

>
>>           if (temp > max)
>>               max = temp;
>>
>
  
Daniel Lezcano Jan. 27, 2023, 11:54 a.m. UTC | #3
Hi Amjad,

On 27/01/2023 11:50, Amjad Ouled-Ameur wrote:
> Hi Daniel,
> 
> On 1/25/23 11:58, Daniel Lezcano wrote:
>> On 25/01/2023 10:50, Amjad Ouled-Ameur wrote:
>>> In mtk_thermal_bank_temperature, return -EAGAIN instead of 0
>>> on the first read of sensor that often are bogus values.
>>>
>>> Signed-off-by: Michael Kao <michael.kao@mediatek.com>
>>> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
>>> Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com>
>>> Reviewed-by: AngeloGioacchino Del Regno 
>>> <angelogioacchino.delregno@collabora.com>
>>> ---
>>>   drivers/thermal/mtk_thermal.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/thermal/mtk_thermal.c 
>>> b/drivers/thermal/mtk_thermal.c
>>> index b8e06f6c7c42..e7be450cd40a 100644
>>> --- a/drivers/thermal/mtk_thermal.c
>>> +++ b/drivers/thermal/mtk_thermal.c
>>> @@ -736,7 +736,7 @@ static int mtk_thermal_bank_temperature(struct 
>>> mtk_thermal_bank *bank)
>>>            * not immediately shut down.
>>>            */
>>>           if (temp > 200000)
>>> -            temp = 0;
>>> +            temp = -EAGAIN;
>>
>> Did you try to add a delay between the bank init and the thermal zone 
>> device register (e.g. 1 ms)?
>>
>> Maybe the HW did not have time to initialize and capture a 
>> temperature before thermal_zone_device_register() is called (this one 
>> calls get_temp)?
> 
> A delay of 29 ms actually fixed the issue, thanks for the suggestion. I 
> can send a V9 with this improvement.

I'm glad that helped.

Will you remove the "if (temp > 200000)" test ?

> Is there anything else to fix perhaps ?

Not in your changes

Thanks

   -- Daniel
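
For illustration, the delay discussed in this thread can be modelled in
userspace C. This is only a sketch under stated assumptions, not driver code:
struct fake_sensor, its 20 ms settle time and the 250000 bogus reading are
hypothetical. Only the 200000 threshold and the 29 ms delay come from the
thread. In the kernel the wait would presumably be a sleep such as msleep()
placed between the bank init and thermal_zone_device_register().

```c
#include <errno.h>

#define BOGUS_LIMIT_MC 200000	/* threshold from the patch, in millidegrees C */

/* Hypothetical sensor model: reads are bogus until settle_ms has elapsed. */
struct fake_sensor {
	int now_ms;	/* simulated time since bank init */
	int settle_ms;	/* assumed time until the first valid conversion */
	int real_mc;	/* value reported once the sensor has settled */
};

/* Simulated sleep: only advances the model's clock. */
static void fake_sleep_ms(struct fake_sensor *s, int ms)
{
	s->now_ms += ms;
}

/* Raw read: implausibly high until the sensor has settled. */
static int fake_read_mc(const struct fake_sensor *s)
{
	return s->now_ms >= s->settle_ms ? s->real_mc : 250000;
}

/* Wait delay_ms after "init", then read once, filtering bogus values. */
static int read_after_delay(struct fake_sensor *s, int delay_ms)
{
	int temp;

	fake_sleep_ms(s, delay_ms);
	temp = fake_read_mc(s);

	return temp > BOGUS_LIMIT_MC ? -EAGAIN : temp;
}
```

In this model a read with no delay is filtered to -EAGAIN, while a read after
29 ms returns the settled value. If the hardware always gets that settle time,
the "temp > 200000" filter is never hit on the first read, which is the point
behind the question of whether that test can be removed.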
  

Patch

diff --git a/drivers/thermal/mtk_thermal.c b/drivers/thermal/mtk_thermal.c
index b8e06f6c7c42..e7be450cd40a 100644
--- a/drivers/thermal/mtk_thermal.c
+++ b/drivers/thermal/mtk_thermal.c
@@ -736,7 +736,7 @@ static int mtk_thermal_bank_temperature(struct mtk_thermal_bank *bank)
 		 * not immediately shut down.
 		 */
 		if (temp > 200000)
-			temp = 0;
+			temp = -EAGAIN;
 
 		if (temp > max)
 			max = temp;
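
To make the effect of the one-line change easier to see, here is a minimal
userspace sketch of the filter-then-max logic visible in the hunk. It is an
approximation, not the driver function: bank_temperature() and its array
argument are hypothetical stand-ins, and the loop around the hunk and the
INT_MIN starting value are assumptions inferred from the context lines.

```c
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical model of the per-bank loop: take the maximum over all
 * sensor readings (millidegrees C), replacing implausibly high
 * readings with -EAGAIN as the patch does.
 */
static int bank_temperature(const int *readings_mc, int num_sensors)
{
	int max = INT_MIN;	/* assumed initial value */
	int i;

	for (i = 0; i < num_sensors; i++) {
		int temp = readings_mc[i];

		/* Bogus first read: flag it instead of reporting 0. */
		if (temp > 200000)
			temp = -EAGAIN;

		if (temp > max)
			max = temp;
	}

	return max;
}
```

In this model, if every reading in a bank is bogus the result is -EAGAIN, so a
caller can tell "no valid sample yet" apart from a genuine 0 reading. If at
least one sensor reports a plausible non-negative value, that value still wins
the max comparison and is returned.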