Message ID | Y1WHnJ6h1RSOipV4@e120937-lin |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp93171wru; Sun, 23 Oct 2022 11:28:53 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4BVVLptG8g68lbW+d2GlqlSJvDofQraMed6ChF1Cinn1gZoyGf2H0koxC9AQPPji++VOdS X-Received: by 2002:a63:87c2:0:b0:46e:9bab:3064 with SMTP id i185-20020a6387c2000000b0046e9bab3064mr16441437pge.255.1666549733308; Sun, 23 Oct 2022 11:28:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666549733; cv=none; d=google.com; s=arc-20160816; b=gwj6Y1vTg6rkkwmbJk7sSYbnxPFph1EJYGnR1IC0mVdxM1c9gb+ajMGPw54YD420Ud ySsBZzh92WCubLWPGsVqLIDU8eLHGMZ9fa1JyyI2C+uZB4sS98nWzUWvfYJ+2HI32vw4 Pt+xUsGkwIjbpQDdBy+HtEpGPWpBS7heVvGXbZvlVSTSPATtrnoSI755gMtqIeuZwcsi pOZR0DTmOnTAzIuyDOy3kARXVXybKcJrtlFlrfLVzjr56j/o0hfzP7xInHmgMbSf5bkr BJNovSbOzyqccCX0kmPT5aeQ3JReSpoZnA/VFCJTJQt5LCZxnDViFnWwa+kd0dB5g7Wq wKNw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-disposition:mime-version:message-id :subject:cc:to:from:date; bh=2Zq+hzfKI3w+F5GQCbXLoeqCgexTGWFCCMt2S9CV/T4=; b=z+TKJOApxwgppxeYqiWzbys9Z+Sct5gWBYXO1zv5HVgjFDXZwz8MsS1S34+q8/R1LT j13+5Zzrkcmq72SRs4qNmAk4PH/bDq3QDA3aaiFRjU3qS9KIxfU7/bDT9vKAgrAi3UFh LC3BM4bnrdpo9uN8eJTJhTlGxN/oc4pJxBYAJW4QJpi+d1KV13ry4MoqDKfJM8Nwqaka puleWp/M13DplNKIstqLvLMka0vDlG221UL7EGFmawTuE6Y9hn3mM1Aad546KpYQcm0h b6VEK0SvDIosFLc46Xil7dryPXHsrbems3ZZ3LvffmWQVZ6lcx8/BEHTn9BmaMTc2fLi DSxA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s16-20020a056a00195000b0056341219532si34648819pfk.89.2022.10.23.11.28.40; Sun, 23 Oct 2022 11:28:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230113AbiJWS2G (ORCPT <rfc822;pwkd43@gmail.com> + 99 others); Sun, 23 Oct 2022 14:28:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56226 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230038AbiJWS2B (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Sun, 23 Oct 2022 14:28:01 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 8E1904DF2D; Sun, 23 Oct 2022 11:27:48 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 39C30ED1; Sun, 23 Oct 2022 11:27:53 -0700 (PDT) Received: from e120937-lin (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 27B733F7B4; Sun, 23 Oct 2022 11:27:46 -0700 (PDT) Date: Sun, 23 Oct 2022 19:27:40 +0100 From: Cristian Marussi <cristian.marussi@arm.com> To: daniel.lezcano@linaro.org, linux@roeck-us.net Cc: sudeep.holla@arm.com, linux-hwmon@vger.kernel.org, linux-kernel@vger.kernel.org Subject: (bug report) HWMON & Thermal interactions Message-ID: <Y1WHnJ6h1RSOipV4@e120937-lin> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747504053098933202?= X-GMAIL-MSGID: =?utf-8?q?1747504053098933202?= |
Series |
(bug report) HWMON & Thermal interactions
|
|
Commit Message
Cristian Marussi
Oct. 23, 2022, 6:27 p.m. UTC
Hi,
Starting with v6.1-rc1 the SCMI HWMON driver failed probing on my JUNO due
to the fact that no trip points were (ever !) defined in the DT; bisecting it
looks like that after:
https://lore.kernel.org/all/20220804224349.1926752-28-daniel.lezcano@linexp.org/
the presence of the mandatory trips node within thermal zones is now
enforced.
So, this is NOT what this bug report is about (I'll post soon patches for
the JUNO DT missing trips) BUT once this problem was solved in the DT,
another issue appeared:
[ 1.921929] hwmon hwmon0: temp2_input not attached to any thermal zone
that despite having now a goodi/valid DT describing 2 sensors and 2 thermal zones
embedding that sensors, only the first one is found as belonging to one ThermZ.
(this happens ALSO with v6.0 once I added the trips...)
Digging deep into this, it turned out that inside the call chain
devm_hwmon_device_register_with_info
hwmon_device_register_with_info
__hwmon_device_register
hwmon_thermal_register_sensors(dev)
--> hwmon_thermal_add_sensor(dev, j)
--> devm_thermal_of_zone_register(dev, sensor_id, tdata, )
the HWMON channel index j is passed to the Thermal framework in order to
search and bind sensors with defined thermal zone, but this lead to the
assumption that sequential HWMON channel indexes corresponds one-to-one to the
underlying real sensor IDs that the ThermalFramework uses for matching
within the DT.
On a system like my SCMI-based DT where I have 2 temp-sensors bound to 2
thermal zones like:
thernal_zones {
pmic {
...
thermal-sensors = <&scmi_sensors0 0>;
...
trips {
...
}
soc {
...
thermal-sensors = <&scmi_sensors0 3>;
...
trips {
...
}
}
}
This works fine by chance for the pmic (j=0, sensor_id=0) BUT cannot work for
the soc where J=1 BUT the real sensor ID is 3.
Note that there can be a number of sensors, not all of them of a type handled
by HWMON, and enumerated by SCMI in different ways depending on the
platform.
I suppose this is not an SCMI-only related issue, but maybe in non-SCMI
context, where sensors are purely defined in the DT, the solution can be
more easily attained (i.e. renumber the sensors).
At first I tried to solve this inside scmi-hwmon.c BUT I could not find
a way to present to the HWMON subsystem the list of sensors preserving
the above index/sensor_id matching (not even with a hack like passing
down dummy sensors to the HWMON subsystem to fill the 'holes' in the
numbering)
My tentative solution, which works fine for me in my context, was to add
an optional HWMON hwops, so that the core hwmon can retrieve if needed the
real sensor ID if different from the channel index (using an optional hwops
instead of some static hwinfo var let me avoid to have to patch all the
existent hwmon drivers that happens to just work fine as of today...but
maybe it is not necessarily the proper final solution...)
i.e.
----8<----
Author: Cristian Marussi <cristian.marussi@arm.com>
Date: Fri Oct 21 17:24:04 2022 +0100
hwmon: Add new .get_sensor_id hwops
Add a new optional helper which can be defined to allow an hwmon chip to
provide the logic to map hwmon indexes to the real underlying sensor IDs.
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
----->8----
... plus obviously the related scmi-hwmon.c patch to make use of this.
Any thought ? Am I missing something ?
(not really an expert on both subsystems really ... :P)
Thanks,
Cristian
Comments
On 10/23/22 11:27, Cristian Marussi wrote: > Hi, > > Starting with v6.1-rc1 the SCMI HWMON driver failed probing on my JUNO due > to the fact that no trip points were (ever !) defined in the DT; bisecting it > looks like that after: > > https://lore.kernel.org/all/20220804224349.1926752-28-daniel.lezcano@linexp.org/ > > the presence of the mandatory trips node within thermal zones is now > enforced. > > So, this is NOT what this bug report is about (I'll post soon patches for > the JUNO DT missing trips) BUT once this problem was solved in the DT, > another issue appeared: > > [ 1.921929] hwmon hwmon0: temp2_input not attached to any thermal zone > > that despite having now a goodi/valid DT describing 2 sensors and 2 thermal zones > embedding that sensors, only the first one is found as belonging to one ThermZ. > (this happens ALSO with v6.0 once I added the trips...) > > Digging deep into this, it turned out that inside the call chain > > devm_hwmon_device_register_with_info > hwmon_device_register_with_info > __hwmon_device_register > hwmon_thermal_register_sensors(dev) > --> hwmon_thermal_add_sensor(dev, j) > --> devm_thermal_of_zone_register(dev, sensor_id, tdata, ) > > the HWMON channel index j is passed to the Thermal framework in order to > search and bind sensors with defined thermal zone, but this lead to the > assumption that sequential HWMON channel indexes corresponds one-to-one to the > underlying real sensor IDs that the ThermalFramework uses for matching > within the DT. > > On a system like my SCMI-based DT where I have 2 temp-sensors bound to 2 > thermal zones like: > > thernal_zones { > pmic { > ... > thermal-sensors = <&scmi_sensors0 0>; > ... > trips { > ... > } > soc { > ... > thermal-sensors = <&scmi_sensors0 3>; > ... > trips { > ... > } > } > } > > This works fine by chance for the pmic (j=0, sensor_id=0) BUT cannot work for > the soc where J=1 BUT the real sensor ID is 3. > > Note that there can be a number of sensors, not all of them of a type handled > by HWMON, and enumerated by SCMI in different ways depending on the > platform. > > I suppose this is not an SCMI-only related issue, but maybe in non-SCMI > context, where sensors are purely defined in the DT, the solution can be > more easily attained (i.e. renumber the sensors). > > At first I tried to solve this inside scmi-hwmon.c BUT I could not find > a way to present to the HWMON subsystem the list of sensors preserving > the above index/sensor_id matching (not even with a hack like passing > down dummy sensors to the HWMON subsystem to fill the 'holes' in the > numbering) > > My tentative solution, which works fine for me in my context, was to add > an optional HWMON hwops, so that the core hwmon can retrieve if needed the > real sensor ID if different from the channel index (using an optional hwops > instead of some static hwinfo var let me avoid to have to patch all the > existent hwmon drivers that happens to just work fine as of today...but > maybe it is not necessarily the proper final solution...) > > i.e. > > ----8<---- > > Author: Cristian Marussi <cristian.marussi@arm.com> > Date: Fri Oct 21 17:24:04 2022 +0100 > > hwmon: Add new .get_sensor_id hwops > > Add a new optional helper which can be defined to allow an hwmon chip to > provide the logic to map hwmon indexes to the real underlying sensor IDs. > Maybe I am missing something, but ... The driver isn't supposed to know anything about thermal devices and thermal zones. If that no longer works, and drivers have to know about thermal zones and thermal zone device index values anyway, we might as well pull thermal device support from the hwmon core and implement it in drivers. Guenter
Linux regression tracking (Thorsten Leemhuis)
Oct. 24, 2022, 10:18 a.m. UTC |
#2
Addressed
Unaddressed
[Note: this mail is primarily send for documentation purposes and/or for regzbot, my Linux kernel regression tracking bot. That's why I removed most or all folks from the list of recipients, but left any that looked like a mailing lists. These mails usually contain '#forregzbot' in the subject, to make them easy to spot and filter out.] [TLDR: I'm adding this regression report to the list of tracked regressions; all text from me you find below is based on a few templates paragraphs you might have encountered already already in similar form.] Hi, this is your Linux kernel regression tracker. CCing the regression mailing list, as it should be in the loop for all regressions, as explained here: https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html On 23.10.22 20:27, Cristian Marussi wrote: > > Starting with v6.1-rc1 the SCMI HWMON driver failed probing on my JUNO due > to the fact that no trip points were (ever !) defined in the DT; bisecting it > looks like that after: Thanks for the report. To be sure below issue doesn't fall through the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression tracking bot: #regzbot ^introduced e51813313 #regzbot title SCMI HWMON driver failed probing on JUNO #regzbot ignore-activity This isn't a regression? This issue or a fix for it are already discussed somewhere else? It was fixed already? You want to clarify when the regression started to happen? Or point out I got the title or something else totally wrong? Then just reply -- ideally with also telling regzbot about it, as explained here: https://linux-regtracking.leemhuis.info/tracked-regression/ Reminder for developers: When fixing the issue, add 'Link:' tags pointing to the report (the mail this one replies to), as explained for in the Linux kernel's documentation; above webpage explains why this is important for tracked regressions. Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight. > https://lore.kernel.org/all/20220804224349.1926752-28-daniel.lezcano@linexp.org/ > > the presence of the mandatory trips node within thermal zones is now > enforced. > > So, this is NOT what this bug report is about (I'll post soon patches for > the JUNO DT missing trips) BUT once this problem was solved in the DT, > another issue appeared: > > [ 1.921929] hwmon hwmon0: temp2_input not attached to any thermal zone > > that despite having now a goodi/valid DT describing 2 sensors and 2 thermal zones > embedding that sensors, only the first one is found as belonging to one ThermZ. > (this happens ALSO with v6.0 once I added the trips...) > > Digging deep into this, it turned out that inside the call chain > > devm_hwmon_device_register_with_info > hwmon_device_register_with_info > __hwmon_device_register > hwmon_thermal_register_sensors(dev) > --> hwmon_thermal_add_sensor(dev, j) > --> devm_thermal_of_zone_register(dev, sensor_id, tdata, ) > > the HWMON channel index j is passed to the Thermal framework in order to > search and bind sensors with defined thermal zone, but this lead to the > assumption that sequential HWMON channel indexes corresponds one-to-one to the > underlying real sensor IDs that the ThermalFramework uses for matching > within the DT. > > On a system like my SCMI-based DT where I have 2 temp-sensors bound to 2 > thermal zones like: > > thernal_zones { > pmic { > ... > thermal-sensors = <&scmi_sensors0 0>; > ... > trips { > ... > } > soc { > ... > thermal-sensors = <&scmi_sensors0 3>; > ... > trips { > ... > } > } > } > > This works fine by chance for the pmic (j=0, sensor_id=0) BUT cannot work for > the soc where J=1 BUT the real sensor ID is 3. > > Note that there can be a number of sensors, not all of them of a type handled > by HWMON, and enumerated by SCMI in different ways depending on the > platform. > > I suppose this is not an SCMI-only related issue, but maybe in non-SCMI > context, where sensors are purely defined in the DT, the solution can be > more easily attained (i.e. renumber the sensors). > > At first I tried to solve this inside scmi-hwmon.c BUT I could not find > a way to present to the HWMON subsystem the list of sensors preserving > the above index/sensor_id matching (not even with a hack like passing > down dummy sensors to the HWMON subsystem to fill the 'holes' in the > numbering) > > My tentative solution, which works fine for me in my context, was to add > an optional HWMON hwops, so that the core hwmon can retrieve if needed the > real sensor ID if different from the channel index (using an optional hwops > instead of some static hwinfo var let me avoid to have to patch all the > existent hwmon drivers that happens to just work fine as of today...but > maybe it is not necessarily the proper final solution...) > > i.e. > > ----8<---- > > Author: Cristian Marussi <cristian.marussi@arm.com> > Date: Fri Oct 21 17:24:04 2022 +0100 > > hwmon: Add new .get_sensor_id hwops > > Add a new optional helper which can be defined to allow an hwmon chip to > provide the logic to map hwmon indexes to the real underlying sensor IDs. > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> > > diff --git a/drivers/hwmon/hwmon.c b/drivers/hwmon/hwmon.c > index 4218750d5a66..45d3d5070cde 100644 > --- a/drivers/hwmon/hwmon.c > +++ b/drivers/hwmon/hwmon.c > @@ -213,7 +213,8 @@ static void hwmon_thermal_remove_sensor(void *data) > list_del(data); > } > > -static int hwmon_thermal_add_sensor(struct device *dev, int index) > +static int hwmon_thermal_add_sensor(struct device *dev, int index, > 7+ unsigned int sensor_id) > { > struct hwmon_device *hwdev = to_hwmon_device(dev); > struct hwmon_thermal_data *tdata; > @@ -227,7 +228,7 @@ static int hwmon_thermal_add_sensor(struct device *dev, int index) > tdata->dev = dev; > tdata->index = index; > > - tzd = devm_thermal_of_zone_register(dev, index, tdata, > + tzd = devm_thermal_of_zone_register(dev, sensor_id, tdata, > &hwmon_thermal_ops); > if (IS_ERR(tzd)) { > if (PTR_ERR(tzd) != -ENODEV) > @@ -264,13 +265,18 @@ static int hwmon_thermal_register_sensors(struct device *dev) > > for (j = 0; info[i]->config[j]; j++) { > int err; > + unsigned int id; > > if (!(info[i]->config[j] & HWMON_T_INPUT) || > !chip->ops->is_visible(drvdata, hwmon_temp, > hwmon_temp_input, j)) > continue; > > - err = hwmon_thermal_add_sensor(dev, j); > + id = !chip->ops->get_sensor_id ? j : > + chip->ops->get_sensor_id(drvdata, > + hwmon_temp, j); > + > + err = hwmon_thermal_add_sensor(dev, j, id); > if (err) > return err; > } > diff --git a/include/linux/hwmon.h b/include/linux/hwmon.h > index 14325f93c6b2..e5dbab83f4d1 100644 > --- a/include/linux/hwmon.h > +++ b/include/linux/hwmon.h > @@ -396,6 +396,9 @@ enum hwmon_intrusion_attributes { > struct hwmon_ops { > umode_t (*is_visible)(const void *drvdata, enum hwmon_sensor_types type, > u32 attr, int channel); > + unsigned int (*get_sensor_id)(const void *drvdata, > + enum hwmon_sensor_types type, > + int channel); > int (*read)(struct device *dev, enum hwmon_sensor_types type, > u32 attr, int channel, long *val); > int (*read_string)(struct device *dev, enum hwmon_sensor_types type, > > > ----->8---- > > > ... plus obviously the related scmi-hwmon.c patch to make use of this. > > Any thought ? Am I missing something ? > (not really an expert on both subsystems really ... :P) > > Thanks, > Cristian >
On 10/23/22 14:23, Guenter Roeck wrote: > On 10/23/22 11:27, Cristian Marussi wrote: >> Hi, >> >> Starting with v6.1-rc1 the SCMI HWMON driver failed probing on my JUNO due >> to the fact that no trip points were (ever !) defined in the DT; bisecting it >> looks like that after: >> >> https://lore.kernel.org/all/20220804224349.1926752-28-daniel.lezcano@linexp.org/ >> >> the presence of the mandatory trips node within thermal zones is now >> enforced. >> >> So, this is NOT what this bug report is about (I'll post soon patches for >> the JUNO DT missing trips) BUT once this problem was solved in the DT, >> another issue appeared: >> >> [ 1.921929] hwmon hwmon0: temp2_input not attached to any thermal zone >> >> that despite having now a goodi/valid DT describing 2 sensors and 2 thermal zones >> embedding that sensors, only the first one is found as belonging to one ThermZ. >> (this happens ALSO with v6.0 once I added the trips...) >> >> Digging deep into this, it turned out that inside the call chain >> >> devm_hwmon_device_register_with_info >> hwmon_device_register_with_info >> __hwmon_device_register >> hwmon_thermal_register_sensors(dev) >> --> hwmon_thermal_add_sensor(dev, j) >> --> devm_thermal_of_zone_register(dev, sensor_id, tdata, ) >> >> the HWMON channel index j is passed to the Thermal framework in order to >> search and bind sensors with defined thermal zone, but this lead to the >> assumption that sequential HWMON channel indexes corresponds one-to-one to the >> underlying real sensor IDs that the ThermalFramework uses for matching >> within the DT. >> >> On a system like my SCMI-based DT where I have 2 temp-sensors bound to 2 >> thermal zones like: >> >> thernal_zones { >> pmic { >> ... >> thermal-sensors = <&scmi_sensors0 0>; >> ... >> trips { >> ... >> } >> soc { >> ... >> thermal-sensors = <&scmi_sensors0 3>; >> ... >> trips { >> ... >> } >> } >> } >> >> This works fine by chance for the pmic (j=0, sensor_id=0) BUT cannot work for >> the soc where J=1 BUT the real sensor ID is 3. >> >> Note that there can be a number of sensors, not all of them of a type handled >> by HWMON, and enumerated by SCMI in different ways depending on the >> platform. >> >> I suppose this is not an SCMI-only related issue, but maybe in non-SCMI >> context, where sensors are purely defined in the DT, the solution can be >> more easily attained (i.e. renumber the sensors). >> >> At first I tried to solve this inside scmi-hwmon.c BUT I could not find >> a way to present to the HWMON subsystem the list of sensors preserving >> the above index/sensor_id matching (not even with a hack like passing >> down dummy sensors to the HWMON subsystem to fill the 'holes' in the >> numbering) >> >> My tentative solution, which works fine for me in my context, was to add >> an optional HWMON hwops, so that the core hwmon can retrieve if needed the >> real sensor ID if different from the channel index (using an optional hwops >> instead of some static hwinfo var let me avoid to have to patch all the >> existent hwmon drivers that happens to just work fine as of today...but >> maybe it is not necessarily the proper final solution...) >> >> i.e. >> >> ----8<---- >> >> Author: Cristian Marussi <cristian.marussi@arm.com> >> Date: Fri Oct 21 17:24:04 2022 +0100 >> >> hwmon: Add new .get_sensor_id hwops >> Add a new optional helper which can be defined to allow an hwmon chip to >> provide the logic to map hwmon indexes to the real underlying sensor IDs. > > Maybe I am missing something, but ... > > The driver isn't supposed to know anything about thermal devices and > thermal zones. If that no longer works, and drivers have to know about > thermal zones and thermal zone device index values anyway, we might > as well pull thermal device support from the hwmon core and implement > it in drivers. > No, wait: The question is really: Why does the scmi driver present the sensor with index 3 to the hwmon subsystem as sensor with index 1 ? If the sensor has index 3, and is presented to other entities as sensor with index 3, it should be presented to the hwmon subsystem as sensor with index 3, not with index 1. If sensors with index 1..2 do not exist, the is_visible function should return 0 for those sensors. Guenter > Guenter >
On Mon, Oct 24, 2022 at 04:56:43AM -0700, Guenter Roeck wrote: > On 10/23/22 14:23, Guenter Roeck wrote: > > On 10/23/22 11:27, Cristian Marussi wrote: > > > Hi, > > > > > > Starting with v6.1-rc1 the SCMI HWMON driver failed probing on my JUNO due > > > to the fact that no trip points were (ever !) defined in the DT; bisecting it > > > looks like that after: > > > > > > https://lore.kernel.org/all/20220804224349.1926752-28-daniel.lezcano@linexp.org/ > > > > > > the presence of the mandatory trips node within thermal zones is now > > > enforced. > > > > > > So, this is NOT what this bug report is about (I'll post soon patches for > > > the JUNO DT missing trips) BUT once this problem was solved in the DT, > > > another issue appeared: > > > > > > [ 1.921929] hwmon hwmon0: temp2_input not attached to any thermal zone > > > > > > that despite having now a goodi/valid DT describing 2 sensors and 2 thermal zones > > > embedding that sensors, only the first one is found as belonging to one ThermZ. > > > (this happens ALSO with v6.0 once I added the trips...) > > > > > > Digging deep into this, it turned out that inside the call chain > > > > > > devm_hwmon_device_register_with_info > > > hwmon_device_register_with_info > > > __hwmon_device_register > > > hwmon_thermal_register_sensors(dev) > > > --> hwmon_thermal_add_sensor(dev, j) > > > --> devm_thermal_of_zone_register(dev, sensor_id, tdata, ) > > > > > > the HWMON channel index j is passed to the Thermal framework in order to > > > search and bind sensors with defined thermal zone, but this lead to the > > > assumption that sequential HWMON channel indexes corresponds one-to-one to the > > > underlying real sensor IDs that the ThermalFramework uses for matching > > > within the DT. > > > > > > On a system like my SCMI-based DT where I have 2 temp-sensors bound to 2 > > > thermal zones like: > > > > > > thernal_zones { > > > pmic { > > > ... > > > thermal-sensors = <&scmi_sensors0 0>; > > > ... > > > trips { > > > ... > > > } > > > soc { > > > ... > > > thermal-sensors = <&scmi_sensors0 3>; > > > ... > > > trips { > > > ... > > > } > > > } > > > } > > > > > > This works fine by chance for the pmic (j=0, sensor_id=0) BUT cannot work for > > > the soc where J=1 BUT the real sensor ID is 3. > > > > > > Note that there can be a number of sensors, not all of them of a type handled > > > by HWMON, and enumerated by SCMI in different ways depending on the > > > platform. > > > > > > I suppose this is not an SCMI-only related issue, but maybe in non-SCMI > > > context, where sensors are purely defined in the DT, the solution can be > > > more easily attained (i.e. renumber the sensors). > > > > > > At first I tried to solve this inside scmi-hwmon.c BUT I could not find > > > a way to present to the HWMON subsystem the list of sensors preserving > > > the above index/sensor_id matching (not even with a hack like passing > > > down dummy sensors to the HWMON subsystem to fill the 'holes' in the > > > numbering) > > > > > > My tentative solution, which works fine for me in my context, was to add > > > an optional HWMON hwops, so that the core hwmon can retrieve if needed the > > > real sensor ID if different from the channel index (using an optional hwops > > > instead of some static hwinfo var let me avoid to have to patch all the > > > existent hwmon drivers that happens to just work fine as of today...but > > > maybe it is not necessarily the proper final solution...) > > > > > > i.e. > > > > > > ----8<---- > > > > > > Author: Cristian Marussi <cristian.marussi@arm.com> > > > Date: Fri Oct 21 17:24:04 2022 +0100 > > > > > > hwmon: Add new .get_sensor_id hwops > > > Add a new optional helper which can be defined to allow an hwmon chip to > > > provide the logic to map hwmon indexes to the real underlying sensor IDs. > > > > Maybe I am missing something, but ... > > > > The driver isn't supposed to know anything about thermal devices and > > thermal zones. If that no longer works, and drivers have to know about > > thermal zones and thermal zone device index values anyway, we might > > as well pull thermal device support from the hwmon core and implement > > it in drivers. > > > > No, wait: The question is really: Why does the scmi driver present the sensor > with index 3 to the hwmon subsystem as sensor with index 1 ? > > If the sensor has index 3, and is presented to other entities as sensor > with index 3, it should be presented to the hwmon subsystem as sensor with > index 3, not with index 1. If sensors with index 1..2 do not exist, > the is_visible function should return 0 for those sensors. > My understanding was that the hwmon index is the index of the channel and hwmon_channel_info struct groups channels by type while the index is really used as a pointer in the hwmon_channel_info.config field, so in this case you're saying I should present 4 temp sensors placing a 'hole' at sensor 1,2 making is_visible return 0 for those channels ? Basically keeping the channel indexes in sync with the real sensor ID by the means of some dummy sensor entries in the config field: this could result potentially in a lot of holes given in SCMI the sensor_id is 16 bits and I thought that was too hackish but I can try. In the meantime, I gave it a go at what you suggested early (if I got it right...) by removing from the scmi-hwmon driver the HWMON_C_REGISTER_TZ attribute and adding a few explicit calls to devm_thermal_of_zone_register() at the end of the probe to specifically register the needed temp sensors (and associated real sensor IDs) with the ThermalFramework without relying on the HWMON core for Thermal and it works fine indeed. Thanks, Cristian
On 10/24/22 05:47, Cristian Marussi wrote: > On Mon, Oct 24, 2022 at 04:56:43AM -0700, Guenter Roeck wrote: >> On 10/23/22 14:23, Guenter Roeck wrote: >>> On 10/23/22 11:27, Cristian Marussi wrote: >>>> Hi, >>>> >>>> Starting with v6.1-rc1 the SCMI HWMON driver failed probing on my JUNO due >>>> to the fact that no trip points were (ever !) defined in the DT; bisecting it >>>> looks like that after: >>>> >>>> https://lore.kernel.org/all/20220804224349.1926752-28-daniel.lezcano@linexp.org/ >>>> >>>> the presence of the mandatory trips node within thermal zones is now >>>> enforced. >>>> >>>> So, this is NOT what this bug report is about (I'll post soon patches for >>>> the JUNO DT missing trips) BUT once this problem was solved in the DT, >>>> another issue appeared: >>>> >>>> [ 1.921929] hwmon hwmon0: temp2_input not attached to any thermal zone >>>> >>>> that despite having now a goodi/valid DT describing 2 sensors and 2 thermal zones >>>> embedding that sensors, only the first one is found as belonging to one ThermZ. >>>> (this happens ALSO with v6.0 once I added the trips...) >>>> >>>> Digging deep into this, it turned out that inside the call chain >>>> >>>> devm_hwmon_device_register_with_info >>>> hwmon_device_register_with_info >>>> __hwmon_device_register >>>> hwmon_thermal_register_sensors(dev) >>>> --> hwmon_thermal_add_sensor(dev, j) >>>> --> devm_thermal_of_zone_register(dev, sensor_id, tdata, ) >>>> >>>> the HWMON channel index j is passed to the Thermal framework in order to >>>> search and bind sensors with defined thermal zone, but this lead to the >>>> assumption that sequential HWMON channel indexes corresponds one-to-one to the >>>> underlying real sensor IDs that the ThermalFramework uses for matching >>>> within the DT. >>>> >>>> On a system like my SCMI-based DT where I have 2 temp-sensors bound to 2 >>>> thermal zones like: >>>> >>>> thernal_zones { >>>> pmic { >>>> ... >>>> thermal-sensors = <&scmi_sensors0 0>; >>>> ... >>>> trips { >>>> ... >>>> } >>>> soc { >>>> ... >>>> thermal-sensors = <&scmi_sensors0 3>; >>>> ... >>>> trips { >>>> ... >>>> } >>>> } >>>> } >>>> >>>> This works fine by chance for the pmic (j=0, sensor_id=0) BUT cannot work for >>>> the soc where J=1 BUT the real sensor ID is 3. >>>> >>>> Note that there can be a number of sensors, not all of them of a type handled >>>> by HWMON, and enumerated by SCMI in different ways depending on the >>>> platform. >>>> >>>> I suppose this is not an SCMI-only related issue, but maybe in non-SCMI >>>> context, where sensors are purely defined in the DT, the solution can be >>>> more easily attained (i.e. renumber the sensors). >>>> >>>> At first I tried to solve this inside scmi-hwmon.c BUT I could not find >>>> a way to present to the HWMON subsystem the list of sensors preserving >>>> the above index/sensor_id matching (not even with a hack like passing >>>> down dummy sensors to the HWMON subsystem to fill the 'holes' in the >>>> numbering) >>>> >>>> My tentative solution, which works fine for me in my context, was to add >>>> an optional HWMON hwops, so that the core hwmon can retrieve if needed the >>>> real sensor ID if different from the channel index (using an optional hwops >>>> instead of some static hwinfo var let me avoid to have to patch all the >>>> existent hwmon drivers that happens to just work fine as of today...but >>>> maybe it is not necessarily the proper final solution...) >>>> >>>> i.e. >>>> >>>> ----8<---- >>>> >>>> Author: Cristian Marussi <cristian.marussi@arm.com> >>>> Date: Fri Oct 21 17:24:04 2022 +0100 >>>> >>>> hwmon: Add new .get_sensor_id hwops >>>> Add a new optional helper which can be defined to allow an hwmon chip to >>>> provide the logic to map hwmon indexes to the real underlying sensor IDs. >>> >>> Maybe I am missing something, but ... >>> >>> The driver isn't supposed to know anything about thermal devices and >>> thermal zones. If that no longer works, and drivers have to know about >>> thermal zones and thermal zone device index values anyway, we might >>> as well pull thermal device support from the hwmon core and implement >>> it in drivers. >>> >> >> No, wait: The question is really: Why does the scmi driver present the sensor >> with index 3 to the hwmon subsystem as sensor with index 1 ? >> >> If the sensor has index 3, and is presented to other entities as sensor >> with index 3, it should be presented to the hwmon subsystem as sensor with >> index 3, not with index 1. If sensors with index 1..2 do not exist, >> the is_visible function should return 0 for those sensors. >> > > My understanding was that the hwmon index is the index of the channel > and hwmon_channel_info struct groups channels by type while the index is > really used as a pointer in the hwmon_channel_info.config field, so in > this case you're saying I should present 4 temp sensors placing a 'hole' > at sensor 1,2 making is_visible return 0 for those channels ? > > Basically keeping the channel indexes in sync with the real sensor ID by > the means of some dummy sensor entries in the config field: this could result > potentially in a lot of holes given in SCMI the sensor_id is 16 bits and > I thought that was too hackish but I can try. > The underlying idea with the hwmon -> thermal bridge is that index values used by thermal and by the hwmon subsystem match and, yes, that there would if necessary be holes in hwmon index values (normally this is not a 16-bit number space). If that doesn't work for scmi, and if there could indeed be something like thermal-sensors = <&scmi_sensors0 12345>; then I think the solution is indeed to not rely on the hwmon->thermal bridge in the hwmon core for this driver. > In the meantime, I gave it a go at what you suggested early (if I got it > right...) by removing from the scmi-hwmon driver the HWMON_C_REGISTER_TZ > attribute and adding a few explicit calls to devm_thermal_of_zone_register() at > the end of the probe to specifically register the needed temp sensors (and > associated real sensor IDs) with the ThermalFramework without relying on the > HWMON core for Thermal and it works fine indeed. > Excellent. Thanks, Guenter
On Mon, Oct 24, 2022 at 07:51:09AM -0700, Guenter Roeck wrote: > On 10/24/22 05:47, Cristian Marussi wrote: > > On Mon, Oct 24, 2022 at 04:56:43AM -0700, Guenter Roeck wrote: > > > On 10/23/22 14:23, Guenter Roeck wrote: > > > > On 10/23/22 11:27, Cristian Marussi wrote: > > > > > Hi, > > > > > > > > > > Starting with v6.1-rc1 the SCMI HWMON driver failed probing on my JUNO due > > > > > to the fact that no trip points were (ever !) defined in the DT; bisecting it > > > > > looks like that after: > > > > > > > > > > https://lore.kernel.org/all/20220804224349.1926752-28-daniel.lezcano@linexp.org/ > > > > > > > > > > the presence of the mandatory trips node within thermal zones is now > > > > > enforced. > > > > > > > > > > So, this is NOT what this bug report is about (I'll post soon patches for > > > > > the JUNO DT missing trips) BUT once this problem was solved in the DT, > > > > > another issue appeared: > > > > > > > > > > [ 1.921929] hwmon hwmon0: temp2_input not attached to any thermal zone > > > > > > > > > > that despite having now a goodi/valid DT describing 2 sensors and 2 thermal zones > > > > > embedding that sensors, only the first one is found as belonging to one ThermZ. > > > > > (this happens ALSO with v6.0 once I added the trips...) > > > > > > > > > > Digging deep into this, it turned out that inside the call chain > > > > > > > > > > devm_hwmon_device_register_with_info > > > > > hwmon_device_register_with_info > > > > > __hwmon_device_register > > > > > hwmon_thermal_register_sensors(dev) > > > > > --> hwmon_thermal_add_sensor(dev, j) > > > > > --> devm_thermal_of_zone_register(dev, sensor_id, tdata, ) > > > > > > > > > > the HWMON channel index j is passed to the Thermal framework in order to > > > > > search and bind sensors with defined thermal zone, but this lead to the > > > > > assumption that sequential HWMON channel indexes corresponds one-to-one to the > > > > > underlying real sensor IDs that the ThermalFramework uses for matching > > > > > within the DT. > > > > > > > > > > On a system like my SCMI-based DT where I have 2 temp-sensors bound to 2 > > > > > thermal zones like: > > > > > > > > > > thernal_zones { > > > > > pmic { > > > > > ... > > > > > thermal-sensors = <&scmi_sensors0 0>; > > > > > ... > > > > > trips { > > > > > ... > > > > > } > > > > > soc { > > > > > ... > > > > > thermal-sensors = <&scmi_sensors0 3>; > > > > > ... > > > > > trips { > > > > > ... > > > > > } > > > > > } > > > > > } > > > > > > > > > > This works fine by chance for the pmic (j=0, sensor_id=0) BUT cannot work for > > > > > the soc where J=1 BUT the real sensor ID is 3. > > > > > > > > > > Note that there can be a number of sensors, not all of them of a type handled > > > > > by HWMON, and enumerated by SCMI in different ways depending on the > > > > > platform. > > > > > > > > > > I suppose this is not an SCMI-only related issue, but maybe in non-SCMI > > > > > context, where sensors are purely defined in the DT, the solution can be > > > > > more easily attained (i.e. renumber the sensors). > > > > > > > > > > At first I tried to solve this inside scmi-hwmon.c BUT I could not find > > > > > a way to present to the HWMON subsystem the list of sensors preserving > > > > > the above index/sensor_id matching (not even with a hack like passing > > > > > down dummy sensors to the HWMON subsystem to fill the 'holes' in the > > > > > numbering) > > > > > > > > > > My tentative solution, which works fine for me in my context, was to add > > > > > an optional HWMON hwops, so that the core hwmon can retrieve if needed the > > > > > real sensor ID if different from the channel index (using an optional hwops > > > > > instead of some static hwinfo var let me avoid to have to patch all the > > > > > existent hwmon drivers that happens to just work fine as of today...but > > > > > maybe it is not necessarily the proper final solution...) > > > > > > > > > > i.e. > > > > > > > > > > ----8<---- > > > > > > > > > > Author: Cristian Marussi <cristian.marussi@arm.com> > > > > > Date: Fri Oct 21 17:24:04 2022 +0100 > > > > > > > > > > hwmon: Add new .get_sensor_id hwops > > > > > Add a new optional helper which can be defined to allow an hwmon chip to > > > > > provide the logic to map hwmon indexes to the real underlying sensor IDs. > > > > > > > > Maybe I am missing something, but ... > > > > > > > > The driver isn't supposed to know anything about thermal devices and > > > > thermal zones. If that no longer works, and drivers have to know about > > > > thermal zones and thermal zone device index values anyway, we might > > > > as well pull thermal device support from the hwmon core and implement > > > > it in drivers. > > > > > > > > > > No, wait: The question is really: Why does the scmi driver present the sensor > > > with index 3 to the hwmon subsystem as sensor with index 1 ? > > > > > > If the sensor has index 3, and is presented to other entities as sensor > > > with index 3, it should be presented to the hwmon subsystem as sensor with > > > index 3, not with index 1. If sensors with index 1..2 do not exist, > > > the is_visible function should return 0 for those sensors. > > > > > > > My understanding was that the hwmon index is the index of the channel > > and hwmon_channel_info struct groups channels by type while the index is > > really used as a pointer in the hwmon_channel_info.config field, so in > > this case you're saying I should present 4 temp sensors placing a 'hole' > > at sensor 1,2 making is_visible return 0 for those channels ? > > > > Basically keeping the channel indexes in sync with the real sensor ID by > > the means of some dummy sensor entries in the config field: this could result > > potentially in a lot of holes given in SCMI the sensor_id is 16 bits and > > I thought that was too hackish but I can try. > > > > The underlying idea with the hwmon -> thermal bridge is that index values > used by thermal and by the hwmon subsystem match and, yes, that there would > if necessary be holes in hwmon index values (normally this is not a 16-bit > number space). If that doesn't work for scmi, and if there could indeed be > something like > > thermal-sensors = <&scmi_sensors0 12345>; > > then I think the solution is indeed to not rely on the hwmon->thermal bridge > in the hwmon core for this driver. Even though implausible it could be possible to have an SCMI fw platform advertising such high sensor IDs. > > > In the meantime, I gave it a go at what you suggested early (if I got it > > right...) by removing from the scmi-hwmon driver the HWMON_C_REGISTER_TZ > > attribute and adding a few explicit calls to devm_thermal_of_zone_register() at > > the end of the probe to specifically register the needed temp sensors (and > > associated real sensor IDs) with the ThermalFramework without relying on the > > HWMON core for Thermal and it works fine indeed. > > > > Excellent. > I'll follow this path. Thanks for your help & feedback. Cristian
Linux regression tracking (Thorsten Leemhuis)
Nov. 4, 2022, 10:32 a.m. UTC |
#7
Addressed
Unaddressed
On 24.10.22 12:18, Thorsten Leemhuis wrote: > On 23.10.22 20:27, Cristian Marussi wrote: >> >> Starting with v6.1-rc1 the SCMI HWMON driver failed probing on my JUNO due >> to the fact that no trip points were (ever !) defined in the DT; bisecting it >> looks like that after: > > Thanks for the report. To be sure below issue doesn't fall through the > cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression > tracking bot: > > #regzbot ^introduced e51813313 > #regzbot title SCMI HWMON driver failed probing on JUNO > #regzbot ignore-activity #regzbot fixed-by: c4a7b9b587ca
diff --git a/drivers/hwmon/hwmon.c b/drivers/hwmon/hwmon.c index 4218750d5a66..45d3d5070cde 100644 --- a/drivers/hwmon/hwmon.c +++ b/drivers/hwmon/hwmon.c @@ -213,7 +213,8 @@ static void hwmon_thermal_remove_sensor(void *data) list_del(data); } -static int hwmon_thermal_add_sensor(struct device *dev, int index) +static int hwmon_thermal_add_sensor(struct device *dev, int index, 7+ unsigned int sensor_id) { struct hwmon_device *hwdev = to_hwmon_device(dev); struct hwmon_thermal_data *tdata; @@ -227,7 +228,7 @@ static int hwmon_thermal_add_sensor(struct device *dev, int index) tdata->dev = dev; tdata->index = index; - tzd = devm_thermal_of_zone_register(dev, index, tdata, + tzd = devm_thermal_of_zone_register(dev, sensor_id, tdata, &hwmon_thermal_ops); if (IS_ERR(tzd)) { if (PTR_ERR(tzd) != -ENODEV) @@ -264,13 +265,18 @@ static int hwmon_thermal_register_sensors(struct device *dev) for (j = 0; info[i]->config[j]; j++) { int err; + unsigned int id; if (!(info[i]->config[j] & HWMON_T_INPUT) || !chip->ops->is_visible(drvdata, hwmon_temp, hwmon_temp_input, j)) continue; - err = hwmon_thermal_add_sensor(dev, j); + id = !chip->ops->get_sensor_id ? j : + chip->ops->get_sensor_id(drvdata, + hwmon_temp, j); + + err = hwmon_thermal_add_sensor(dev, j, id); if (err) return err; } diff --git a/include/linux/hwmon.h b/include/linux/hwmon.h index 14325f93c6b2..e5dbab83f4d1 100644 --- a/include/linux/hwmon.h +++ b/include/linux/hwmon.h @@ -396,6 +396,9 @@ enum hwmon_intrusion_attributes { struct hwmon_ops { umode_t (*is_visible)(const void *drvdata, enum hwmon_sensor_types type, u32 attr, int channel); + unsigned int (*get_sensor_id)(const void *drvdata, + enum hwmon_sensor_types type, + int channel); int (*read)(struct device *dev, enum hwmon_sensor_types type, u32 attr, int channel, long *val); int (*read_string)(struct device *dev, enum hwmon_sensor_types type,