Message ID | Y5sWMEG0xCl9bgEi@tpad |
---|---|
State | New |
Headers |
Date: Thu, 15 Dec 2022 09:42:24 -0300 From: Marcelo Tosatti <mtosatti@redhat.com> To: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-kernel@vger.kernel.org, linux-hwmon@vger.kernel.org, Frederic Weisbecker <frederic@kernel.org> Subject: [PATCH] hwmon: coretemp: avoid RDMSR interruptions to isolated CPUs Message-ID: <Y5sWMEG0xCl9bgEi@tpad> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline List-ID: <linux-kernel.vger.kernel.org> |
Series |
hwmon: coretemp: avoid RDMSR interruptions to isolated CPUs
|
|
Commit Message
Marcelo Tosatti
Dec. 15, 2022, 12:42 p.m. UTC
The coretemp driver uses rdmsr_on_cpu calls to read
MSR_IA32_PACKAGE_THERM_STATUS/MSR_IA32_THERM_STATUS registers,
which contain information about current core temperature.

For certain low latency applications, the RDMSR interruption exceeds
the applications' requirements.

So disable reading of crit_alarm and temp files via /sys, in case
CPU isolation is enabled.

Temperature information from the housekeeping cores should be
sufficient to infer die temperature.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
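
For readers unfamiliar with the registers named above, here is a minimal userspace sketch of how the status value is turned into a temperature. It mirrors the expression in show_temp() quoted later in this thread, `tjmax - ((eax >> 16) & 0x7f) * 1000`. The function name decode_temp_mc and the TjMax of 100 degrees C used in the note below are assumptions for illustration only; this is not driver code and performs no RDMSR.

```c
#include <stdint.h>

/*
 * Sketch of the conversion done in show_temp(), for illustration only.
 * Bits 22:16 of the low word of the status MSR hold the digital readout,
 * which counts degrees Celsius below TjMax. Both the TjMax argument and
 * the result are in millidegrees Celsius.
 */
static int decode_temp_mc(uint32_t eax, int tjmax_mc)
{
	int readout = (eax >> 16) & 0x7f;	/* degrees below TjMax */

	return tjmax_mc - readout * 1000;
}
```

With an assumed TjMax of 100000 (100 degrees C) and a readout of 40 in bits 22:16, decode_temp_mc(40u << 16, 100000) yields 60000, i.e. 60 degrees C.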
Comments
On 12/15/22 04:42, Marcelo Tosatti wrote:
>
> The coretemp driver uses rdmsr_on_cpu calls to read
> MSR_IA32_PACKAGE_THERM_STATUS/MSR_IA32_THERM_STATUS registers,
> which contain information about current core temperature.
>
> For certain low latency applications, the RDMSR interruption exceeds
> the applications requirements.
>
> So disable reading of crit_alarm and temp files via /sys, in case
> CPU isolation is enabled.
>

That isn't really what the code is doing. It doesn't disable reading
the attributes, it returns an error when an attempt is made to read them.

> Temperature information from the housekeeping cores should be
> sufficient to infer die temperature.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 9bee4d33fbdf..30a35f4130d5 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -27,6 +27,7 @@
>  #include <asm/msr.h>
>  #include <asm/processor.h>
>  #include <asm/cpu_device_id.h>
> +#include <linux/sched/isolation.h>
>
>  #define DRVNAME	"coretemp"
>
> @@ -121,6 +122,10 @@ static ssize_t show_crit_alarm(struct device *dev,
>  	struct platform_data *pdata = dev_get_drvdata(dev);
>  	struct temp_data *tdata = pdata->core_data[attr->index];
>
> +
> +	if (!housekeeping_cpu(tdata->cpu, HK_TYPE_MISC))
> +		return -EINVAL;

Littering the output of the "sensors" command with errors is most
definitely wrong and not acceptable. On top of that, the user didn't
do anything wrong, so -EINVAL ("Invalid Argument") is definitely the
wrong error.

Maybe return -ENODATA, or if the condition is static just don't
instantiate the attribute for the affected CPUs to start with.

Also, this warrants a comment in the code and an explanation in
Documentation/hwmon/coretemp.rst.

Guenter
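
The first alternative suggested in the review above (return -ENODATA rather than -EINVAL) could look roughly like the following userspace sketch. It only models the control flow of a sysfs show callback. The names is_housekeeping, show_temp_mock and hk_mask are hypothetical stand-ins, with a plain bitmask taking the place of housekeeping_cpu(); none of this is taken from a later revision of the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for housekeeping_cpu(): bit n of hk_mask set
 * means CPU n is a housekeeping (non-isolated) CPU.
 */
static bool is_housekeeping(unsigned int cpu, unsigned long hk_mask)
{
	return cpu < 8 * sizeof(hk_mask) && ((hk_mask >> cpu) & 1UL);
}

/*
 * Mock of a show callback: returns the number of bytes written to buf,
 * or a negative errno. No data is available for an isolated CPU because
 * reading it would require interrupting that CPU.
 */
static int show_temp_mock(unsigned int cpu, unsigned long hk_mask,
			  int temp_mc, char *buf, size_t len)
{
	if (!is_housekeeping(cpu, hk_mask))
		return -ENODATA;

	return snprintf(buf, len, "%d\n", temp_mc);
}
```

With hk_mask = 0x3 (CPUs 0 and 1 are housekeeping), a read for CPU 0 formats the value while a read for CPU 2 returns -ENODATA. The second alternative, not creating the attribute at all for isolated CPUs, avoids the error path entirely but only fits if the isolation mask cannot change after the driver has registered its attributes.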
Hi Marcelo,

https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Marcelo-Tosatti/hwmon-coretemp-avoid-RDMSR-interruptions-to-isolated-CPUs/20221215-204904
patch link:    https://lore.kernel.org/r/Y5sWMEG0xCl9bgEi%40tpad
patch subject: [PATCH] hwmon: coretemp: avoid RDMSR interruptions to isolated CPUs
config: i386-randconfig-m021
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <error27@gmail.com>

smatch warnings:
drivers/hwmon/coretemp.c:181 show_temp() warn: inconsistent returns '&tdata->update_lock'.

vim +181 drivers/hwmon/coretemp.c

199e0de7f5df31 Durgadoss R     2011-05-20  154  static ssize_t show_temp(struct device *dev,
199e0de7f5df31 Durgadoss R     2011-05-20  155  			struct device_attribute *devattr, char *buf)
199e0de7f5df31 Durgadoss R     2011-05-20  156  {
bebe467823c0d8 Rudolf Marek    2007-05-08  157  	u32 eax, edx;
199e0de7f5df31 Durgadoss R     2011-05-20  158  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
199e0de7f5df31 Durgadoss R     2011-05-20  159  	struct platform_data *pdata = dev_get_drvdata(dev);
199e0de7f5df31 Durgadoss R     2011-05-20  160  	struct temp_data *tdata = pdata->core_data[attr->index];
199e0de7f5df31 Durgadoss R     2011-05-20  161  
199e0de7f5df31 Durgadoss R     2011-05-20  162  	mutex_lock(&tdata->update_lock);
bebe467823c0d8 Rudolf Marek    2007-05-08  163  
199e0de7f5df31 Durgadoss R     2011-05-20  164  	/* Check whether the time interval has elapsed */
199e0de7f5df31 Durgadoss R     2011-05-20  165  	if (!tdata->valid || time_after(jiffies, tdata->last_updated + HZ)) {
e78264610cd902 Marcelo Tosatti 2022-12-15  166  		if (!housekeeping_cpu(tdata->cpu, HK_TYPE_MISC))
e78264610cd902 Marcelo Tosatti 2022-12-15  167  			return -EINVAL;

mutex_unlock(&tdata->update_lock);

199e0de7f5df31 Durgadoss R     2011-05-20  168  		rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
bf6ea084ebb54c Guenter Roeck   2013-11-20  169  		/*
bf6ea084ebb54c Guenter Roeck   2013-11-20  170  		 * Ignore the valid bit. In all observed cases the register
bf6ea084ebb54c Guenter Roeck   2013-11-20  171  		 * value is either low or zero if the valid bit is 0.
bf6ea084ebb54c Guenter Roeck   2013-11-20  172  		 * Return it instead of reporting an error which doesn't
bf6ea084ebb54c Guenter Roeck   2013-11-20  173  		 * really help at all.
bf6ea084ebb54c Guenter Roeck   2013-11-20  174  		 */
bf6ea084ebb54c Guenter Roeck   2013-11-20  175  		tdata->temp = tdata->tjmax - ((eax >> 16) & 0x7f) * 1000;
952a11ca32a604 Paul Fertser    2021-09-24  176  		tdata->valid = true;
199e0de7f5df31 Durgadoss R     2011-05-20  177  		tdata->last_updated = jiffies;
bebe467823c0d8 Rudolf Marek    2007-05-08  178  	}
bebe467823c0d8 Rudolf Marek    2007-05-08  179  
199e0de7f5df31 Durgadoss R     2011-05-20  180  	mutex_unlock(&tdata->update_lock);
bf6ea084ebb54c Guenter Roeck   2013-11-20 @181  	return sprintf(buf, "%d\n", tdata->temp);
bebe467823c0d8 Rudolf Marek    2007-05-08  182  }
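
The smatch warning above flags the early return at line 167, which leaves update_lock held. One conventional way to keep the lock balanced is a single exit path, sketched below in userspace C. The mock lock and the names mock_mutex, temp_data_mock and read_temp_locked are hypothetical and exist only so the example is self-contained. The error code is kept as -EINVAL, as in the posted patch, to isolate the locking question. The thread does not show how v3 actually resolved this.

```c
#include <errno.h>
#include <stdbool.h>

/* Minimal mock lock standing in for struct mutex; depth 0 means unlocked. */
struct mock_mutex {
	int depth;
};

static void mock_lock(struct mock_mutex *m)   { m->depth++; }
static void mock_unlock(struct mock_mutex *m) { m->depth--; }

struct temp_data_mock {
	struct mock_mutex update_lock;
	bool housekeeping;	/* hypothetical: result of the isolation check */
	int temp;		/* cached temperature, millidegrees Celsius */
};

/*
 * Returns the temperature or a negative errno. Every path, including
 * the early error, goes through out_unlock, so the lock is always
 * released before returning.
 */
static int read_temp_locked(struct temp_data_mock *t)
{
	int ret;

	mock_lock(&t->update_lock);

	if (!t->housekeeping) {
		ret = -EINVAL;
		goto out_unlock;
	}
	ret = t->temp;

out_unlock:
	mock_unlock(&t->update_lock);
	return ret;
}
```

An equally simple alternative is to perform the isolation check before taking the lock, as the posted patch already does in show_crit_alarm().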
On Fri, Dec 23, 2022 at 01:48:14PM +0300, Dan Carpenter wrote:
> Hi Marcelo,
>
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Marcelo-Tosatti/hwmon-coretemp-avoid-RDMSR-interruptions-to-isolated-CPUs/20221215-204904
> patch link:    https://lore.kernel.org/r/Y5sWMEG0xCl9bgEi%40tpad
> patch subject: [PATCH] hwmon: coretemp: avoid RDMSR interruptions to isolated CPUs
> config: i386-randconfig-m021
> compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
>
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <lkp@intel.com>
> | Reported-by: Dan Carpenter <error27@gmail.com>
>
> smatch warnings:
> drivers/hwmon/coretemp.c:181 show_temp() warn: inconsistent returns '&tdata->update_lock'.
>
> vim +181 drivers/hwmon/coretemp.c
>
> 199e0de7f5df31 Durgadoss R     2011-05-20  154  static ssize_t show_temp(struct device *dev,
> 199e0de7f5df31 Durgadoss R     2011-05-20  155  			struct device_attribute *devattr, char *buf)
> 199e0de7f5df31 Durgadoss R     2011-05-20  156  {
> bebe467823c0d8 Rudolf Marek    2007-05-08  157  	u32 eax, edx;
> 199e0de7f5df31 Durgadoss R     2011-05-20  158  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
> 199e0de7f5df31 Durgadoss R     2011-05-20  159  	struct platform_data *pdata = dev_get_drvdata(dev);
> 199e0de7f5df31 Durgadoss R     2011-05-20  160  	struct temp_data *tdata = pdata->core_data[attr->index];
> 199e0de7f5df31 Durgadoss R     2011-05-20  161  
> 199e0de7f5df31 Durgadoss R     2011-05-20  162  	mutex_lock(&tdata->update_lock);
> bebe467823c0d8 Rudolf Marek    2007-05-08  163  
> 199e0de7f5df31 Durgadoss R     2011-05-20  164  	/* Check whether the time interval has elapsed */
> 199e0de7f5df31 Durgadoss R     2011-05-20  165  	if (!tdata->valid || time_after(jiffies, tdata->last_updated + HZ)) {
> e78264610cd902 Marcelo Tosatti 2022-12-15  166  		if (!housekeeping_cpu(tdata->cpu, HK_TYPE_MISC))
> e78264610cd902 Marcelo Tosatti 2022-12-15  167  			return -EINVAL;
>
> mutex_unlock(&tdata->update_lock);
>
> 199e0de7f5df31 Durgadoss R     2011-05-20  168  		rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
> bf6ea084ebb54c Guenter Roeck   2013-11-20  169  		/*
> bf6ea084ebb54c Guenter Roeck   2013-11-20  170  		 * Ignore the valid bit. In all observed cases the register
> bf6ea084ebb54c Guenter Roeck   2013-11-20  171  		 * value is either low or zero if the valid bit is 0.
> bf6ea084ebb54c Guenter Roeck   2013-11-20  172  		 * Return it instead of reporting an error which doesn't
> bf6ea084ebb54c Guenter Roeck   2013-11-20  173  		 * really help at all.
> bf6ea084ebb54c Guenter Roeck   2013-11-20  174  		 */
> bf6ea084ebb54c Guenter Roeck   2013-11-20  175  		tdata->temp = tdata->tjmax - ((eax >> 16) & 0x7f) * 1000;
> 952a11ca32a604 Paul Fertser    2021-09-24  176  		tdata->valid = true;
> 199e0de7f5df31 Durgadoss R     2011-05-20  177  		tdata->last_updated = jiffies;
> bebe467823c0d8 Rudolf Marek    2007-05-08  178  	}
> bebe467823c0d8 Rudolf Marek    2007-05-08  179  
> 199e0de7f5df31 Durgadoss R     2011-05-20  180  	mutex_unlock(&tdata->update_lock);
> bf6ea084ebb54c Guenter Roeck   2013-11-20 @181  	return sprintf(buf, "%d\n", tdata->temp);
> bebe467823c0d8 Rudolf Marek    2007-05-08  182  }
>
> --
> 0-DAY CI Kernel Test Service
> https://01.org/lkp

Thanks, v3 of the patch should not suffer from this issue.
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 9bee4d33fbdf..30a35f4130d5 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -27,6 +27,7 @@
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/cpu_device_id.h>
+#include <linux/sched/isolation.h>
 
 #define DRVNAME	"coretemp"
 
@@ -121,6 +122,10 @@ static ssize_t show_crit_alarm(struct device *dev,
 	struct platform_data *pdata = dev_get_drvdata(dev);
 	struct temp_data *tdata = pdata->core_data[attr->index];
 
+
+	if (!housekeeping_cpu(tdata->cpu, HK_TYPE_MISC))
+		return -EINVAL;
+
 	mutex_lock(&tdata->update_lock);
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 	mutex_unlock(&tdata->update_lock);
@@ -158,6 +163,8 @@ static ssize_t show_temp(struct device *dev,
 
 	/* Check whether the time interval has elapsed */
 	if (!tdata->valid || time_after(jiffies, tdata->last_updated + HZ)) {
+		if (!housekeeping_cpu(tdata->cpu, HK_TYPE_MISC))
+			return -EINVAL;
 		rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 		/*
 		 * Ignore the valid bit. In all observed cases the register