Message ID | 7552439.EvYhyI6sBW@kreacher |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9010:0:b0:3e4:2afc:c1 with SMTP id l16csp2434767vqg; Tue, 25 Jul 2023 05:32:10 -0700 (PDT) X-Google-Smtp-Source: APBJJlHztf71mG36+02x38yhg3Uw3VQe77HcOXEo1HnKxnq9WDeZGohMN/ndVPU9NJQb/bsJJXxg X-Received: by 2002:a05:6870:f148:b0:1b7:4870:9075 with SMTP id l8-20020a056870f14800b001b748709075mr14030220oac.6.1690288330525; Tue, 25 Jul 2023 05:32:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690288330; cv=none; d=google.com; s=arc-20160816; b=xmpqoJ1+TEsuN3Go9inVmN6u3SkUpMhXfUdVm+I49zPO2XD+rAwBzGe4+DL5SDT+dH qmOgUoWWmFrYG7EsbveuoFsvYxzQ46zE0pjgwijU0Ls8u/3uOunWhvsaGZqn0PfzaWGB obnGFum8NTreK1SVtm5ZJpeOAyE6sX8z7cSUA52pZ1I9KakXvlre5/Zpu/67Ok65+Tcj dTQmksKo0yGNcGjpO/s7Uy8VaTh7psT409iPDFwWXTEB6Wqove57VW3QBOwNreU/Ghdd XJRKXs1A5ZYYcP7MOBIzd4o7OBxZznKFH+At/KE9OMndSi9WSPu++w9EkbzM/rgYAozq iRfw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=GXCL8lMvPjm5oxw9qCQ5PkWHOcol1z4dRjZG6zSxLO8=; fh=WAm0nkxw3rD71A4QzwJS1JiPPKRsyn91IiWMPsOrzoY=; b=wJr/r25q447xYQNg5JviRgliIw6K0op671aF8TCVKbX9d9vMqaPQc0J0Ku2EQhMmam JB3k2HGiecMLMeED+fb6CQRv2YZGOGoQ47jUDJCY+wlDUxayuN99Sxy/NHzRQKQmKkio 9Noy9xFNZBMimO+FJYWtg97aA//laoj075EX7sujHqUgKMezUIPLtUaZeuvCLjfhSFIJ xvZi/v8WfPfZ81LetEDbBwc6o1DlCuxPfLzFt8lp9H82J0TqMKGcdyBO0Zha5kt9Y/xY NeHCKxXiy7HHX6CRI+QWm5Z21etpoJ03n99pf16yb0VWVdOFVjuknJ08URk4ieKpsY7o hD3g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q68-20020a632a47000000b0055c868268f9si11870793pgq.462.2023.07.25.05.31.49; Tue, 25 Jul 2023 05:32:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233559AbjGYMZk (ORCPT <rfc822;kloczko.tomasz@gmail.com> + 99 others); Tue, 25 Jul 2023 08:25:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233410AbjGYMZc (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 25 Jul 2023 08:25:32 -0400 Received: from cloudserver094114.home.pl (cloudserver094114.home.pl [79.96.170.134]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 66976172A; Tue, 25 Jul 2023 05:25:31 -0700 (PDT) Received: from localhost (127.0.0.1) (HELO v370.home.net.pl) by /usr/run/smtp (/usr/run/postfix/private/idea_relay_lmtp) via UNIX with SMTP (IdeaSmtpServer 5.2.0) id 3810275855d39208; Tue, 25 Jul 2023 14:25:29 +0200 Received: from kreacher.localnet (unknown [195.136.19.94]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by v370.home.net.pl (Postfix) with ESMTPSA id 5404B661B0E; Tue, 25 Jul 2023 14:25:29 +0200 (CEST) From: "Rafael J. Wysocki" <rjw@rjwysocki.net> To: Linux ACPI <linux-acpi@vger.kernel.org> Cc: LKML <linux-kernel@vger.kernel.org>, Linux PM <linux-pm@vger.kernel.org>, Michal Wilczynski <michal.wilczynski@intel.com>, Zhang Rui <rui.zhang@intel.com>, Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>, Daniel Lezcano <daniel.lezcano@linaro.org> Subject: [PATCH v3 5/8] ACPI: thermal: Hold thermal zone lock around trip updates Date: Tue, 25 Jul 2023 14:16:33 +0200 Message-ID: <7552439.EvYhyI6sBW@kreacher> In-Reply-To: <12254967.O9o76ZdvQC@kreacher> References: <13318886.uLZWGnKmhe@kreacher> <12254967.O9o76ZdvQC@kreacher> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="UTF-8" X-CLIENT-IP: 195.136.19.94 X-CLIENT-HOSTNAME: 195.136.19.94 X-VADE-SPAMSTATE: clean X-VADE-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedviedriedtgdehudcutefuodetggdotefrodftvfcurfhrohhfihhlvgemucfjqffogffrnfdpggftiffpkfenuceurghilhhouhhtmecuudehtdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkjghfggfgtgesthfuredttddtjeenucfhrhhomhepfdftrghfrggvlhculfdrucghhihsohgtkhhifdcuoehrjhifsehrjhifhihsohgtkhhirdhnvghtqeenucggtffrrghtthgvrhhnpedvffeuiedtgfdvtddugeeujedtffetteegfeekffdvfedttddtuefhgeefvdejhfenucfkphepudelhedrudefiedrudelrdelgeenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepihhnvghtpeduleehrddufeeirdduledrleegpdhhvghlohepkhhrvggrtghhvghrrdhlohgtrghlnhgvthdpmhgrihhlfhhrohhmpedftfgrfhgrvghlucflrdcuhgihshhotghkihdfuceorhhjfiesrhhjfiihshhotghkihdrnhgvtheqpdhnsggprhgtphhtthhopeejpdhrtghpthhtoheplhhinhhugidqrggtphhisehvghgvrhdrkhgvrhhnvghlrdhorhhgpdhrtghpthhtoheplhhinhhugidqkhgvrhhnvghlsehvghgvrhdrkhgvrhhnvghlrdhorhhgpdhrtghpthhtoheplhhinhhugidqphhmsehvghgvrhdrkhgvrhhnvghlrdhorhhgpdhrtghpthhtohepmhhitghhrghlrdifihhltgiihihnshhkihesihhnthgvlhdrtghomhdprhgt phhtthhopehruhhirdiihhgrnhhgsehinhhtvghlrdgtohhmpdhrtghpthhtohepshhrihhnihhvrghsrdhprghnughruhhvrggurgeslhhinhhugidrihhnthgvlhdrtghomh X-DCC--Metrics: v370.home.net.pl 1024; Body=7 Fuz1=7 Fuz2=7 X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1772044269368069758 X-GMAIL-MSGID: 1772395776492306885 |
Series |
ACPI: thermal: Use trip point table to register thermal zones
|
|
Commit Message
Rafael J. Wysocki
July 25, 2023, 12:16 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> There is a race condition between acpi_thermal_trips_update() and acpi_thermal_check_fn(), because the trip points may get updated while the latter is running which in theory may lead to inconsistent results. For example, if two trips are updated together, using the temperature value of one of them from before the update and the temperature value of the other one from after the update may not lead to the expected outcome. To address this, make acpi_thermal_trips_update() hold the thermal zone lock across the entire update of trip points. While at it, change the acpi_thermal_trips_update() return data type to void as that function always returns 0 anyway. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- v2 -> v3: No changes. v1 -> v2: * Hold the thermal zone lock instead of thermal_check_lock around trip point updates (this also helps to protect thermal_get_trend() from using stale trip temperatures). * Add a comment documenting the purpose of the locking. * Make acpi_thermal_trips_update() void. --- drivers/acpi/thermal.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-)
Comments
Hi Rafael, On 25/07/2023 14:16, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > There is a race condition between acpi_thermal_trips_update() and > acpi_thermal_check_fn(), because the trip points may get updated while > the latter is running which in theory may lead to inconsistent results. > For example, if two trips are updated together, using the temperature > value of one of them from before the update and the temperature value > of the other one from after the update may not lead to the expected > outcome. > > To address this, make acpi_thermal_trips_update() hold the thermal zone > lock across the entire update of trip points. As commented in patch 3/8, having a driver locking a thermal core structure is not right and goes to the opposite direction of the recent cleanups. Don't we have 2 race conditions: acpi_thermal_trips_update() + thermal_zone_device_check() acpi_thermal_trips_update() + acpi_thermal_trips_update() For the former, we can disable the thermal zone, update and then enable For the latter use a driver lock ? > While at it, change the acpi_thermal_trips_update() return data type > to void as that function always returns 0 anyway. > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > > v2 -> v3: No changes. > > v1 -> v2: > * Hold the thermal zone lock instead of thermal_check_lock around trip > point updates (this also helps to protect thermal_get_trend() from using > stale trip temperatures). > * Add a comment documenting the purpose of the locking. > * Make acpi_thermal_trips_update() void. > > --- > drivers/acpi/thermal.c | 21 ++++++++++++++++----- > 1 file changed, 16 insertions(+), 5 deletions(-) > > Index: linux-pm/drivers/acpi/thermal.c > =================================================================== > --- linux-pm.orig/drivers/acpi/thermal.c > +++ linux-pm/drivers/acpi/thermal.c > @@ -190,7 +190,7 @@ static int acpi_thermal_get_polling_freq > return 0; > } > > -static int acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) > +static void __acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) > { > acpi_status status; > unsigned long long tmp; > @@ -398,17 +398,28 @@ static int acpi_thermal_trips_update(str > ACPI_THERMAL_TRIPS_EXCEPTION(flag, tz, "device"); > } > } > +} > > - return 0; > +static void acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) > +{ > + /* > + * The locking is needed here to protect thermal_get_trend() from using > + * a stale passive trip temperature and to synchronize with the trip > + * temperature updates in acpi_thermal_check_fn(). > + */ > + thermal_zone_device_lock(tz->thermal_zone); > + > + __acpi_thermal_trips_update(tz, flag); > + > + thermal_zone_device_unlock(tz->thermal_zone); > } > > static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) > { > - int i, ret = acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); > bool valid; > + int i; > > - if (ret) > - return ret; > + __acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); > > valid = tz->trips.critical.valid | > tz->trips.hot.valid | > > >
On Tue, Aug 1, 2023 at 8:39 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > > Hi Rafael, > > On 25/07/2023 14:16, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > There is a race condition between acpi_thermal_trips_update() and > > acpi_thermal_check_fn(), because the trip points may get updated while > > the latter is running which in theory may lead to inconsistent results. > > For example, if two trips are updated together, using the temperature > > value of one of them from before the update and the temperature value > > of the other one from after the update may not lead to the expected > > outcome. > > > > To address this, make acpi_thermal_trips_update() hold the thermal zone > > lock across the entire update of trip points. > > As commented in patch 3/8, having a driver locking a thermal core > structure is not right and goes to the opposite direction of the recent > cleanups. It already happens though, because thermal_zone_device_update() locks the zone and it is called by the driver. > Don't we have 2 race conditions: > > acpi_thermal_trips_update() + thermal_zone_device_check() > > acpi_thermal_trips_update() + acpi_thermal_trips_update() I'm not sure what you mean. First off, acpi_thermal_check_fn() needs to be locked against anything using the trips in the zone's trips[] table, in particular thermal_get_trend(). However, thermal_get_trend() also uses the driver's private trips information, so it needs to be locked against acpi_thermal_trips_update(). And obviously the latter needs to be locked against acpi_thermal_check_fn(). > For the former, we can disable the thermal zone, update and then enable Disabling the thermal zone is an idea, but it would be necessary to do that in both acpi_thermal_check_fn() and acpi_thermal_trips_update(). Also I'm not sure how different that would be from holding the zone lock across the updates. Moreover, acpi_thermal_trips_update() would then need to hold the local lock around the thermal zone disable/enable which would be way uglier than just using the zone lock directly in it.
Index: linux-pm/drivers/acpi/thermal.c =================================================================== --- linux-pm.orig/drivers/acpi/thermal.c +++ linux-pm/drivers/acpi/thermal.c @@ -190,7 +190,7 @@ static int acpi_thermal_get_polling_freq return 0; } -static int acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) +static void __acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) { acpi_status status; unsigned long long tmp; @@ -398,17 +398,28 @@ static int acpi_thermal_trips_update(str ACPI_THERMAL_TRIPS_EXCEPTION(flag, tz, "device"); } } +} - return 0; +static void acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) +{ + /* + * The locking is needed here to protect thermal_get_trend() from using + * a stale passive trip temperature and to synchronize with the trip + * temperature updates in acpi_thermal_check_fn(). + */ + thermal_zone_device_lock(tz->thermal_zone); + + __acpi_thermal_trips_update(tz, flag); + + thermal_zone_device_unlock(tz->thermal_zone); } static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) { - int i, ret = acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); bool valid; + int i; - if (ret) - return ret; + __acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); valid = tz->trips.critical.valid | tz->trips.hot.valid |