Message ID | 20230313111806.2.Iee214b2dd184cb19197db8f97fad7e4adca273be@changeid |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp1338675wrd; Mon, 13 Mar 2023 11:26:25 -0700 (PDT) X-Google-Smtp-Source: AK7set+vPexY/cLrhE4QzS8mIhub1Vkm1SuSG7kMtcHONuHCJUp1t79mpRuvB7822MPq7MyAy4FM X-Received: by 2002:a05:6a20:431e:b0:cc:9b29:f617 with SMTP id h30-20020a056a20431e00b000cc9b29f617mr36957196pzk.0.1678731985634; Mon, 13 Mar 2023 11:26:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678731985; cv=none; d=google.com; s=arc-20160816; b=mBywh+i3hyW1++5cBl66qptNhR9zXKRa69AWuc2b/BqreEyPg2YTAfp5rtWYEWmxiO Be2XgqvFAdjZDymPWvdf686XeojMIgez8AIb0EQSq0MDsvVtq5E3fK02fCizcvAvlXWH Xx6Wwoxa0ZXHOqVUPr6L6WGN/y1FQhjDTzU1vxx1pEse1jVIqXM07P8U4t55aOX0S1u/ UbQW+yXWuqNjNPckDY+pgo2m1bXiEPSe/hwWUfdNQXSSf8C/b0oVWvcbOAlCfcHs4ZPN jzxXl7ez26JBAHbexBmjvkYlvfTSI/RV6TdlOS8hB/6vmlZNvaYRyZTuCVXIjTrYjvOD 1XjQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=XstmFM2GshtLwvoAEIpsc9WtbOo21zJeuCi3J74UKuM=; b=MXcJH1qEvpZF4tZkLe9Y7Dq7NnvbhfgFwgGbiAV2ety009s/ZV4Y9HezKQ89ITXOhs Xwarr4M+VRd+jWX4Cy06rSpf18rIil+YalFjapAExWLxX+zaA8tVU7UHA5Yt1ZoQdhea iIdU3dCt3R0mFZ62SxCM8NijYO7fvYbDV4lfxLdnUg4vGcxdQ2TitnOuVM4NLICX1G1i 8jdA1XkcQfW/UQE33Stkybc/bppyBhhZrKuBLckoHbzyOV1RW2+pPV7BvVFJbVpZzSbe br9EEl/J8MvDOuuGtS9tAiXthLRcRB60STS6dVaLa3tOMu1XTnmDGpa+6xevKUaiYGil 2mhw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=AXHqUvDD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r12-20020aa7962c000000b0058bc338c9b2si52980pfg.372.2023.03.13.11.26.09; Mon, 13 Mar 2023 11:26:25 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=AXHqUvDD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230081AbjCMSTU (ORCPT <rfc822;realc9580@gmail.com> + 99 others); Mon, 13 Mar 2023 14:19:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37934 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230014AbjCMSTI (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 13 Mar 2023 14:19:08 -0400 Received: from mail-pf1-x436.google.com (mail-pf1-x436.google.com [IPv6:2607:f8b0:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E832C6A052 for <linux-kernel@vger.kernel.org>; Mon, 13 Mar 2023 11:19:05 -0700 (PDT) Received: by mail-pf1-x436.google.com with SMTP id bd34so8216264pfb.3 for <linux-kernel@vger.kernel.org>; Mon, 13 Mar 2023 11:19:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1678731545; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=XstmFM2GshtLwvoAEIpsc9WtbOo21zJeuCi3J74UKuM=; b=AXHqUvDDmZPGCCP5cgSLitbVM+kqfQ7XsQEqKZ/cR7dPzDSiBHuZVBwGz0SqcN7oxs DBrJEwOiLSkhmgGZjRWmYnJNgtT5wofMgGiNBCn6tT4m+bsc3Wr3rS3hc3NMX5NvAenf qY8IA9nPi1oR6/dNzSBx9vfCEgoC1XFtoyiCI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678731545; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=XstmFM2GshtLwvoAEIpsc9WtbOo21zJeuCi3J74UKuM=; b=5eicbU2+EH/P3D2qnn26LWeQWK8CRp1HEYE2Zd5nhZoIAv0AhIRrIZu0bAjdG5LmtQ mKeaZtaiXoXIX0QCasMArr8CbBzNq0orhCo0V+dMj+wG9j2kRkUyq9Bjw4rni14fqpXV VufhMFdseKFiAMMoXYdxgEvBKaqfRzwf80ROyMFEVVfYFfPZ1zp+tB1WR0KAVbGrCNEy OMRnH3qwMPUHV/m/CQapwg1Xp9ew3d3X/EckNejKBlCHIUrr/ah1bDe9hFjItQWJOGaW omvFrQb/K/wUw8YGIhtHR6DMNdvsmRjjfZVoEIKkVd2I33bBtNQw1+eXRSz+/Agdsxb/ 1j1g== X-Gm-Message-State: AO0yUKXi2tIZjExVZgCdXkaWTTxXUL9Yf0pblD7g7IN/3Pah2KdVZbP9 Nc12o47qhv64rOj4xpGJgU4acQ== X-Received: by 2002:aa7:9f44:0:b0:622:844e:80a8 with SMTP id h4-20020aa79f44000000b00622844e80a8mr5911107pfr.26.1678731545207; Mon, 13 Mar 2023 11:19:05 -0700 (PDT) Received: from tictac2.mtv.corp.google.com ([2620:15c:9d:2:79c6:a848:199d:3491]) by smtp.gmail.com with ESMTPSA id a11-20020aa780cb000000b005ac419804d3sm25169pfn.186.2023.03.13.11.19.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Mar 2023 11:19:04 -0700 (PDT) From: Douglas Anderson <dianders@chromium.org> To: Mark Brown <broonie@kernel.org> Cc: mka@chromium.org, christian@kohlschutter.com, Douglas Anderson <dianders@chromium.org>, Liam Girdwood <lgirdwood@gmail.com>, linux-kernel@vger.kernel.org Subject: [PATCH 2/2] regulator: fixed: Set PROBE_PREFER_ASYNCHRONOUS Date: Mon, 13 Mar 2023 11:18:20 -0700 Message-Id: <20230313111806.2.Iee214b2dd184cb19197db8f97fad7e4adca273be@changeid> X-Mailer: git-send-email 2.40.0.rc1.284.g88254d51c5-goog In-Reply-To: <20230313181820.2482385-1-dianders@chromium.org> References: <20230313181820.2482385-1-dianders@chromium.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760278070546899367?= X-GMAIL-MSGID: =?utf-8?q?1760278070546899367?= |
Series |
regulator: Fix boot speed regression w/ off-on-delay-us
|
|
Commit Message
Doug Anderson
March 13, 2023, 6:18 p.m. UTC
As of commit 218320fec294 ("regulator: core: Fix off-on-delay-us for
always-on/boot-on regulators"), we now might have a big delay during
probe of fixed regulators. That can have a significant boot speed
impact. Let's mitigate this by preferring async probe for fixed
regulators. The regulator framework itself has no issues with
regulators probing in an asynchronous way. The fixed regulator driver
is fairly straightforward and also has no issues.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
drivers/regulator/fixed.c | 1 +
1 file changed, 1 insertion(+)
Comments
On Mon, Mar 13, 2023 at 11:18:20AM -0700, Douglas Anderson wrote: > As of commit 218320fec294 ("regulator: core: Fix off-on-delay-us for > always-on/boot-on regulators"), we now might have a big delay during > probe of fixed regulators. That can have a significant boot speed > impact. Let's mitigate this by preferring async probe for fixed > regulators. The regulator framework itself has no issues with > regulators probing in an asynchronous way. The fixed regulator driver > is fairly straightforward and also has no issues. This is going to be true for all regulators...
Hi, On Mon, Mar 13, 2023 at 12:05 PM Mark Brown <broonie@kernel.org> wrote: > > On Mon, Mar 13, 2023 at 11:18:20AM -0700, Douglas Anderson wrote: > > As of commit 218320fec294 ("regulator: core: Fix off-on-delay-us for > > always-on/boot-on regulators"), we now might have a big delay during > > probe of fixed regulators. That can have a significant boot speed > > impact. Let's mitigate this by preferring async probe for fixed > > regulators. The regulator framework itself has no issues with > > regulators probing in an asynchronous way. The fixed regulator driver > > is fairly straightforward and also has no issues. > > This is going to be true for all regulators... Sure, that's true. So what are you suggesting? There has always been a hope that someday we could just turn on async probe everywhere. Folks at ChromeOS / Android at Google have looked at this on and off over the years. Most recently +briannorris looked at and described it as "no small task". There were a lot of kinks across the various subsystems. I think +Saravana also has it on his wishlist to tackle, too. For the most part, the best we've been able to do is to identify devices that happened to be involved in long delays at bootup and add the flag to those devices after confirming that they worked OK. It's not super elegant but it's pragmatic. That's what I've done here. If there are other regulator drivers that are known to be involved in big delays during probe time then we should add the flag there, too. Right now, "off-on-delay" is only for fixed regulators, though there's no real reason it shouldn't be applied to all regulators. I suppose some of the other regulator related delays could affect pretty much any regulator, though I guess normally those aren't associated with always-on / boot-on regulators? If you want, I can submit a pile of patches adding this flag to more regulators. I did that in the past for mmc: 31ae403513be mmc: Set PROBE_PREFER_ASYNCHRONOUS for drivers that are newer than 5.4 d86472ae8b20 mmc: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in v5.4 a1a489197a07 mmc: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in v4.19 7320915c8861 mmc: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in v4.14 2a99f3fa85ea mmc: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in v4.9 21b2cec61c04 mmc: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in v4.4 -Doug
On Mon, Mar 13, 2023 at 12:49:37PM -0700, Doug Anderson wrote: > On Mon, Mar 13, 2023 at 12:05 PM Mark Brown <broonie@kernel.org> wrote: > > This is going to be true for all regulators... > Sure, that's true. So what are you suggesting? That we shouldn't be making this change for just one driver, if you can write an identical commit log for most if not all drivers but are just making the change for one random driver then that suggests something is being missed somewhere. Either there's something special about this driver or we should do things consistently.
Hi, On Tue, Mar 14, 2023 at 6:29 AM Mark Brown <broonie@kernel.org> wrote: > > On Mon, Mar 13, 2023 at 12:49:37PM -0700, Doug Anderson wrote: > > On Mon, Mar 13, 2023 at 12:05 PM Mark Brown <broonie@kernel.org> wrote: > > > > This is going to be true for all regulators... > > > Sure, that's true. So what are you suggesting? > > That we shouldn't be making this change for just one driver, if you can > write an identical commit log for most if not all drivers but are just > making the change for one random driver then that suggests something is > being missed somewhere. Either there's something special about this > driver or we should do things consistently. The "special" things are: * It's been confirmed that this one random driver is involved in slowing down boot a significant amount. * The fixed regulator driver is among the simplest of the regulators and doesn't have the complex interactions that are typically associated with async probe problems. For reference, some of the problems I'm aware of that have been seen in the past when trying to enable async probe more broadly: a) Apparently, on ACPI child devices aren't guaranteed to be probed before parent devices once you turn on async probe. This is not typically the case with device-tree probed devices where children show up due to of_platform_populate() which is called by a parent device. This can be handled properly but often isn't. b) Some drivers try to poke directly at other devices and can get confused if the other device is probing at the same time. One example was dual-MIPI on Rockchip [1]. Again, this can be handled properly but often isn't because nobody tested it. c) Dynamically allocated ID numbers can change unpredictably from boot to boot with async probe. This showed up on MMC where eMMC and SD would change each boot between "mmcblk0" and "mmcblk1". d) Async probe (obviously) changes timing and that can expose latent bugs. Almost always those bugs should have been fixed anyway. e) Async probe tends to stress out other driver's (consumers of the device that's now probing async) error handling (since they are more likely to see -EPROBE_DEFER) and can expose latent bugs there. Let's take an example that I have a tiny bit of familiarity with: max77686. This is not too strange of a device. It's implemented as a MFD and has sub-drivers for PMIC (regulator), RTC, and clock. I think we can be pretty confident that a) above isn't a problem. The MFD driver (the parent) populates the regulator (the child) with devm_mfd_add_devices() and it won't do that until the (MFD) parent is ready. So far so good. I think we can also be fairly certain that c) isn't a problem, and probably it's not a problem anywhere for regulators. Nobody (that I'm aware of) relies on the stability of a dynamically allocated number for a regulator. Issues d) and e) could be an issue for max77686, but in almost all cases where they are we'd want to fix those latent bugs anyway since they could have hit even without async probe. ...but b) _could_ be a problem. Specifically, today I think that (unless some of its supplies aren't available at probe time) the PMIC driver will _always_ finish probing before the RTC/clock driver because of the order specified in the source code: { .name = "max77686-pmic", }, { .name = "max77686-rtc", }, { .name = "max77686-clk", }, One would need to do deeper code inspection (and, ideally, testing) to find out if indeed the RTC driver and clock driver will have problems if the regulator is not finished probing before their probes start. Hopefully everything here is fine, but it's the kind of place where maybe somebody was sloppy because, today, the order is guaranteed. In general drivers for multi-function devices are more likely to interact directly with sibling drivers outside of the normal driver framework and (in my experience) are more likely to be relying on probe order. If someone wanted to carefully inspect the max77686 code (especially the interactions between all the MFD sub-drivers) and test enabling async probe there then it would be a nice improvement. ...switching over and crossing our fingers might work OK but it also might not. If we look at the fixed regulator driver and compare, we're in a simpler state. Sure, we could still expose latent timing bugs or latent wrong error handling in regulator consumers, but those should be fixed anyway. The fixed regulator driver doesn't reach directly into other devices through private APIs and pretty much just relies on the regulator core for most of its work. The regulator core should be async-robust because of needing to deal with regulators that show up as modules. I guess the last thing I'll say is that PROBE_PREFER_ASYNCHRONOUS was added specifically so that we could enable this on drivers that were found to be slow to boot and that were tested to work with async probe. Without being able to add PROBE_PREFER_ASYNCHRONOUS people were open-coding solutions per driver to speed probe. [1] https://lore.kernel.org/r/20221019170255.2.I6b985b0ca372b7e35c6d9ea970b24bcb262d4fc1@changeid
On Tue, Mar 14, 2023 at 09:35:28AM -0700, Doug Anderson wrote: > The "special" things are: > * It's been confirmed that this one random driver is involved in > slowing down boot a significant amount. > * The fixed regulator driver is among the simplest of the regulators > and doesn't have the complex interactions that are typically > associated with async probe problems. I would be surprised to see regulators being in that category. > For reference, some of the problems I'm aware of that have been seen > in the past when trying to enable async probe more broadly: > a) Apparently, on ACPI child devices aren't guaranteed to be probed > before parent devices once you turn on async probe. This is not > typically the case with device-tree probed devices where children show > up due to of_platform_populate() which is called by a parent device. > This can be handled properly but often isn't. That seems unlikely to be an issue here. > b) Some drivers try to poke directly at other devices and can get > confused if the other device is probing at the same time. One example > was dual-MIPI on Rockchip [1]. Again, this can be handled properly but > often isn't because nobody tested it. Again, unlikely to be an issue for regulators - where multiple regulators interact with each other that's already coordinated through the core. > c) Dynamically allocated ID numbers can change unpredictably from boot > to boot with async probe. This showed up on MMC where eMMC and SD > would change each boot between "mmcblk0" and "mmcblk1". No numbers exposed to userspace here so that'll be fine. > d) Async probe (obviously) changes timing and that can expose latent > bugs. Almost always those bugs should have been fixed anyway. > e) Async probe tends to stress out other driver's (consumers of the > device that's now probing async) error handling (since they are more > likely to see -EPROBE_DEFER) and can expose latent bugs there. Right, not sure I'd worry about either of those though and given how widley used fixed regualtors are I'd expect it to be one of the riskiest here. > ...but b) _could_ be a problem. Specifically, today I think that > (unless some of its supplies aren't available at probe time) the PMIC > driver will _always_ finish probing before the RTC/clock driver > because of the order specified in the source code: > { .name = "max77686-pmic", }, > { .name = "max77686-rtc", }, > { .name = "max77686-clk", }, > One would need to do deeper code inspection (and, ideally, testing) to > find out if indeed the RTC driver and clock driver will have problems > if the regulator is not finished probing before their probes start. Behind the scenes interactions would be quite unusual, the power demands of the on chip devices are typically minimal so there's not much to do - it's typically all always on functions. Could happen of course but it'd be surprising. > I guess the last thing I'll say is that PROBE_PREFER_ASYNCHRONOUS was > added specifically so that we could enable this on drivers that were > found to be slow to boot and that were tested to work with async > probe. Without being able to add PROBE_PREFER_ASYNCHRONOUS people were > open-coding solutions per driver to speed probe. Sure, but my point is that at least every regulator with a ramp time (which will be most of them, even those that don't have one they tell the kernel about one probably should) is going to have a similar issue so could probably benefit from a similar solution.
diff --git a/drivers/regulator/fixed.c b/drivers/regulator/fixed.c index 2a9867abba20..1927dc2d4cf8 100644 --- a/drivers/regulator/fixed.c +++ b/drivers/regulator/fixed.c @@ -334,6 +334,7 @@ static struct platform_driver regulator_fixed_voltage_driver = { .probe = reg_fixed_voltage_probe, .driver = { .name = "reg-fixed-voltage", + .probe_type = PROBE_PREFER_ASYNCHRONOUS, .of_match_table = of_match_ptr(fixed_of_match), }, };