Message ID | 20221114104333.3695531-1-javierm@redhat.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp2076230wru; Mon, 14 Nov 2022 02:48:16 -0800 (PST) X-Google-Smtp-Source: AA0mqf4fIqb7qimN7Sc3zx4Ba+KkBp6UtCjcDljfq7CW9Akn9ZUR63f+RmKszi/BswWQBhS/lrPr X-Received: by 2002:a17:907:7853:b0:78d:ee08:1867 with SMTP id lb19-20020a170907785300b0078dee081867mr10376451ejc.123.1668422896538; Mon, 14 Nov 2022 02:48:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668422896; cv=none; d=google.com; s=arc-20160816; b=Y8bIYxyv3oyxrDRDLedKLA37nLKdVvtYFSu1kcmM9R/c9syt32OagdM7170FufMm9E cIywZeVZofGb2KA5SBsHZcpvHjB4jeZaNqW4x5qAPgrpr0EqXxOTAu+UuMOq8QhOY1Hq D5tRWhVwMZOcZkGsVzakFZDx49fPnubyedwQFeBpMenmPrVUAl3PQgK8DesbsY30nOPo 5/BHLPwMeGm3sNnzweZtNNvoDmhLqrrytn089szY9mTuVKfB38Kq+u8ngmY7+sl7UlLX t4mfeTy34luZz9KRpgf1gE3/OnKaESolyMOfo8NLAeSg59rvls96PoU8HCxpJZt6R0wG 662g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=KhGrKqZABfxXgxhvFZS6ZgjvLKrUckNIrRZXNfiIo74=; b=AjLB11r2msA5+BzlKSkQSoN5ntXw515j1cJpctfawJvHxIcUP2Y8Ou/4fRVLp529vP MFZv9uU0hZHdgeaSoLMBJzhk74WB1HDnycn8ST9zII+Pu+v27Onm2kBBmybu3pp7nKFM PoimsG+mOu5hBWntpxI1t+Zmg5NJCFFX3QfB3Loz0llo6r0LAR25ao5W2mfQdflFHJ3D M3N4dDMq+gnsWrZR945PhfPsxEpnVGjfDVc8Q27OyGWkcpwrcshOEvMXYij2V7jwhjzS hF7ISjfXzBDWPlEvCJArrYW8eQkHS6WnCO8GS2fC0l1d4d7sMi6pdvOAMMSKqgjGvZ4u cAcA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Lwg52Q+d; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e19-20020a056402191300b004571c47b13csi10424061edz.418.2022.11.14.02.47.52; Mon, 14 Nov 2022 02:48:16 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Lwg52Q+d; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236319AbiKNKov (ORCPT <rfc822;winker.wchi@gmail.com> + 99 others); Mon, 14 Nov 2022 05:44:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39652 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236389AbiKNKon (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 14 Nov 2022 05:44:43 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0B3AEB8C for <linux-kernel@vger.kernel.org>; Mon, 14 Nov 2022 02:43:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668422622; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=KhGrKqZABfxXgxhvFZS6ZgjvLKrUckNIrRZXNfiIo74=; b=Lwg52Q+dxsZyoqmnYMkSKPaSzZXdYEUWr1GsF56/awJL/KrKNEKHFDqeuzE9nGLGv1vgic SNbKEPKqB+FyMQfm5wOckIqQvbYVl+i6XhSdcgGFjLb2uA1aqlWywvV3P9E2H3iUH4YmHj XWteV3T0Kx0eqCv8m1Q/ucoYAjR0/1w= Received: from mail-wm1-f72.google.com (mail-wm1-f72.google.com [209.85.128.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-605-D4HNFdWJNOOMQkJcVfxOpw-1; Mon, 14 Nov 2022 05:43:41 -0500 X-MC-Unique: D4HNFdWJNOOMQkJcVfxOpw-1 Received: by mail-wm1-f72.google.com with SMTP id bg25-20020a05600c3c9900b003cf3ed7e27bso6580695wmb.4 for <linux-kernel@vger.kernel.org>; Mon, 14 Nov 2022 02:43:41 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=KhGrKqZABfxXgxhvFZS6ZgjvLKrUckNIrRZXNfiIo74=; b=0fNDPg9TrSe6upxhxpY6pTU7kDo07jnZXgh6bi3NxsuBV4TrzkqvfU7dklZCRPEYI+ mI/gj2gEVlNGinuTqRZNH9VQ/kRJWIDQSQMSRfotl3YlzpEd9IagDhV5PlA1xh89YwMn /EPX18QpvFyCPPnKkq3doS8O8qVjgwaeCDcx0qiQvyhtpqM5OgAbhrAXa2yHkfl+Ugb2 /Egk7HsuShr6aGpJByhSwE3tI3/7M3U8VswJIwywZMDmJK7I0Bvr4iZ7qbL70paSAr0R r/KTEP+bZ1xU5xADRzx/Y5mvxOHrNjnbYj2nR3JqM5jwhmc+M233IJh9zyG25qgu74j8 NWYQ== X-Gm-Message-State: ANoB5plheBhF2KnGm6MHFFHkpYwYejuMU3FwzATi6SlBS4d1ez/JFRQU UgnTXSRWzSJ3Utdc0X6+USLIuEY9mpG/lqeNl13bojP5Eaov2lyiH/FObr0CEykkeNhhzghoONZ LG/pXZ0n+o1la56lse8ynz4SqYZzkrQPb9OH2Wk9qJPwFrSzx/ZBQbSlQdZSvFWkBkYQD3B14Tf M= X-Received: by 2002:a05:600c:3c93:b0:3cf:cfcd:1e0 with SMTP id bg19-20020a05600c3c9300b003cfcfcd01e0mr7551215wmb.166.1668422618433; Mon, 14 Nov 2022 02:43:38 -0800 (PST) X-Received: by 2002:a05:600c:3c93:b0:3cf:cfcd:1e0 with SMTP id bg19-20020a05600c3c9300b003cfcfcd01e0mr7551187wmb.166.1668422618089; Mon, 14 Nov 2022 02:43:38 -0800 (PST) Received: from minerva.home (205.pool92-176-231.dynamic.orange.es. [92.176.231.205]) by smtp.gmail.com with ESMTPSA id n18-20020a7bcbd2000000b003cf9bf5208esm17225468wmi.19.2022.11.14.02.43.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 14 Nov 2022 02:43:37 -0800 (PST) From: Javier Martinez Canillas <javierm@redhat.com> To: linux-kernel@vger.kernel.org Cc: Saravana Kannan <saravanak@google.com>, Peter Robinson <pbrobinson@redhat.com>, Steev Klimaszewski <steev@kali.org>, Rob Herring <robh@kernel.org>, Sergio Lopez Pascual <slp@redhat.com>, Enric Balletbo i Serra <eballetbo@redhat.com>, John Stultz <jstultz@google.com>, Javier Martinez Canillas <javierm@redhat.com>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, "Rafael J. Wysocki" <rafael@kernel.org> Subject: [PATCH] driver core: Disable driver deferred probe timeout by default Date: Mon, 14 Nov 2022 11:43:33 +0100 Message-Id: <20221114104333.3695531-1-javierm@redhat.com> X-Mailer: git-send-email 2.38.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749468206984529043?= X-GMAIL-MSGID: =?utf-8?q?1749468206984529043?= |
Series |
driver core: Disable driver deferred probe timeout by default
|
|
Commit Message
Javier Martinez Canillas
Nov. 14, 2022, 10:43 a.m. UTC
The driver_deferred_probe_timeout value has a long story. It was first set
to -1 when it was introduced by commit 25b4e70dcce9 ("driver core: allow
stopping deferred probe after init"), meaning that the driver core would
defer the probe forever unless a subsystem would opt-in by checking if the
initcalls where done using the driver_deferred_probe_check_state() helper,
or if a timeout was explicitly set with a "deferred_probe_timeout" param.
Only the power domain, IOMMU and MDIO subsystems currently opt-in to check
if the initcalls have completed with driver_deferred_probe_check_state().
Commit c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state()
logic") then changed the driver_deferred_probe_check_state() helper logic,
to take into account whether modules have been enabled or not and also to
return -EPROBE_DEFER if the probe deferred timeout was still running.
Then in commit e2cec7d68537 ("driver core: Set deferred_probe_timeout to a
longer default if CONFIG_MODULES is set"), the timeout was increased to 30
seconds if modules are enabled. Because seems that some of the subsystems
that were opt-in to not return -EPROBE_DEFER after the initcall where done
could still have dependencies whose drivers were built as a module.
This commit did a fundamental change to how probe deferral worked though,
since now the default was not to attempt probing for drivers indefinitely
but instead it would timeout after 30 seconds unless a different timeout
was set using the "deferred_probe_timeout" parameter.
The behavior was changed even mere with commit ce68929f07de ("driver core:
Revert default driver_deferred_probe_timeout value to 0"), since the value
was set to 0 by default. Meaning that the probe deferral would be disabled
after the initcalls where done. Unless a timeout was set in the cmdline.
Notice that the commit said that it was reverting the default value to 0,
but this was never 0. The default was -1 at the beginning and then changed
to 30 in a later commit.
This default value of 0 was reverted again by commit f516d01b9df2 ("Revert
"driver core: Set default deferred_probe_timeout back to 0."") and set to
10 seconds instead. Which was still less than the 30 seconds that was set
at some point to allow systems with drivers built as modules and loaded by
user-land to probe drivers that were previously deferred.
The 10 seconds timeout isn't enough for the mentioned systems, for example
general purpose distributions attempt to build all the possible drivers as
a module to keep the Linux kernel image small. But that means that in very
likely that the probe deferral mechanism will timeout and drivers won't be
probed correctly.
So let's change the default again to -1 as it was at the beginning. That's
how probe deferral always worked. In fact, it could even be that users can
load modules manually after the system has already booted so it is better
to not assume when it can be safe to just timeout instead of probe defer.
Systems that want to somehow time bound the probe deferral seems to be the
special case, so it makes more sense for these to either not enable module
support or specify a timeout using the "deferred_probe_timeout" parameter.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
---
drivers/base/dd.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
Comments
On Mon, Nov 14, 2022 at 11:43:33AM +0100, Javier Martinez Canillas wrote: > The driver_deferred_probe_timeout value has a long story. It was first set > to -1 when it was introduced by commit 25b4e70dcce9 ("driver core: allow > stopping deferred probe after init"), meaning that the driver core would > defer the probe forever unless a subsystem would opt-in by checking if the > initcalls where done using the driver_deferred_probe_check_state() helper, > or if a timeout was explicitly set with a "deferred_probe_timeout" param. > > Only the power domain, IOMMU and MDIO subsystems currently opt-in to check > if the initcalls have completed with driver_deferred_probe_check_state(). > > Commit c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() > logic") then changed the driver_deferred_probe_check_state() helper logic, > to take into account whether modules have been enabled or not and also to > return -EPROBE_DEFER if the probe deferred timeout was still running. > > Then in commit e2cec7d68537 ("driver core: Set deferred_probe_timeout to a > longer default if CONFIG_MODULES is set"), the timeout was increased to 30 > seconds if modules are enabled. Because seems that some of the subsystems > that were opt-in to not return -EPROBE_DEFER after the initcall where done > could still have dependencies whose drivers were built as a module. > > This commit did a fundamental change to how probe deferral worked though, > since now the default was not to attempt probing for drivers indefinitely > but instead it would timeout after 30 seconds unless a different timeout > was set using the "deferred_probe_timeout" parameter. > > The behavior was changed even mere with commit ce68929f07de ("driver core: > Revert default driver_deferred_probe_timeout value to 0"), since the value > was set to 0 by default. Meaning that the probe deferral would be disabled > after the initcalls where done. Unless a timeout was set in the cmdline. > > Notice that the commit said that it was reverting the default value to 0, > but this was never 0. The default was -1 at the beginning and then changed > to 30 in a later commit. > > This default value of 0 was reverted again by commit f516d01b9df2 ("Revert > "driver core: Set default deferred_probe_timeout back to 0."") and set to > 10 seconds instead. Which was still less than the 30 seconds that was set > at some point to allow systems with drivers built as modules and loaded by > user-land to probe drivers that were previously deferred. > > The 10 seconds timeout isn't enough for the mentioned systems, for example > general purpose distributions attempt to build all the possible drivers as > a module to keep the Linux kernel image small. But that means that in very > likely that the probe deferral mechanism will timeout and drivers won't be > probed correctly. What specific "mentioned systems" have deferred probe drivers that are failing on the current value? What drivers are causing the long delay here? No one should be having to wait 10 seconds for a deferred delay on a real system as that feels really wrong. Why not fix the drivers that are causing this delay and maybe move them to be async so as to not slow down the whole boot process? thanks, greg k-h
Hello Greg, Thanks a lot for your feedback. On 11/14/22 11:54, Greg Kroah-Hartman wrote: [...] >> >> This default value of 0 was reverted again by commit f516d01b9df2 ("Revert >> "driver core: Set default deferred_probe_timeout back to 0."") and set to >> 10 seconds instead. Which was still less than the 30 seconds that was set >> at some point to allow systems with drivers built as modules and loaded by >> user-land to probe drivers that were previously deferred. >> >> The 10 seconds timeout isn't enough for the mentioned systems, for example >> general purpose distributions attempt to build all the possible drivers as >> a module to keep the Linux kernel image small. But that means that in very >> likely that the probe deferral mechanism will timeout and drivers won't be >> probed correctly. > > What specific "mentioned systems" have deferred probe drivers that are The "mentioned systems" are the ones mentioned in the paragraph above: "to allow systems with drivers built as modules and loaded by user-land to probe drivers that were previously deferred." I even gave an example about general purpose distributions that build as much as possible as a module. What more info do you think that is missing? > failing on the current value? What drivers are causing the long delay > here? No one should be having to wait 10 seconds for a deferred delay > on a real system as that feels really wrong. > Not really, it depends if the drivers are built-in, built as modules, in the initramfs or in the rootfs. A 10 seconds might not be enough if these modules are in the root partition and need to wait for this to be mounted and udev to load the modules, etc. Also, it may even be that the module alias is not correct and then users have to load them by explicitly have /etc/modules-load.d/ configs and so on. > Why not fix the drivers that are causing this delay and maybe move them > to be async so as to not slow down the whole boot process? > Yes, these drivers could be fixed to report a proper module alias or the dependencies can be built-in or added to the initramfs and that does not change the fact that by default the kernel shouldn't make assumptions about when is safe to just timeout instead of -EPROBE_DEFER. Because with modules the kernel has no way to know when all the modules have been already been loaded by user-space or more drivers are going to be registered in the future. Also, that's how probe deferral always worked since the mechanism was introduced. It's just recently that the behavior was changed to timeout. A nice feature of the probe deferral mechanism is that it was simple and reliable. Adding a timeout makes it non-deterministic and more fragile IMO.
On Mon, Nov 14, 2022 at 12:13:15PM +0100, Javier Martinez Canillas wrote: > Hello Greg, > > Thanks a lot for your feedback. > > On 11/14/22 11:54, Greg Kroah-Hartman wrote: > > [...] > > >> > >> This default value of 0 was reverted again by commit f516d01b9df2 ("Revert > >> "driver core: Set default deferred_probe_timeout back to 0."") and set to > >> 10 seconds instead. Which was still less than the 30 seconds that was set > >> at some point to allow systems with drivers built as modules and loaded by > >> user-land to probe drivers that were previously deferred. > >> > >> The 10 seconds timeout isn't enough for the mentioned systems, for example > >> general purpose distributions attempt to build all the possible drivers as > >> a module to keep the Linux kernel image small. But that means that in very > >> likely that the probe deferral mechanism will timeout and drivers won't be > >> probed correctly. > > > > What specific "mentioned systems" have deferred probe drivers that are > > The "mentioned systems" are the ones mentioned in the paragraph above: > > "to allow systems with drivers built as modules and loaded by user-land to > probe drivers that were previously deferred." > > I even gave an example about general purpose distributions that build as > much as possible as a module. What more info do you think that is missing? Exact systems that this is failing on would be great to have. > > failing on the current value? What drivers are causing the long delay > > here? No one should be having to wait 10 seconds for a deferred delay > > on a real system as that feels really wrong. > > > > Not really, it depends if the drivers are built-in, built as modules, in > the initramfs or in the rootfs. A 10 seconds might not be enough if these > modules are in the root partition and need to wait for this to be mounted > and udev to load the modules, etc. How does it take 10 seconds to load the initramfs for a system that requires deferred probe devices? What typs of hardware is this? > Also, it may even be that the module alias is not correct and then users > have to load them by explicitly have /etc/modules-load.d/ configs and so > on. Then that's a totally different issue and the module alias needs to be fixed and is not relevant here. > > Why not fix the drivers that are causing this delay and maybe move them > > to be async so as to not slow down the whole boot process? > > > > Yes, these drivers could be fixed to report a proper module alias or the > dependencies can be built-in or added to the initramfs and that does not > change the fact that by default the kernel shouldn't make assumptions > about when is safe to just timeout instead of -EPROBE_DEFER. Please let me know which drivers these are that are causing problems so we can fix them. > Because with modules the kernel has no way to know when all the modules > have been already been loaded by user-space or more drivers are going to > be registered in the future. Of course that is true, so we guess, and so far, 10 seconds is a good enough guess for normal systems out there that use deferred probe. What exact system and drivers do not work with this today? > Also, that's how probe deferral always worked since the mechanism was > introduced. It's just recently that the behavior was changed to timeout. > > A nice feature of the probe deferral mechanism is that it was simple and > reliable. Adding a timeout makes it non-deterministic and more fragile IMO. deferred probe was never simple or reliable or determinisitic. It was a hack we had to implement to handle complex hardware situations and loadable drivers. Let's not try to paper over driver bugs here by making the timeout "forever" but rather fix the root problem in the broken drivers. So, what drivers do we need to fix up? thanks, greg k-h
On 11/14/22 12:38, Greg Kroah-Hartman wrote: > On Mon, Nov 14, 2022 at 12:13:15PM +0100, Javier Martinez Canillas wrote: >> Hello Greg, [...] >> I even gave an example about general purpose distributions that build as >> much as possible as a module. What more info do you think that is missing? > > Exact systems that this is failing on would be great to have. > The exact system is a Snapdragon SC7180 based HP X2 Chromebook with latest Fedora Rawhide image (kernel version 6.1-rc4). The reason why is timing out is that the arm_smmu driver is built-in (CONFIG_ARM_SMMU=y) but it depends on gpucc-sc7180 clk driver that's built as module (CONFIG_SC_GPUCC_7180=m). >>> failing on the current value? What drivers are causing the long delay >>> here? No one should be having to wait 10 seconds for a deferred delay >>> on a real system as that feels really wrong. >>> >> >> Not really, it depends if the drivers are built-in, built as modules, in >> the initramfs or in the rootfs. A 10 seconds might not be enough if these >> modules are in the root partition and need to wait for this to be mounted >> and udev to load the modules, etc. > > How does it take 10 seconds to load the initramfs for a system that > requires deferred probe devices? What typs of hardware is this? > That could depend on may things. The dependency of the systemd unit files, whether NetworkManager-wait-online.service is enabled or not, etc. It can really take more than 10 seconds on some systems to load all the modules. [...] >> >> A nice feature of the probe deferral mechanism is that it was simple and >> reliable. Adding a timeout makes it non-deterministic and more fragile IMO. > > deferred probe was never simple or reliable or determinisitic. It was a > hack we had to implement to handle complex hardware situations and > loadable drivers. Let's not try to paper over driver bugs here by > making the timeout "forever" but rather fix the root problem in the > broken drivers. > I don't understand how adding a 10 secs timeout would make it more robust than just letting the driver core to attempt probing the deferred drivers again for every driver (or device) that gets registered. > So, what drivers do we need to fix up? > So what exactly needs to get fixed on the arm_smmu and gpucc-sc7180 drivers mentioned? Yes, we could built both of them as =y and make sure that drivers are registered and probed before the initcalls are done, but if we do that, we will need to have most of the drivers built-in in the Fedora kernel. That does not scale for all the platforms that we need to support.
Hi All, On 11/14/22 5:56 AM, Javier Martinez Canillas wrote: > On 11/14/22 12:38, Greg Kroah-Hartman wrote: >> On Mon, Nov 14, 2022 at 12:13:15PM +0100, Javier Martinez Canillas wrote: >>> Hello Greg, > [...] > >>> I even gave an example about general purpose distributions that build as >>> much as possible as a module. What more info do you think that is missing? >> Exact systems that this is failing on would be great to have. >> > The exact system is a Snapdragon SC7180 based HP X2 Chromebook with latest > Fedora Rawhide image (kernel version 6.1-rc4). The reason why is timing out > is that the arm_smmu driver is built-in (CONFIG_ARM_SMMU=y) but it depends > on gpucc-sc7180 clk driver that's built as module (CONFIG_SC_GPUCC_7180=m). Additionally, this fails on the Snapdragon SC8280XP based Thinkpad X13s with 6.1-rc4 (and rc5), however, unlike Javier's setup, I *do* have CONFIG_SC_GPUCC_SC8280XP=y, so it's already built in to the kernel. Additionally, all DRM options =y, except for CONFIG_DRM_GEM_HELP=m and CONFIG_DRM_DISPLAY_CONNECTOR=m. I am hoping to find some time tonight after work to also test on my SDM850 based Lenovo Yoga C630, as I believe my previous testing of 6.1 may have been hitting this but I did not have time to dig into what was going on there. -- steev
On Mon, Nov 14, 2022 at 2:43 AM Javier Martinez Canillas <javierm@redhat.com> wrote: > > The driver_deferred_probe_timeout value has a long story. It was first set > to -1 when it was introduced by commit 25b4e70dcce9 ("driver core: allow > stopping deferred probe after init"), meaning that the driver core would > defer the probe forever unless a subsystem would opt-in by checking if the > initcalls where done using the driver_deferred_probe_check_state() helper, > or if a timeout was explicitly set with a "deferred_probe_timeout" param. Indeed, it is a long history, I appreciate your archaeology here and I'm sorry for my part in adding complexity to it. That said, I don't believe the above is quite right (but the logic was quite confusing, so it's easy to get wrong - and it's late here so forgive me if I muck up my own explanation). As I mentioned in c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() logic"): " driver_deferred_probe_check_state() has some uninituitive behavior. * From boot to late_initcall, it returns -EPROBE_DEFER * From late_initcall to the deferred_probe_timeout (if set) it returns -ENODEV * If the deferred_probe_timeout it set, after it fires, it returns -ETIMEDOUT This is a bit confusing, as its useful to have the function return -EPROBE_DEFER while the timeout is still running. This behavior has resulted in the somwhat duplicative driver_deferred_probe_check_state_continue() function being added. " Looking at the code before this change, at late_initcall time, we'd run deferred_probe_initcall(): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/dd.c?id=e94f62b7140fa3da4c69a685b2e73ef52dd32c51#n321 Which sets: initcalls_done = true; Then in __driver_deferred_probe_check_state(): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/dd.c?id=e94f62b7140fa3da4c69a685b2e73ef52dd32c51#n238 We check initcalls_done, and after its set, we stop returning -EPROBE_DEFER. If deferred_probe_timeout was not set, it would then return 0. But then in driver_deferred_probe_check_state() https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/dd.c?id=e94f62b7140fa3da4c69a685b2e73ef52dd32c51#n262 When we see the 0 back from from __driver_deferred_probe_check_state() we'll return -ENODEV. So I don't think it's true that before the change the driver core would defer the probe forever when there was no timeout. The problem we had at the time was that there were dts optional links (like iommus that may be in the dts but may not have a kernel driver), so if there is no module to load, we don't want to defer probing forever. The cutoff point was originally set to late_initcall time with a timeout option to extend it further. However, with Android, modules often wouldn't get a chance to load until much after late_initcall time. But setting the timeout didn't actually help to extend the probe time, because it would return -ENODEV after late_initcall. - That's what my patch was trying to address. I also tried to move the default to 30 so we'd not need to set a boot parameter to get modules to load on a normal boot, but that caused some trouble w/ with NFS root systems and the fix there would mean that they'd have to wait 30 seconds before the rootfs would mount. > Only the power domain, IOMMU and MDIO subsystems currently opt-in to check > if the initcalls have completed with driver_deferred_probe_check_state(). > > Commit c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() > logic") then changed the driver_deferred_probe_check_state() helper logic, > to take into account whether modules have been enabled or not and also to > return -EPROBE_DEFER if the probe deferred timeout was still running. > > Then in commit e2cec7d68537 ("driver core: Set deferred_probe_timeout to a > longer default if CONFIG_MODULES is set"), the timeout was increased to 30 > seconds if modules are enabled. Because seems that some of the subsystems > that were opt-in to not return -EPROBE_DEFER after the initcall where done > could still have dependencies whose drivers were built as a module. > > This commit did a fundamental change to how probe deferral worked though, > since now the default was not to attempt probing for drivers indefinitely > but instead it would timeout after 30 seconds unless a different timeout > was set using the "deferred_probe_timeout" parameter. > > The behavior was changed even mere with commit ce68929f07de ("driver core: > Revert default driver_deferred_probe_timeout value to 0"), since the value > was set to 0 by default. Meaning that the probe deferral would be disabled > after the initcalls where done. Unless a timeout was set in the cmdline. > > Notice that the commit said that it was reverting the default value to 0, > but this was never 0. The default was -1 at the beginning and then changed > to 30 in a later commit. Hrm. So -1 sort of changed meaning after my initial change(c8c43cee29f6), because after that it did indeed change to being indefinite EPROBE_DEFER. For !modules systems, it was ok to use -1 before, because the check here would catch it: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/dd.c?id=ce68929f07de#n255 But for modular systems, probing forever would break systems that have optional links in their DTS with no matching drivers, as the dependent drivers will never load. So setting the default to 0 was done to match the earlier behavior (with probing ending at late_initcall time). > This default value of 0 was reverted again by commit f516d01b9df2 ("Revert > "driver core: Set default deferred_probe_timeout back to 0."") and set to > 10 seconds instead. Which was still less than the 30 seconds that was set > at some point to allow systems with drivers built as modules and loaded by > user-land to probe drivers that were previously deferred. > > The 10 seconds timeout isn't enough for the mentioned systems, for example > general purpose distributions attempt to build all the possible drivers as > a module to keep the Linux kernel image small. But that means that in very > likely that the probe deferral mechanism will timeout and drivers won't be > probed correctly. > > So let's change the default again to -1 as it was at the beginning. That's > how probe deferral always worked. In fact, it could even be that users can > load modules manually after the system has already booted so it is better > to not assume when it can be safe to just timeout instead of probe defer. So I worry setting it to -1 by default will cause regressions on systems with optional dts links. But maybe I'm still missing something? thanks -john
Hello John, Thanks for the detailed response. On 11/15/22 10:33, John Stultz wrote: > On Mon, Nov 14, 2022 at 2:43 AM Javier Martinez Canillas > <javierm@redhat.com> wrote: >> >> The driver_deferred_probe_timeout value has a long story. It was first set >> to -1 when it was introduced by commit 25b4e70dcce9 ("driver core: allow >> stopping deferred probe after init"), meaning that the driver core would >> defer the probe forever unless a subsystem would opt-in by checking if the >> initcalls where done using the driver_deferred_probe_check_state() helper, >> or if a timeout was explicitly set with a "deferred_probe_timeout" param. > > Indeed, it is a long history, I appreciate your archaeology here and > I'm sorry for my part in adding complexity to it. > > That said, I don't believe the above is quite right (but the logic was > quite confusing, so it's easy to get wrong - and it's late here so > forgive me if I muck up my own explanation). > Sorry if I didn't get all the details correct. I tried to be accurate on my explanation but as you said, the logic is complex. > As I mentioned in c8c43cee29f6 ("driver core: Fix > driver_deferred_probe_check_state() logic"): > " > driver_deferred_probe_check_state() has some uninituitive behavior. > > * From boot to late_initcall, it returns -EPROBE_DEFER > > * From late_initcall to the deferred_probe_timeout (if set) > it returns -ENODEV > > * If the deferred_probe_timeout it set, after it fires, it > returns -ETIMEDOUT > > This is a bit confusing, as its useful to have the function > return -EPROBE_DEFER while the timeout is still running. This > behavior has resulted in the somwhat duplicative > driver_deferred_probe_check_state_continue() function being > added. > " > > Looking at the code before this change, at late_initcall time, we'd > run deferred_probe_initcall(): > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/dd.c?id=e94f62b7140fa3da4c69a685b2e73ef52dd32c51#n321 > Which sets: initcalls_done = true; > > Then in __driver_deferred_probe_check_state(): > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/dd.c?id=e94f62b7140fa3da4c69a685b2e73ef52dd32c51#n238 > > We check initcalls_done, and after its set, we stop returning > -EPROBE_DEFER. If deferred_probe_timeout was not set, it would then > return 0. > > But then in driver_deferred_probe_check_state() > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/dd.c?id=e94f62b7140fa3da4c69a685b2e73ef52dd32c51#n262 > > When we see the 0 back from from __driver_deferred_probe_check_state() > we'll return -ENODEV. > Yes, agreed that commit c8c43cee29f6 was correct and that's why I didn't mention that one in my commit message. > So I don't think it's true that before the change the driver core > would defer the probe forever when there was no timeout. > I see. I thought that the driver_deferred_probe_check_state_continue() function would check for the return value and return -EPROBE_DEFER if if __driver_deferred_probe_check_state() returned 0. But probably there was code calling driver_deferred_probe_check_state() and not returning -EPROBE_DEFER after initcalls_done. In any case, what I tried to say is that originally the semantics of the deferred_probe_timeout= parameter was that it was disabled by default and that it had to be explicitly set. But at some point that default was switched and it became enabled by default unless is explicitly disabled. > The problem we had at the time was that there were dts optional links > (like iommus that may be in the dts but may not have a kernel driver), > so if there is no module to load, we don't want to defer probing > forever. The cutoff point was originally set to late_initcall time > with a timeout option to extend it further. However, with Android, > modules often wouldn't get a chance to load until much after > late_initcall time. > But do we need to tie the probe deferral to the optional links? Now both driver_deferred_probe_timeout = 0 and fw_devlink_drivers_done() are done in the deferred_probe_timeout_work_func() worker function handler. Could we untangle the two? That is, have a timeout to relax the links but still keep the probe deferral mechanism so that drivers with required dependencies have an opportunity to re-probe if deferred once the modules are loaded by user-land later in the boot? > But setting the timeout didn't actually help to extend the probe time, > because it would return -ENODEV after late_initcall. - That's what my > patch was trying to address. > Yes. I don't think that having a 0 by default makes sense. As mentioned in my commit message, if a system wants a 0 (no probe deferral after the initcalls are done) then should either disable modules (since basically makes module loading a no-op) or set deferred_probe_timeout=0 in cmdline. What I'm arguing is that a default of -1 (don't timeout probe deferral) is what should be the default. > I also tried to move the default to 30 so we'd not need to set a boot > parameter to get modules to load on a normal boot, but that caused > some trouble w/ with NFS root systems and the fix there would mean > that they'd have to wait 30 seconds before the rootfs would mount. > Right. I saw that in the git log history too. But I believe that's a corner case that in any case should be handled separately. For example, the probe deferral worker could be canceled if wait_for_init_devices_probe() is called? Something like the following (pseudo-code): diff --git a/drivers/base/core.c b/drivers/base/core.c index d02501933467..d2cc04fef1f5 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -1737,6 +1737,12 @@ void __init wait_for_init_devices_probe(void) if (!fw_devlink_flags || fw_devlink_is_permissive()) return; + /* + * Cancel deferred probe to allow drivers with optional suppliers to + * be probed, even if those optional dependencies are still missing. + */ + deferred_probe_cancel_timeout(); + /* * Wait for all ongoing probes to finish so that the "best effort" is * only applied to devices that can't probe otherwise. >> Only the power domain, IOMMU and MDIO subsystems currently opt-in to check >> if the initcalls have completed with driver_deferred_probe_check_state(). >> >> Commit c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() >> logic") then changed the driver_deferred_probe_check_state() helper logic, >> to take into account whether modules have been enabled or not and also to >> return -EPROBE_DEFER if the probe deferred timeout was still running. >> >> Then in commit e2cec7d68537 ("driver core: Set deferred_probe_timeout to a >> longer default if CONFIG_MODULES is set"), the timeout was increased to 30 >> seconds if modules are enabled. Because seems that some of the subsystems >> that were opt-in to not return -EPROBE_DEFER after the initcall where done >> could still have dependencies whose drivers were built as a module. >> >> This commit did a fundamental change to how probe deferral worked though, >> since now the default was not to attempt probing for drivers indefinitely >> but instead it would timeout after 30 seconds unless a different timeout >> was set using the "deferred_probe_timeout" parameter. >> >> The behavior was changed even mere with commit ce68929f07de ("driver core: >> Revert default driver_deferred_probe_timeout value to 0"), since the value >> was set to 0 by default. Meaning that the probe deferral would be disabled >> after the initcalls where done. Unless a timeout was set in the cmdline. >> >> Notice that the commit said that it was reverting the default value to 0, >> but this was never 0. The default was -1 at the beginning and then changed >> to 30 in a later commit. > > Hrm. So -1 sort of changed meaning after my initial > change(c8c43cee29f6), because after that it did indeed change to being > indefinite EPROBE_DEFER. Yes, but that was also the case for the original commit from Rob (25b4e70dcce9) since driver_deferred_probe_check_state() was just: int driver_deferred_probe_check_state(struct device *dev) { if (initcalls_done) { if (!deferred_probe_timeout) { dev_WARN(dev, "deferred probe timeout, ignoring dependency"); return -ETIMEDOUT; } dev_warn(dev, "ignoring dependency for device, assuming no driver"); return -ENODEV; } return -EPROBE_DEFER; } So -1 being indefinite -EPROBE_DEFER was the original semantic and the default. > For !modules systems, it was ok to use -1 before, because the check > here would catch it: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/dd.c?id=ce68929f07de#n255 > > But for modular systems, probing forever would break systems that have > optional links in their DTS with no matching drivers, as the dependent > drivers will never load. > And I think that is what should had been addressed. That is, relax the link so that optional suppliers would not cause a probe deferral. But not disable the whole probe deferral mechanism only for this case. > So setting the default to 0 was done to match the earlier behavior > (with probing ending at late_initcall time). > Yes, but that earlier behavior wasn't the original. That's what I tried to say in the commit message. >> This default value of 0 was reverted again by commit f516d01b9df2 ("Revert >> "driver core: Set default deferred_probe_timeout back to 0."") and set to >> 10 seconds instead. Which was still less than the 30 seconds that was set >> at some point to allow systems with drivers built as modules and loaded by >> user-land to probe drivers that were previously deferred. >> >> The 10 seconds timeout isn't enough for the mentioned systems, for example >> general purpose distributions attempt to build all the possible drivers as >> a module to keep the Linux kernel image small. But that means that in very >> likely that the probe deferral mechanism will timeout and drivers won't be >> probed correctly. >> >> So let's change the default again to -1 as it was at the beginning. That's >> how probe deferral always worked. In fact, it could even be that users can >> load modules manually after the system has already booted so it is better >> to not assume when it can be safe to just timeout instead of probe defer. > > So I worry setting it to -1 by default will cause regressions on > systems with optional dts links. > But maybe I'm still missing something? > It may be. I've to admit that I'm not that familiar with the device links logic and may be missing some details. But as mentioned I believe that if that's a problem, we should attempt to relax the links and allow drivers with optional suppliers to be probed, while still allow drivers with links that are required to be deferred in case that a driver is provided later.
On Tue, Nov 15, 2022 at 2:53 AM Javier Martinez Canillas <javierm@redhat.com> wrote: > On 11/15/22 10:33, John Stultz wrote: > > So I don't think it's true that before the change the driver core > > would defer the probe forever when there was no timeout. > > > > I see. I thought that the driver_deferred_probe_check_state_continue() > function would check for the return value and return -EPROBE_DEFER if > if __driver_deferred_probe_check_state() returned 0. Yes, but that was only used by the pinctrl driver, I think mostly as a result of the confusing semantics of the driver_deferred_probe_check_state() behavior. After the semantics were improved (at least in my view - again apologies for any trouble I've caused you looking over all this :) it was switched it back. But yes, in that case, it would absorb the results and return -EPROBE_DEFER indefinitely if modules were enabled. > But probably there was code calling driver_deferred_probe_check_state() > and not returning -EPROBE_DEFER after initcalls_done. I'm not sure I recall any other than the pinctrl case. > In any case, what I tried to say is that originally the semantics of the > deferred_probe_timeout= parameter was that it was disabled by default > and that it had to be explicitly set. But at some point that default was > switched and it became enabled by default unless is explicitly disabled. True, I tried to set it to 30, then to zero and Saravana has set it to 10. But I believe that the timeout of zero was functionally identical to that of -1 behavior before my patches. The only difference is before w/ -1 you'd get -ENODEV at late_initcall, whereas with my changes (after ce68929f07de) and the default value set to 0, you get -ETIMEDOUT after late_initcall. To my understanding, in neither case did you ever get -EPROBE_DEFER after late_initcall. > > The problem we had at the time was that there were dts optional links > > (like iommus that may be in the dts but may not have a kernel driver), > > so if there is no module to load, we don't want to defer probing > > forever. The cutoff point was originally set to late_initcall time > > with a timeout option to extend it further. However, with Android, > > modules often wouldn't get a chance to load until much after > > late_initcall time. > > > > But do we need to tie the probe deferral to the optional links? Now both > driver_deferred_probe_timeout = 0 and fw_devlink_drivers_done() are done > in the deferred_probe_timeout_work_func() worker function handler. > > Could we untangle the two? That is, have a timeout to relax the links but > still keep the probe deferral mechanism so that drivers with required > dependencies have an opportunity to re-probe if deferred once the modules > are loaded by user-land later in the boot? > Potentially? I'd defer to Saravana, as the fwdevlink effort was in a large part resolving the dependency links in a sane way, which allowed for not having to set the deferred_probe_timeout boot option in order to get modules to load on devices. > > But setting the timeout didn't actually help to extend the probe time, > > because it would return -ENODEV after late_initcall. - That's what my > > patch was trying to address. > > > > Yes. I don't think that having a 0 by default makes sense. As mentioned > in my commit message, if a system wants a 0 (no probe deferral after the > initcalls are done) then should either disable modules (since basically > makes module loading a no-op) or set deferred_probe_timeout=0 in cmdline. > > What I'm arguing is that a default of -1 (don't timeout probe deferral) > is what should be the default. I don't think that's really an option. I'm not a fan of the deferred probe timeout, but it resolved the optional links and at least my efforts to change to a longer timeout value weren't able to work without causing regressions to someone. And until the fwdevlink stuff landed, some devices I dealt with just required booting with the deferred_probe_timeout=30 boot option set. I'm pretty sure that defaulting to indefinite probe deferral will cause regressions. > > I also tried to move the default to 30 so we'd not need to set a boot > > parameter to get modules to load on a normal boot, but that caused > > some trouble w/ with NFS root systems and the fix there would mean > > that they'd have to wait 30 seconds before the rootfs would mount. > > > > Right. I saw that in the git log history too. But I believe that's a corner > case that in any case should be handled separately. For example, the probe > deferral worker could be canceled if wait_for_init_devices_probe() is called? > > Something like the following (pseudo-code): > > diff --git a/drivers/base/core.c b/drivers/base/core.c > index d02501933467..d2cc04fef1f5 100644 > --- a/drivers/base/core.c > +++ b/drivers/base/core.c > @@ -1737,6 +1737,12 @@ void __init wait_for_init_devices_probe(void) > if (!fw_devlink_flags || fw_devlink_is_permissive()) > return; > > + /* > + * Cancel deferred probe to allow drivers with optional suppliers to > + * be probed, even if those optional dependencies are still missing. > + */ > + deferred_probe_cancel_timeout(); > + Potentially? I'd defer to Saravana as I've not kept up with recent changes. Though it seems sort of similar to the initcalls_done semantics? One of the foggy ideas I had was that it seemed like instead of a boot timer, we sort of really wanted a per-driver-probe timer, so that after init, if a module load happens, there's a new timer that gives a 30 seconds or so window for any dependent drivers to be loaded before continuing on. But I hadn't time to fully think it out. > > Hrm. So -1 sort of changed meaning after my initial > > change(c8c43cee29f6), because after that it did indeed change to being > > indefinite EPROBE_DEFER. > > Yes, but that was also the case for the original commit from Rob (25b4e70dcce9) > since driver_deferred_probe_check_state() was just: > > int driver_deferred_probe_check_state(struct device *dev) > { > if (initcalls_done) { > if (!deferred_probe_timeout) { > dev_WARN(dev, "deferred probe timeout, ignoring dependency"); > return -ETIMEDOUT; > } > dev_warn(dev, "ignoring dependency for device, assuming no driver"); > return -ENODEV; > } > return -EPROBE_DEFER; > } > > So -1 being indefinite -EPROBE_DEFER was the original semantic and the default. I think you're misreading this? In the code above, as soon as initcalls_done is set (at late_initcall), it returns -ENODEV. So even with deferred_probe_timeout = -1, after late_initcall, it will return -ENODEV and probing will stop. I sadly can't respond to all your points, but hopefully the above helps clarify some details. I do agree the deferred_probe_timeout and optional links logic is unfortunate, and by providing a working solution for optional links, it creates hardships for late module loading on other systems. But once it landed, fixing the latter would break for the former, (and sometimes fixing the latter would break other use cases) and we were sort of stuck with having to require devices that do later module loading to specify the timeout boot option. I think Saravana's immense fwdevlink efforts have improved things greatly, but I'm glad others are looking for better solutions as well, so I look forward to your proposals (hopefully I can read over your patches from last night soon). thanks -john
On 11/16/22 21:49, John Stultz wrote: > On Tue, Nov 15, 2022 at 2:53 AM Javier Martinez Canillas > <javierm@redhat.com> wrote: >> On 11/15/22 10:33, John Stultz wrote: >>> So I don't think it's true that before the change the driver core >>> would defer the probe forever when there was no timeout. >>> >> >> I see. I thought that the driver_deferred_probe_check_state_continue() >> function would check for the return value and return -EPROBE_DEFER if >> if __driver_deferred_probe_check_state() returned 0. > > Yes, but that was only used by the pinctrl driver, I think mostly as a > result of the confusing semantics of the > driver_deferred_probe_check_state() behavior. > After the semantics were improved (at least in my view - again > apologies for any trouble I've caused you looking over all this :) it > was switched it back. > But yes, in that case, it would absorb the results and return > -EPROBE_DEFER indefinitely if modules were enabled. > >> But probably there was code calling driver_deferred_probe_check_state() >> and not returning -EPROBE_DEFER after initcalls_done. > > I'm not sure I recall any other than the pinctrl case. > > I don't see pinctrl using it anymore, currently these are the callers: $ git grep driver_deferred_probe_check_state | grep -v "drivers/base/dd.c" drivers/base/power/domain.c: return driver_deferred_probe_check_state(base_dev); drivers/iommu/of_iommu.c: return driver_deferred_probe_check_state(dev); drivers/net/mdio/fwnode_mdio.c: rc = driver_deferred_probe_check_state(&phy->mdio.dev); include/linux/device/driver.h:int driver_deferred_probe_check_state(struct device *dev) ah, pinctrl and PM usage was deleted by commits: 24a026f85241 ("pinctrl: devicetree Delete usage of driver_deferred_probe_check_state()") 5a46079a9645 ("PM: domains: Delete usage of driver_deferred_probe_check_state()") >> In any case, what I tried to say is that originally the semantics of the >> deferred_probe_timeout= parameter was that it was disabled by default >> and that it had to be explicitly set. But at some point that default was >> switched and it became enabled by default unless is explicitly disabled. > > True, I tried to set it to 30, then to zero and Saravana has set it to 10. > But I believe that the timeout of zero was functionally identical to > that of -1 behavior before my patches. You mean before c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() logic") right? > The only difference is before w/ -1 you'd get -ENODEV at > late_initcall, whereas with my changes (after ce68929f07de) and the > default value set to 0, you get -ETIMEDOUT after late_initcall. > > To my understanding, in neither case did you ever get -EPROBE_DEFER > after late_initcall. > Yes, you are correct. I got confused by the current logic after your c8c43cee29f6 commit and the original logic when the helper was added. > >>> The problem we had at the time was that there were dts optional links >>> (like iommus that may be in the dts but may not have a kernel driver), >>> so if there is no module to load, we don't want to defer probing >>> forever. The cutoff point was originally set to late_initcall time >>> with a timeout option to extend it further. However, with Android, >>> modules often wouldn't get a chance to load until much after >>> late_initcall time. >>> >> >> But do we need to tie the probe deferral to the optional links? Now both >> driver_deferred_probe_timeout = 0 and fw_devlink_drivers_done() are done >> in the deferred_probe_timeout_work_func() worker function handler. >> >> Could we untangle the two? That is, have a timeout to relax the links but >> still keep the probe deferral mechanism so that drivers with required >> dependencies have an opportunity to re-probe if deferred once the modules >> are loaded by user-land later in the boot? >> > > Potentially? I'd defer to Saravana, as the fwdevlink effort was in a > large part resolving the dependency links in a sane way, which allowed > for not having to set the deferred_probe_timeout boot option in order > to get modules to load on devices. > Yes, but then that's only before late_initcall() because after that the links are relaxed to allow drivers to probe with optional links. But it also means that drivers probe are not deferred anymore. That's why I think that we should untangle the two. >>> But setting the timeout didn't actually help to extend the probe time, >>> because it would return -ENODEV after late_initcall. - That's what my >>> patch was trying to address. >>> >> >> Yes. I don't think that having a 0 by default makes sense. As mentioned >> in my commit message, if a system wants a 0 (no probe deferral after the >> initcalls are done) then should either disable modules (since basically >> makes module loading a no-op) or set deferred_probe_timeout=0 in cmdline. >> >> What I'm arguing is that a default of -1 (don't timeout probe deferral) >> is what should be the default. > > I don't think that's really an option. I'm not a fan of the deferred > probe timeout, but it resolved the optional links and at least my > efforts to change to a longer timeout value weren't able to work > without causing regressions to someone. And until the fwdevlink stuff > landed, some devices I dealt with just required booting with the > deferred_probe_timeout=30 boot option set. Yeah, but that was before the work from Saravana on fwdevlinks so maybe now we can make it work. AFAIK is on PTO until the 28th so I'll wait for feedback on the latest v2 patch-series once is back. > I'm pretty sure that defaulting to indefinite probe deferral will > cause regressions. > It may well be but I still think that we could make it work and avoid the probe deferral timeout. > >>> I also tried to move the default to 30 so we'd not need to set a boot >>> parameter to get modules to load on a normal boot, but that caused >>> some trouble w/ with NFS root systems and the fix there would mean >>> that they'd have to wait 30 seconds before the rootfs would mount. >>> >> >> Right. I saw that in the git log history too. But I believe that's a corner >> case that in any case should be handled separately. For example, the probe >> deferral worker could be canceled if wait_for_init_devices_probe() is called? >> >> Something like the following (pseudo-code): >> >> diff --git a/drivers/base/core.c b/drivers/base/core.c >> index d02501933467..d2cc04fef1f5 100644 >> --- a/drivers/base/core.c >> +++ b/drivers/base/core.c >> @@ -1737,6 +1737,12 @@ void __init wait_for_init_devices_probe(void) >> if (!fw_devlink_flags || fw_devlink_is_permissive()) >> return; >> >> + /* >> + * Cancel deferred probe to allow drivers with optional suppliers to >> + * be probed, even if those optional dependencies are still missing. >> + */ >> + deferred_probe_cancel_timeout(); >> + > > > Potentially? I'd defer to Saravana as I've not kept up with recent > changes. Though it seems sort of similar to the initcalls_done > semantics? > Indeed. It is redundant since ip_auto_config() that ends calling this function is also a late_initcall function. That's why I didn't include this in my v2. > One of the foggy ideas I had was that it seemed like instead of a boot > timer, we sort of really wanted a per-driver-probe timer, so that > after init, if a module load happens, there's a new timer that gives a > 30 seconds or so window for any dependent drivers to be loaded before > continuing on. But I hadn't time to fully think it out. That may work but I still have the gut feeling that we are making it more complicated than necessary. The probe deferral mechanism albeit inefficient it was highly effective on allowing drivers to re-probe until the deps were available. I always thought that was a simple and elegant design. > > >>> Hrm. So -1 sort of changed meaning after my initial >>> change(c8c43cee29f6), because after that it did indeed change to being >>> indefinite EPROBE_DEFER. >> >> Yes, but that was also the case for the original commit from Rob (25b4e70dcce9) >> since driver_deferred_probe_check_state() was just: >> >> int driver_deferred_probe_check_state(struct device *dev) >> { >> if (initcalls_done) { >> if (!deferred_probe_timeout) { >> dev_WARN(dev, "deferred probe timeout, ignoring dependency"); >> return -ETIMEDOUT; >> } >> dev_warn(dev, "ignoring dependency for device, assuming no driver"); >> return -ENODEV; >> } >> return -EPROBE_DEFER; >> } >> >> So -1 being indefinite -EPROBE_DEFER was the original semantic and the default. > > I think you're misreading this? In the code above, as soon as > initcalls_done is set (at late_initcall), it returns -ENODEV. > So even with deferred_probe_timeout = -1, after late_initcall, it will > return -ENODEV and probing will stop. > I did indeed misread, sorry about that. As mentioned I conflated the logic after your fix by c8c43cee29f6 with the original logic. I will update the commit message if after review Saravana considers that is worth to keep pursuing this. > I sadly can't respond to all your points, but hopefully the above > helps clarify some details. I do agree the deferred_probe_timeout and > optional links logic is unfortunate, and by providing a working > solution for optional links, it creates hardships for late module > loading on other systems. But once it landed, fixing the latter would > break for the former, (and sometimes fixing the latter would break > other use cases) and we were sort of stuck with having to require > devices that do later module loading to specify the timeout boot > option. I think Saravana's immense fwdevlink efforts have improved > things greatly, but I'm glad others are looking for better solutions > as well, so I look forward to your proposals (hopefully I can read > over your patches from last night soon). > Thanks a lot for your time and the detailed responses. In the meantime I'll just built-in the clock drivers in the fedora kernel to allow the arm-smmu to probe before late_initcall, but consider that a workaround. My motivation was to fix it properly in the device core because I know that will be an issue down the road for other platforms as well.
diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 3dda62503102..7592f5ce35c8 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -256,11 +256,7 @@ static int deferred_devs_show(struct seq_file *s, void *data) } DEFINE_SHOW_ATTRIBUTE(deferred_devs); -#ifdef CONFIG_MODULES -int driver_deferred_probe_timeout = 10; -#else -int driver_deferred_probe_timeout; -#endif +int driver_deferred_probe_timeout = -1; EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);