[v2,6/6] PCI: hv: Use async probing to reduce boot time

Message ID 20230404020545.32359-7-decui@microsoft.com
State New
Series pci-hyper: Fix race condition bugs for fast device hotplug

Commit Message

Dexuan Cui April 4, 2023, 2:05 a.m. UTC
Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added
pci_lock_rescan_remove() and pci_unlock_rescan_remove() in
create_root_hv_pci_bus() and in hv_eject_device_work() to address the
race between create_root_hv_pci_bus() and hv_eject_device_work(), but it
turns that grubing the pci_rescan_remove_lock mutex is not enough:
refer to the earlier fix "PCI: hv: Add a per-bus mutex state_lock".

Now with hbus->state_lock and other fixes, the race is resolved, so
remove pci_{lock,unlock}_rescan_remove() in create_root_hv_pci_bus():
this removes the serialization in hv_pci_probe() and hence allows
async-probing (PROBE_PREFER_ASYNCHRONOUS) to work.

Add the async-probing flag to hv_pci_drv.

pci_{lock,unlock}_rescan_remove() in hv_eject_device_work() and in
hv_pci_remove() are still kept: according to the comment before
drivers/pci/probe.c: static DEFINE_MUTEX(pci_rescan_remove_lock),
"PCI device removal routines should always be executed under this mutex".

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: stable@vger.kernel.org
---

v2:
  No change to the patch body.
  Improved the commit message [Michael Kelley]
  Added Cc:stable

 drivers/pci/controller/pci-hyperv.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
  

Comments

Michael Kelley (LINUX) April 7, 2023, 4:11 p.m. UTC | #1
From: Dexuan Cui <decui@microsoft.com> Sent: Monday, April 3, 2023 7:06 PM
> 
> Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added
> pci_lock_rescan_remove() and pci_unlock_rescan_remove() in
> create_root_hv_pci_bus() and in hv_eject_device_work() to address the
> race between create_root_hv_pci_bus() and hv_eject_device_work(), but it
> turns that grubing the pci_rescan_remove_lock mutex is not enough:
> refer to the earlier fix "PCI: hv: Add a per-bus mutex state_lock".
> 
> Now with hbus->state_lock and other fixes, the race is resolved, so
> remove pci_{lock,unlock}_rescan_remove() in create_root_hv_pci_bus():
> this removes the serialization in hv_pci_probe() and hence allows
> async-probing (PROBE_PREFER_ASYNCHRONOUS) to work.
> 
> Add the async-probing flag to hv_pci_drv.
> 
> pci_{lock,unlock}_rescan_remove() in hv_eject_device_work() and in
> hv_pci_remove() are still kept: according to the comment before
> drivers/pci/probe.c: static DEFINE_MUTEX(pci_rescan_remove_lock),
> "PCI device removal routines should always be executed under this mutex".
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> Cc: stable@vger.kernel.org
> ---
> 
> v2:
>   No change to the patch body.
>   Improved the commit message [Michael Kelley]
>   Added Cc:stable
> 
>  drivers/pci/controller/pci-hyperv.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 3ae2f99dea8c2..2ea2b1b8a4c9a 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -2312,12 +2312,16 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus)
>  	if (error)
>  		return error;
> 
> -	pci_lock_rescan_remove();
> +	/*
> +	 * pci_lock_rescan_remove() and pci_unlock_rescan_remove() are
> +	 * unnecessary here, because we hold the hbus->state_lock, meaning
> +	 * hv_eject_device_work() and pci_devices_present_work() can't race
> +	 * with create_root_hv_pci_bus().
> +	 */
>  	hv_pci_assign_numa_node(hbus);
>  	pci_bus_assign_resources(bridge->bus);
>  	hv_pci_assign_slots(hbus);
>  	pci_bus_add_devices(bridge->bus);
> -	pci_unlock_rescan_remove();
>  	hbus->state = hv_pcibus_installed;
>  	return 0;
>  }
> @@ -4003,6 +4007,9 @@ static struct hv_driver hv_pci_drv = {
>  	.remove		= hv_pci_remove,
>  	.suspend	= hv_pci_suspend,
>  	.resume		= hv_pci_resume,
> +	.driver = {
> +		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
> +	},
>  };
> 
>  static void __exit exit_hv_pci_drv(void)
> --
> 2.25.1

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
  
Michael Kelley (LINUX) April 7, 2023, 4:14 p.m. UTC | #2
From: Michael Kelley (LINUX) <mikelley@microsoft.com> Sent: Friday, April 7, 2023 9:12 AM
> 
> From: Dexuan Cui <decui@microsoft.com> Sent: Monday, April 3, 2023 7:06 PM
> >
> > Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added
> > pci_lock_rescan_remove() and pci_unlock_rescan_remove() in
> > create_root_hv_pci_bus() and in hv_eject_device_work() to address the
> > race between create_root_hv_pci_bus() and hv_eject_device_work(), but it
> > turns that grubing the pci_rescan_remove_lock mutex is not enough:

There's some kind of spelling error or typo above.  Should "grubing" be
"grabbing"?  Or did you intend something else?

Michael


> > refer to the earlier fix "PCI: hv: Add a per-bus mutex state_lock".
> >
> > Now with hbus->state_lock and other fixes, the race is resolved, so
> > remove pci_{lock,unlock}_rescan_remove() in create_root_hv_pci_bus():
> > this removes the serialization in hv_pci_probe() and hence allows
> > async-probing (PROBE_PREFER_ASYNCHRONOUS) to work.
> >
> > Add the async-probing flag to hv_pci_drv.
> >
> > pci_{lock,unlock}_rescan_remove() in hv_eject_device_work() and in
> > hv_pci_remove() are still kept: according to the comment before
> > drivers/pci/probe.c: static DEFINE_MUTEX(pci_rescan_remove_lock),
> > "PCI device removal routines should always be executed under this mutex".
> >
> > Signed-off-by: Dexuan Cui <decui@microsoft.com>
> > Cc: stable@vger.kernel.org
  
Dexuan Cui April 8, 2023, 12:23 a.m. UTC | #3
> From: Michael Kelley (LINUX) <mikelley@microsoft.com>
> Sent: Friday, April 7, 2023 9:15 AM
> ...
> > > Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added
> > > pci_lock_rescan_remove() and pci_unlock_rescan_remove() in
> > > create_root_hv_pci_bus() and in hv_eject_device_work() to address the
> > > race between create_root_hv_pci_bus() and hv_eject_device_work(), but
> > > it turns that grubing the pci_rescan_remove_lock mutex is not enough:
> 
> There's some kind of spelling error or typo above.  Should "grubing" be
> "grabbing"?  Or did you intend something else?
> 
> Michael

Sorry, it's a typo. The "grubing" should be "grabbing".
I suppose the PCI maintainers can help fix this. Let me know if v3 is needed.
  
Long Li April 11, 2023, 5:31 p.m. UTC | #4
> Subject: RE: [PATCH v2 6/6] PCI: hv: Use async probing to reduce boot time
> 
> > From: Michael Kelley (LINUX) <mikelley@microsoft.com>
> > Sent: Friday, April 7, 2023 9:15 AM
> > ...
> > > > Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject")
> > > > added
> > > > pci_lock_rescan_remove() and pci_unlock_rescan_remove() in
> > > > create_root_hv_pci_bus() and in hv_eject_device_work() to address
> > > > the race between create_root_hv_pci_bus() and
> > > > hv_eject_device_work(), but it turns that grubing the
> > > > pci_rescan_remove_lock mutex is not enough:
> >
> > There's some kind of spelling error or typo above.  Should "grubing"
> > be "grabbing"?  Or did you intend something else?
> >
> > Michael
> 
> Sorry, it's a typo. The "grubing" should be "grabbing".
> I suppose the PCI maintainers can help fix this. Let me know if v3 is needed.

Other than the typo,

Reviewed-by: Long Li <longli@microsoft.com>
  

Patch

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 3ae2f99dea8c2..2ea2b1b8a4c9a 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2312,12 +2312,16 @@  static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus)
 	if (error)
 		return error;
 
-	pci_lock_rescan_remove();
+	/*
+	 * pci_lock_rescan_remove() and pci_unlock_rescan_remove() are
+	 * unnecessary here, because we hold the hbus->state_lock, meaning
+	 * hv_eject_device_work() and pci_devices_present_work() can't race
+	 * with create_root_hv_pci_bus().
+	 */
 	hv_pci_assign_numa_node(hbus);
 	pci_bus_assign_resources(bridge->bus);
 	hv_pci_assign_slots(hbus);
 	pci_bus_add_devices(bridge->bus);
-	pci_unlock_rescan_remove();
 	hbus->state = hv_pcibus_installed;
 	return 0;
 }
@@ -4003,6 +4007,9 @@  static struct hv_driver hv_pci_drv = {
 	.remove		= hv_pci_remove,
 	.suspend	= hv_pci_suspend,
 	.resume		= hv_pci_resume,
+	.driver = {
+		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
+	},
 };
 
 static void __exit exit_hv_pci_drv(void)
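
Illustrative note: the locking argument in the commit message can be modeled in user space. The sketch below is not kernel code and is not taken from pci-hyperv.c; every name in it (model_bus, model_probe, model_eject, model_probe_all, MODEL_MAX_BUSES) is hypothetical, and pthreads stand in for the kernel's async probe workers. It assumes only what the commit message states: each bus has its own state_lock, the probe path takes only that per-bus lock, and the eject path takes the same lock before touching the bus. With no shared global mutex in the probe path, probes of different buses no longer serialize against each other.

```c
/* User-space model only: none of these names are kernel APIs. */
#include <assert.h>
#include <pthread.h>

#define MODEL_MAX_BUSES 8

enum model_bus_state { MODEL_BUS_INIT, MODEL_BUS_INSTALLED };

struct model_bus {
	pthread_mutex_t state_lock;	/* stands in for hbus->state_lock */
	enum model_bus_state state;
	int devices;			/* devices currently present on the bus */
};

void model_bus_init(struct model_bus *bus)
{
	int rc = pthread_mutex_init(&bus->state_lock, NULL);

	assert(rc == 0);
	(void)rc;
	bus->state = MODEL_BUS_INIT;
	bus->devices = 0;
}

/*
 * Probe path: only this bus's lock is taken, so probes of other
 * buses can run at the same time.
 */
void *model_probe(void *arg)
{
	struct model_bus *bus = arg;

	pthread_mutex_lock(&bus->state_lock);
	bus->devices = 1;
	bus->state = MODEL_BUS_INSTALLED;
	pthread_mutex_unlock(&bus->state_lock);
	return NULL;
}

/*
 * Eject path: takes the same per-bus lock, so it cannot interleave
 * with a probe of this bus. Returns 0 if a device was removed, -1 if
 * the bus is not installed or has no device left.
 */
int model_eject(struct model_bus *bus)
{
	int ret = -1;

	pthread_mutex_lock(&bus->state_lock);
	if (bus->state == MODEL_BUS_INSTALLED && bus->devices > 0) {
		bus->devices--;
		ret = 0;
	}
	pthread_mutex_unlock(&bus->state_lock);
	return ret;
}

/*
 * Probe up to MODEL_MAX_BUSES buses, one thread per bus. Returns the
 * number of buses that ended up installed, or -1 on a bad count.
 */
int model_probe_all(struct model_bus *buses, int nbuses)
{
	pthread_t threads[MODEL_MAX_BUSES];
	int i, started = 0, installed = 0;

	if (nbuses < 0 || nbuses > MODEL_MAX_BUSES)
		return -1;

	for (i = 0; i < nbuses; i++) {
		model_bus_init(&buses[i]);
		if (pthread_create(&threads[i], NULL, model_probe, &buses[i]) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	for (i = 0; i < started; i++)
		if (buses[i].state == MODEL_BUS_INSTALLED)
			installed++;
	return installed;
}
```

A caller would declare an array of struct model_bus, pass it to model_probe_all(), and then call model_eject() on individual buses. The model does not cover the removal side, where the patch deliberately keeps pci_lock_rescan_remove() in hv_eject_device_work() and hv_pci_remove().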