drivers: base: Introduce a new kernel parameter driver_sync_probe=

Message ID 20231206115355.4319-1-laoar.shao@gmail.com
State New
Headers
Series drivers: base: Introduce a new kernel parameter driver_sync_probe= |

Commit Message

Yafang Shao Dec. 6, 2023, 11:53 a.m. UTC
  After upgrading our kernel from version 4.19 to 6.1, certain regressions
occurred due to the driver's asynchronous probe behavior. Specifically,
the SCSI driver transitioned to an asynchronous probe by default, resulting
in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
was consistently identified as /dev/sda. However, with kernel 6.1, the root
disk can be any of /dev/sdX, leading to issues for applications reliant on
/dev/sda, notably impacting monitoring systems monitoring the root disk.

To address this, a new kernel parameter 'driver_sync_probe=' is introduced
to enforce synchronous probe behavior for specific drivers.

For instance, using the following kernel parameter:

  driver_sync_probe=sd,nvme

The sd (SCSI) and nvme disks will undergo synchronous probing. This ensures
that these disks maintain consistent identification behavior despite the
default asynchronous probe, mitigating the issues experienced by
applications reliant on fixed disk identification.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 .../admin-guide/kernel-parameters.txt         | 10 +++++
 drivers/base/dd.c                             | 41 ++++++++++++++-----
 2 files changed, 41 insertions(+), 10 deletions(-)
  

Comments

Greg KH Dec. 6, 2023, 1:31 p.m. UTC | #1
On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> After upgrading our kernel from version 4.19 to 6.1, certain regressions
> occurred due to the driver's asynchronous probe behavior. Specifically,
> the SCSI driver transitioned to an asynchronous probe by default, resulting
> in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> was consistently identified as /dev/sda. However, with kernel 6.1, the root
> disk can be any of /dev/sdX, leading to issues for applications reliant on
> /dev/sda, notably impacting monitoring systems monitoring the root disk.

Device names are never guaranteed to be stable, ALWAYS use a persistant
names like a filesystem label or other ways.  Look at /dev/disk/ for the
needed ways to do this properly.

> To address this, a new kernel parameter 'driver_sync_probe=' is introduced
> to enforce synchronous probe behavior for specific drivers.

This should be a per-bus thing, not a driver-specific thing as drivers
for the same bus could have differing settings here which would cause a
mess.

Please just revert the scsi bus functionality if you have had
regressions here, it's not a driver-core thing to do.

thanks,

greg k-h
  
Yafang Shao Dec. 6, 2023, 2:08 p.m. UTC | #2
On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > occurred due to the driver's asynchronous probe behavior. Specifically,
> > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > /dev/sda, notably impacting monitoring systems monitoring the root disk.
>
> Device names are never guaranteed to be stable, ALWAYS use a persistant
> names like a filesystem label or other ways.  Look at /dev/disk/ for the
> needed ways to do this properly.

The root disk is typically identified as /dev/sda or /dev/vda, right?
This is because the root disk, which houses the operating system,
cannot be removed or hotplugged. Therefore, it usually remains as the
first disk in the system. With the synchronous probe, the root disk
maintains a stable and consistent identification.

>
> > To address this, a new kernel parameter 'driver_sync_probe=' is introduced
> > to enforce synchronous probe behavior for specific drivers.
>
> This should be a per-bus thing, not a driver-specific thing as drivers
> for the same bus could have differing settings here which would cause a
> mess.
>
> Please just revert the scsi bus functionality if you have had
> regressions here, it's not a driver-core thing to do.

Are you suggesting a reversal of the asynchronous probe code in the
SCSI driver? While reverting to synchronous probing could ensure
stability, it's worth noting that asynchronous probing can potentially
shorten the reboot duration under specific conditions. Thus, there
might be some resistance to reverting this change as it offers
performance benefits in certain scenarios. That's why I prefer to
introduce a kernel parameter for it.
  
Greg KH Dec. 7, 2023, 10:19 a.m. UTC | #3
On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> >
> > Device names are never guaranteed to be stable, ALWAYS use a persistant
> > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > needed ways to do this properly.
> 
> The root disk is typically identified as /dev/sda or /dev/vda, right?

Depends on your system.  It can also be identified, in the proper way,
as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
(note, fake uuid, use your own disk uuid please.)

Why not do that?  That's the most stable and recommended way of doing
things.

> This is because the root disk, which houses the operating system,
> cannot be removed or hotplugged.

Not true at all, happens for many systems (think about how systems that
run their whole OS out of ram work...)

> Therefore, it usually remains as the
> first disk in the system. With the synchronous probe, the root disk
> maintains a stable and consistent identification.
> 
> >
> > > To address this, a new kernel parameter 'driver_sync_probe=' is introduced
> > > to enforce synchronous probe behavior for specific drivers.
> >
> > This should be a per-bus thing, not a driver-specific thing as drivers
> > for the same bus could have differing settings here which would cause a
> > mess.
> >
> > Please just revert the scsi bus functionality if you have had
> > regressions here, it's not a driver-core thing to do.
> 
> Are you suggesting a reversal of the asynchronous probe code in the
> SCSI driver?

For your broken scsi driver, yes.

> While reverting to synchronous probing could ensure
> stability, it's worth noting that asynchronous probing can potentially
> shorten the reboot duration under specific conditions. Thus, there
> might be some resistance to reverting this change as it offers
> performance benefits in certain scenarios. That's why I prefer to
> introduce a kernel parameter for it.

I don't want to add a new parameter that we need to support for forever
and add to the complexity of the system unless it is REALLY needed.
Please work with the scsi developers to resolve the issue for your
hardware, as it's been working for everyone else for well over a year
now, right?

thanks,

greg k-h
  
Yafang Shao Dec. 7, 2023, 11:59 a.m. UTC | #4
On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > >
> > > Device names are never guaranteed to be stable, ALWAYS use a persistant
> > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > needed ways to do this properly.
> >
> > The root disk is typically identified as /dev/sda or /dev/vda, right?
>
> Depends on your system.  It can also be identified, in the proper way,
> as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> (note, fake uuid, use your own disk uuid please.)
>
> Why not do that?  That's the most stable and recommended way of doing
> things.

Adapting to this change isn't straightforward, especially for a large
fleet of servers. Our monitoring system needs to accommodate and
adjust accordingly.

>
> > This is because the root disk, which houses the operating system,
> > cannot be removed or hotplugged.
>
> Not true at all, happens for many systems (think about how systems that
> run their whole OS out of ram work...)
>
> > Therefore, it usually remains as the
> > first disk in the system. With the synchronous probe, the root disk
> > maintains a stable and consistent identification.
> >
> > >
> > > > To address this, a new kernel parameter 'driver_sync_probe=' is introduced
> > > > to enforce synchronous probe behavior for specific drivers.
> > >
> > > This should be a per-bus thing, not a driver-specific thing as drivers
> > > for the same bus could have differing settings here which would cause a
> > > mess.
> > >
> > > Please just revert the scsi bus functionality if you have had
> > > regressions here, it's not a driver-core thing to do.
> >
> > Are you suggesting a reversal of the asynchronous probe code in the
> > SCSI driver?
>
> For your broken scsi driver, yes.
>
> > While reverting to synchronous probing could ensure
> > stability, it's worth noting that asynchronous probing can potentially
> > shorten the reboot duration under specific conditions. Thus, there
> > might be some resistance to reverting this change as it offers
> > performance benefits in certain scenarios. That's why I prefer to
> > introduce a kernel parameter for it.
>
> I don't want to add a new parameter that we need to support for forever
> and add to the complexity of the system unless it is REALLY needed.

BTW, since there's already a 'driver_async_probe=', introducing
another 'driver_sync_probe=' wouldn't significantly increase the
maintenance overhead.

> Please work with the scsi developers to resolve the issue for your
> hardware, as it's been working for everyone else for well over a year
> now, right?

The SCSI guys are added to this mail thread.

I'm uncertain whether it's possible to add SCSI kernel parameters
selectively. If that's not feasible, we'll need to maintain the
following modification in our local kernel:

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index e934779..8148d12 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -607,7 +607,7 @@ static void sd_set_flush_flag(struct scsi_disk *sdkp)
                .name           = "sd",
                .owner          = THIS_MODULE,
                .probe          = sd_probe,
-               .probe_type     = PROBE_PREFER_ASYNCHRONOUS,
+               .probe_type     = PROBE_PREFER_SYNCHRONOUS,
                .remove         = sd_remove,
                .shutdown       = sd_shutdown,
                .pm             = &sd_pm_ops,




--
Regards
Yafang
  
Greg KH Dec. 7, 2023, 12:12 p.m. UTC | #5
On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > >
> > > > Device names are never guaranteed to be stable, ALWAYS use a persistant
> > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > needed ways to do this properly.
> > >
> > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> >
> > Depends on your system.  It can also be identified, in the proper way,
> > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > (note, fake uuid, use your own disk uuid please.)
> >
> > Why not do that?  That's the most stable and recommended way of doing
> > things.
> 
> Adapting to this change isn't straightforward, especially for a large
> fleet of servers. Our monitoring system needs to accommodate and
> adjust accordingly.

Agreed, that can be rough.  But as this is an issue that was caused by a
scsi core change, perhaps the scsi developers can describe why it's ok.

But really, device naming has ALWAYS been known to not be
deterministic, which is why Pat and I did all the driver core work 20+
years ago so that you have the ability to properly name your devices in
a way that is deterministic.  Using the kernel name like sda is NOT
using that functionality, so while it has been nice to see that it has
been stable for you for a while, you are playing with fire here and will
get burned one day when the firmware in your devices decide to change
response times.

> > > While reverting to synchronous probing could ensure
> > > stability, it's worth noting that asynchronous probing can potentially
> > > shorten the reboot duration under specific conditions. Thus, there
> > > might be some resistance to reverting this change as it offers
> > > performance benefits in certain scenarios. That's why I prefer to
> > > introduce a kernel parameter for it.
> >
> > I don't want to add a new parameter that we need to support for forever
> > and add to the complexity of the system unless it is REALLY needed.
> 
> BTW, since there's already a 'driver_async_probe=', introducing
> another 'driver_sync_probe=' wouldn't significantly increase the
> maintenance overhead.

Any new code adds maintenance overhead and complexity, so you have to
justify it's existance especially when you are not going to be the one
maintaining it :)

thanks,

greg k-h
  
Yafang Shao Dec. 7, 2023, 12:36 p.m. UTC | #6
On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > >
> > > > > Device names are never guaranteed to be stable, ALWAYS use a persistant
> > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > needed ways to do this properly.
> > > >
> > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > >
> > > Depends on your system.  It can also be identified, in the proper way,
> > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > (note, fake uuid, use your own disk uuid please.)
> > >
> > > Why not do that?  That's the most stable and recommended way of doing
> > > things.
> >
> > Adapting to this change isn't straightforward, especially for a large
> > fleet of servers. Our monitoring system needs to accommodate and
> > adjust accordingly.
>
> Agreed, that can be rough.  But as this is an issue that was caused by a
> scsi core change, perhaps the scsi developers can describe why it's ok.
>
> But really, device naming has ALWAYS been known to not be
> deterministic, which is why Pat and I did all the driver core work 20+
> years ago so that you have the ability to properly name your devices in
> a way that is deterministic.  Using the kernel name like sda is NOT
> using that functionality, so while it has been nice to see that it has
> been stable for you for a while, you are playing with fire here and will
> get burned one day when the firmware in your devices decide to change
> response times.

I agree that using UUID is a better approach. However, it's worth
noting that the widely used IO monitoring tool 'iostat' faces
challenges when working with UUIDs. This indicates that there's a
significant amount of work ahead of us in this aspect.


>
> > > > While reverting to synchronous probing could ensure
> > > > stability, it's worth noting that asynchronous probing can potentially
> > > > shorten the reboot duration under specific conditions. Thus, there
> > > > might be some resistance to reverting this change as it offers
> > > > performance benefits in certain scenarios. That's why I prefer to
> > > > introduce a kernel parameter for it.
> > >
> > > I don't want to add a new parameter that we need to support for forever
> > > and add to the complexity of the system unless it is REALLY needed.
> >
> > BTW, since there's already a 'driver_async_probe=', introducing
> > another 'driver_sync_probe=' wouldn't significantly increase the
> > maintenance overhead.
>
> Any new code adds maintenance overhead and complexity, so you have to
> justify it's existance especially when you are not going to be the one
> maintaining it :)

Understood.
  
Greg KH Dec. 8, 2023, 5:36 a.m. UTC | #7
On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > >
> > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > >
> > > > > > Device names are never guaranteed to be stable, ALWAYS use a persistant
> > > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > > needed ways to do this properly.
> > > > >
> > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > >
> > > > Depends on your system.  It can also be identified, in the proper way,
> > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > (note, fake uuid, use your own disk uuid please.)
> > > >
> > > > Why not do that?  That's the most stable and recommended way of doing
> > > > things.
> > >
> > > Adapting to this change isn't straightforward, especially for a large
> > > fleet of servers. Our monitoring system needs to accommodate and
> > > adjust accordingly.
> >
> > Agreed, that can be rough.  But as this is an issue that was caused by a
> > scsi core change, perhaps the scsi developers can describe why it's ok.
> >
> > But really, device naming has ALWAYS been known to not be
> > deterministic, which is why Pat and I did all the driver core work 20+
> > years ago so that you have the ability to properly name your devices in
> > a way that is deterministic.  Using the kernel name like sda is NOT
> > using that functionality, so while it has been nice to see that it has
> > been stable for you for a while, you are playing with fire here and will
> > get burned one day when the firmware in your devices decide to change
> > response times.
> 
> I agree that using UUID is a better approach. However, it's worth
> noting that the widely used IO monitoring tool 'iostat' faces
> challenges when working with UUIDs. This indicates that there's a
> significant amount of work ahead of us in this aspect.

That indicates that iostat needs to be fixed as this has been an option
that people rely on for 20+ years now.  Or use a better tool :)

thanks,

greg k-h
  
Yafang Shao Dec. 8, 2023, 6:49 a.m. UTC | #8
On Fri, Dec 8, 2023 at 1:36 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> > On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > >
> > > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > > >
> > > > > > > Device names are never guaranteed to be stable, ALWAYS use a persistant
> > > > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > > > needed ways to do this properly.
> > > > > >
> > > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > > >
> > > > > Depends on your system.  It can also be identified, in the proper way,
> > > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > > (note, fake uuid, use your own disk uuid please.)
> > > > >
> > > > > Why not do that?  That's the most stable and recommended way of doing
> > > > > things.
> > > >
> > > > Adapting to this change isn't straightforward, especially for a large
> > > > fleet of servers. Our monitoring system needs to accommodate and
> > > > adjust accordingly.
> > >
> > > Agreed, that can be rough.  But as this is an issue that was caused by a
> > > scsi core change, perhaps the scsi developers can describe why it's ok.
> > >
> > > But really, device naming has ALWAYS been known to not be
> > > deterministic, which is why Pat and I did all the driver core work 20+
> > > years ago so that you have the ability to properly name your devices in
> > > a way that is deterministic.  Using the kernel name like sda is NOT
> > > using that functionality, so while it has been nice to see that it has
> > > been stable for you for a while, you are playing with fire here and will
> > > get burned one day when the firmware in your devices decide to change
> > > response times.
> >
> > I agree that using UUID is a better approach. However, it's worth
> > noting that the widely used IO monitoring tool 'iostat' faces
> > challenges when working with UUIDs. This indicates that there's a
> > significant amount of work ahead of us in this aspect.
>
> That indicates that iostat needs to be fixed as this has been an option
> that people rely on for 20+ years now.  Or use a better tool :)

The issue arises when a disk contains multiple partitions, such as
/dev/sda1 and /dev/sda2. In this case, using 'iostat -j UUID' can only
display 'sda' since only its partitions possess UUIDs. Uncertain how
to address it yet.
  
Greg KH Dec. 8, 2023, 7:15 a.m. UTC | #9
On Fri, Dec 08, 2023 at 02:49:39PM +0800, Yafang Shao wrote:
> On Fri, Dec 8, 2023 at 1:36 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> > > On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > >
> > > > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > > >
> > > > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > > > >
> > > > > > > > Device names are never guaranteed to be stable, ALWAYS use a persistant
> > > > > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > > > > needed ways to do this properly.
> > > > > > >
> > > > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > > > >
> > > > > > Depends on your system.  It can also be identified, in the proper way,
> > > > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > > > (note, fake uuid, use your own disk uuid please.)
> > > > > >
> > > > > > Why not do that?  That's the most stable and recommended way of doing
> > > > > > things.
> > > > >
> > > > > Adapting to this change isn't straightforward, especially for a large
> > > > > fleet of servers. Our monitoring system needs to accommodate and
> > > > > adjust accordingly.
> > > >
> > > > Agreed, that can be rough.  But as this is an issue that was caused by a
> > > > scsi core change, perhaps the scsi developers can describe why it's ok.
> > > >
> > > > But really, device naming has ALWAYS been known to not be
> > > > deterministic, which is why Pat and I did all the driver core work 20+
> > > > years ago so that you have the ability to properly name your devices in
> > > > a way that is deterministic.  Using the kernel name like sda is NOT
> > > > using that functionality, so while it has been nice to see that it has
> > > > been stable for you for a while, you are playing with fire here and will
> > > > get burned one day when the firmware in your devices decide to change
> > > > response times.
> > >
> > > I agree that using UUID is a better approach. However, it's worth
> > > noting that the widely used IO monitoring tool 'iostat' faces
> > > challenges when working with UUIDs. This indicates that there's a
> > > significant amount of work ahead of us in this aspect.
> >
> > That indicates that iostat needs to be fixed as this has been an option
> > that people rely on for 20+ years now.  Or use a better tool :)
> 
> The issue arises when a disk contains multiple partitions, such as
> /dev/sda1 and /dev/sda2. In this case, using 'iostat -j UUID' can only
> display 'sda' since only its partitions possess UUIDs. Uncertain how
> to address it yet.

Then use one of the other many other unique ids that are in /dev/disk/
today.  You have loads of things to choose from:
	$ ls /dev/disk/
	by-diskseq  by-id  by-label  by-partlabel  by-partuuid  by-path  by-uuid

You have a plethera of choices here, use whatever works best for your
systems.  This is a userspace decision to make, not a kernel one, as
this is a policy choice of yours.

good luck!

greg k-h
  
Yafang Shao Dec. 8, 2023, 7:26 a.m. UTC | #10
On Fri, Dec 8, 2023 at 3:15 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Fri, Dec 08, 2023 at 02:49:39PM +0800, Yafang Shao wrote:
> > On Fri, Dec 8, 2023 at 1:36 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> > > > On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > > > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > >
> > > > > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > > > > >
> > > > > > > > > Device names are never guaranteed to be stable, ALWAYS use a persistant
> > > > > > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > > > > > needed ways to do this properly.
> > > > > > > >
> > > > > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > > > > >
> > > > > > > Depends on your system.  It can also be identified, in the proper way,
> > > > > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > > > > (note, fake uuid, use your own disk uuid please.)
> > > > > > >
> > > > > > > Why not do that?  That's the most stable and recommended way of doing
> > > > > > > things.
> > > > > >
> > > > > > Adapting to this change isn't straightforward, especially for a large
> > > > > > fleet of servers. Our monitoring system needs to accommodate and
> > > > > > adjust accordingly.
> > > > >
> > > > > Agreed, that can be rough.  But as this is an issue that was caused by a
> > > > > scsi core change, perhaps the scsi developers can describe why it's ok.
> > > > >
> > > > > But really, device naming has ALWAYS been known to not be
> > > > > deterministic, which is why Pat and I did all the driver core work 20+
> > > > > years ago so that you have the ability to properly name your devices in
> > > > > a way that is deterministic.  Using the kernel name like sda is NOT
> > > > > using that functionality, so while it has been nice to see that it has
> > > > > been stable for you for a while, you are playing with fire here and will
> > > > > get burned one day when the firmware in your devices decide to change
> > > > > response times.
> > > >
> > > > I agree that using UUID is a better approach. However, it's worth
> > > > noting that the widely used IO monitoring tool 'iostat' faces
> > > > challenges when working with UUIDs. This indicates that there's a
> > > > significant amount of work ahead of us in this aspect.
> > >
> > > That indicates that iostat needs to be fixed as this has been an option
> > > that people rely on for 20+ years now.  Or use a better tool :)
> >
> > The issue arises when a disk contains multiple partitions, such as
> > /dev/sda1 and /dev/sda2. In this case, using 'iostat -j UUID' can only
> > display 'sda' since only its partitions possess UUIDs. Uncertain how
> > to address it yet.
>
> Then use one of the other many other unique ids that are in /dev/disk/
> today.  You have loads of things to choose from:
>         $ ls /dev/disk/
>         by-diskseq  by-id  by-label  by-partlabel  by-partuuid  by-path  by-uuid
>
> You have a plethera of choices here, use whatever works best for your
> systems.  This is a userspace decision to make, not a kernel one, as
> this is a policy choice of yours.
>

Indeed, there are alternative methods besides using UUIDs. This
example serves to highlight that UUIDs might not cover all scenarios,
similar to other IDs listed under /dev/disk/.
  

Patch

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 65731b060e3f..9b1a12b24f65 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1144,6 +1144,16 @@ 
 			match the *.
 			Format: <driver_name1>,<driver_name2>...
 
+	driver_sync_probe=  [KNL]
+			List of driver names to be probed synchronously. *
+			matches with all driver names. If * is specified, the
+			rest of the listed driver names are those that will NOT
+			match the *.
+			Format: <driver_name1>,<driver_name2>...
+
+			Note that 'driver_sync_probe=' takes precedence over
+			'driver_async_probe=' if both parameters are set.
+
 	drm.edid_firmware=[<connector>:]<file>[,[<connector>:]<file>]
 			Broken monitors, graphic adapters, KVMs and EDIDless
 			panels may send no or incorrect EDID data sets.
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 0c3725c3eefa..f4d8f0b76b26 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -58,9 +58,11 @@  static atomic_t deferred_trigger_count = ATOMIC_INIT(0);
 static bool initcalls_done;
 
 /* Save the async probe drivers' name from kernel cmdline */
-#define ASYNC_DRV_NAMES_MAX_LEN	256
-static char async_probe_drv_names[ASYNC_DRV_NAMES_MAX_LEN];
+#define PROBE_DRV_NAMES_MAX_LEN	256
+static char async_probe_drv_names[PROBE_DRV_NAMES_MAX_LEN];
 static bool async_probe_default;
+static char sync_probe_drv_names[PROBE_DRV_NAMES_MAX_LEN];
+static bool sync_probe_default;
 
 /*
  * In some cases, like suspend to RAM or hibernation, It might be reasonable
@@ -843,30 +845,48 @@  static int driver_probe_device(struct device_driver *drv, struct device *dev)
 	return ret;
 }
 
-static inline bool cmdline_requested_async_probing(const char *drv_name)
+static inline bool
+cmdline_requested_probing(const char *drv_name, const char *drv_names, bool all_drv)
 {
-	bool async_drv;
+	bool probe_drv;
 
-	async_drv = parse_option_str(async_probe_drv_names, drv_name);
-
-	return (async_probe_default != async_drv);
+	probe_drv = parse_option_str(drv_names, drv_name);
+	return (all_drv != probe_drv);
 }
 
 /* The option format is "driver_async_probe=drv_name1,drv_name2,..." */
 static int __init save_async_options(char *buf)
 {
-	if (strlen(buf) >= ASYNC_DRV_NAMES_MAX_LEN)
+	if (strlen(buf) >= PROBE_DRV_NAMES_MAX_LEN)
 		pr_warn("Too long list of driver names for 'driver_async_probe'!\n");
 
-	strscpy(async_probe_drv_names, buf, ASYNC_DRV_NAMES_MAX_LEN);
+	strscpy(async_probe_drv_names, buf, PROBE_DRV_NAMES_MAX_LEN);
 	async_probe_default = parse_option_str(async_probe_drv_names, "*");
 
 	return 1;
 }
 __setup("driver_async_probe=", save_async_options);
 
+/* The option format is "driver_sync_probe=drv_name1,drv_name2,..."
+ * driver_sync_probe is prior to driver_async_probe if both of them are set.
+ */
+static int __init save_sync_options(char *buf)
+{
+	if (strlen(buf) >= PROBE_DRV_NAMES_MAX_LEN)
+		pr_warn("Too long list of driver names for 'driver_sync_probe'!\n");
+
+	strscpy(sync_probe_drv_names, buf, PROBE_DRV_NAMES_MAX_LEN);
+	sync_probe_default = parse_option_str(sync_probe_drv_names, "*");
+
+	return 1;
+}
+__setup("driver_sync_probe=", save_sync_options);
+
 static bool driver_allows_async_probing(struct device_driver *drv)
 {
+	if (cmdline_requested_probing(drv->name, sync_probe_drv_names, sync_probe_default))
+		return false;
+
 	switch (drv->probe_type) {
 	case PROBE_PREFER_ASYNCHRONOUS:
 		return true;
@@ -875,7 +895,8 @@  static bool driver_allows_async_probing(struct device_driver *drv)
 		return false;
 
 	default:
-		if (cmdline_requested_async_probing(drv->name))
+		if (cmdline_requested_probing(drv->name, async_probe_drv_names,
+					      async_probe_default))
 			return true;
 
 		if (module_requested_async_probing(drv->owner))