Message ID | 20231206115355.4319-1-laoar.shao@gmail.com |
---|---|
State | New |
Headers |
From: Yafang Shao <laoar.shao@gmail.com> To: gregkh@linuxfoundation.org, rafael@kernel.org Cc: linux-kernel@vger.kernel.org, Yafang Shao <laoar.shao@gmail.com> Subject: [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe= Date: Wed, 6 Dec 2023 11:53:55 +0000 Message-Id: <20231206115355.4319-1-laoar.shao@gmail.com> List-ID: <linux-kernel.vger.kernel.org> |
Series |
drivers: base: Introduce a new kernel parameter driver_sync_probe=
|
|
Commit Message
Yafang Shao
Dec. 6, 2023, 11:53 a.m. UTC
After upgrading our kernel from version 4.19 to 6.1, certain regressions
occurred due to the driver's asynchronous probe behavior. Specifically,
the SCSI driver transitioned to an asynchronous probe by default, resulting
in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
was consistently identified as /dev/sda. However, with kernel 6.1, the root
disk can be any of /dev/sdX, leading to issues for applications reliant on
/dev/sda, notably impacting monitoring systems monitoring the root disk.
To address this, a new kernel parameter 'driver_sync_probe=' is introduced
to enforce synchronous probe behavior for specific drivers.
For instance, using the following kernel parameter:
driver_sync_probe=sd,nvme
The sd (SCSI) and nvme disks will undergo synchronous probing. This ensures
that these disks maintain consistent identification behavior despite the
default asynchronous probe, mitigating the issues experienced by
applications reliant on fixed disk identification.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
.../admin-guide/kernel-parameters.txt | 10 +++++
drivers/base/dd.c | 41 ++++++++++++++-----
2 files changed, 41 insertions(+), 10 deletions(-)
Comments
On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> After upgrading our kernel from version 4.19 to 6.1, certain regressions
> occurred due to the driver's asynchronous probe behavior. Specifically,
> the SCSI driver transitioned to an asynchronous probe by default, resulting
> in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> was consistently identified as /dev/sda. However, with kernel 6.1, the root
> disk can be any of /dev/sdX, leading to issues for applications reliant on
> /dev/sda, notably impacting monitoring systems monitoring the root disk.

Device names are never guaranteed to be stable, ALWAYS use a persistent
name like a filesystem label or other ways. Look at /dev/disk/ for the
needed ways to do this properly.

> To address this, a new kernel parameter 'driver_sync_probe=' is introduced
> to enforce synchronous probe behavior for specific drivers.

This should be a per-bus thing, not a driver-specific thing as drivers
for the same bus could have differing settings here which would cause a
mess.

Please just revert the scsi bus functionality if you have had
regressions here, it's not a driver-core thing to do.

thanks,

greg k-h
On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> Device names are never guaranteed to be stable, ALWAYS use a persistent
> name like a filesystem label or other ways. Look at /dev/disk/ for the
> needed ways to do this properly.

The root disk is typically identified as /dev/sda or /dev/vda, right?
This is because the root disk, which houses the operating system,
cannot be removed or hotplugged. Therefore, it usually remains as the
first disk in the system. With the synchronous probe, the root disk
maintains a stable and consistent identification.

> This should be a per-bus thing, not a driver-specific thing as drivers
> for the same bus could have differing settings here which would cause a
> mess.
>
> Please just revert the scsi bus functionality if you have had
> regressions here, it's not a driver-core thing to do.

Are you suggesting a reversal of the asynchronous probe code in the
SCSI driver? While reverting to synchronous probing could ensure
stability, it's worth noting that asynchronous probing can potentially
shorten the reboot duration under specific conditions. Thus, there
might be some resistance to reverting this change as it offers
performance benefits in certain scenarios. That's why I prefer to
introduce a kernel parameter for it.
On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> The root disk is typically identified as /dev/sda or /dev/vda, right?

Depends on your system. It can also be identified, in the proper way,
as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
(note, fake uuid, use your own disk uuid please.)

Why not do that? That's the most stable and recommended way of doing
things.

> This is because the root disk, which houses the operating system,
> cannot be removed or hotplugged.

Not true at all, happens for many systems (think about how systems that
run their whole OS out of ram work...)

> Therefore, it usually remains as the
> first disk in the system. With the synchronous probe, the root disk
> maintains a stable and consistent identification.

> Are you suggesting a reversal of the asynchronous probe code in the
> SCSI driver?

For your broken scsi driver, yes.

> While reverting to synchronous probing could ensure
> stability, it's worth noting that asynchronous probing can potentially
> shorten the reboot duration under specific conditions. Thus, there
> might be some resistance to reverting this change as it offers
> performance benefits in certain scenarios. That's why I prefer to
> introduce a kernel parameter for it.

I don't want to add a new parameter that we need to support for forever
and add to the complexity of the system unless it is REALLY needed.

Please work with the scsi developers to resolve the issue for your
hardware, as it's been working for everyone else for well over a year
now, right?

thanks,

greg k-h
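The /dev/disk/by-uuid path recommended above can be consumed by a monitoring script by resolving the persistent symlink at run time instead of hard-coding a kernel name such as sda. A minimal Python sketch, assuming udev populates /dev/disk/ in the usual way; the helper name `kernel_name_for` is hypothetical, and the UUID in the usage comment is the fake one from the message:

```python
import os


def kernel_name_for(persistent_path):
    """Return the kernel name (e.g. 'sdb1') that a persistent symlink resolves to right now."""
    # os.path.exists follows symlinks, so a dangling link is rejected here too.
    if not os.path.exists(persistent_path):
        raise FileNotFoundError(persistent_path)
    # realpath follows the symlink chain down to the actual device node.
    return os.path.basename(os.path.realpath(persistent_path))


# Usage (the result depends on the machine and may change between boots):
#   kernel_name_for("/dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993")
```

The lookup has to be repeated after every boot, since the point of the thread is that the kernel name behind the link is not guaranteed to stay the same.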
On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> Depends on your system. It can also be identified, in the proper way,
> as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> (note, fake uuid, use your own disk uuid please.)
>
> Why not do that? That's the most stable and recommended way of doing
> things.

Adapting to this change isn't straightforward, especially for a large
fleet of servers. Our monitoring system needs to accommodate and
adjust accordingly.

> I don't want to add a new parameter that we need to support for forever
> and add to the complexity of the system unless it is REALLY needed.

BTW, since there's already a 'driver_async_probe=', introducing
another 'driver_sync_probe=' wouldn't significantly increase the
maintenance overhead.

> Please work with the scsi developers to resolve the issue for your
> hardware, as it's been working for everyone else for well over a year
> now, right?

The SCSI guys are added to this mail thread. I'm uncertain whether it's
possible to add SCSI kernel parameters selectively. If that's not
feasible, we'll need to maintain the following modification in our
local kernel:

    diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
    index e934779..8148d12 100644
    --- a/drivers/scsi/sd.c
    +++ b/drivers/scsi/sd.c
    @@ -607,7 +607,7 @@ static void sd_set_flush_flag(struct scsi_disk *sdkp)
                     .name           = "sd",
                     .owner          = THIS_MODULE,
                     .probe          = sd_probe,
    -                .probe_type     = PROBE_PREFER_ASYNCHRONOUS,
    +                .probe_type     = PROBE_PREFER_SYNCHRONOUS,
                     .remove         = sd_remove,
                     .shutdown       = sd_shutdown,
                     .pm             = &sd_pm_ops,

--
Regards
Yafang
On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> Adapting to this change isn't straightforward, especially for a large
> fleet of servers. Our monitoring system needs to accommodate and
> adjust accordingly.

Agreed, that can be rough. But as this is an issue that was caused by a
scsi core change, perhaps the scsi developers can describe why it's ok.

But really, device naming has ALWAYS been known to not be
deterministic, which is why Pat and I did all the driver core work 20+
years ago so that you have the ability to properly name your devices in
a way that is deterministic. Using the kernel name like sda is NOT
using that functionality, so while it has been nice to see that it has
been stable for you for a while, you are playing with fire here and will
get burned one day when the firmware in your devices decides to change
response times.

> BTW, since there's already a 'driver_async_probe=', introducing
> another 'driver_sync_probe=' wouldn't significantly increase the
> maintenance overhead.

Any new code adds maintenance overhead and complexity, so you have to
justify its existence especially when you are not going to be the one
maintaining it :)

thanks,

greg k-h
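One way for a monitoring agent to stop assuming that the root disk is sda is to ask the running system which device is mounted at /. A sketch in Python, assuming the usual whitespace-separated layout of /proc/self/mounts (source, mount point, filesystem type, options, ...); `root_source` is a hypothetical helper, and on some systems the reported source may be something like `overlay` or a device-mapper path rather than a plain /dev/sdXN node:

```python
def root_source(mounts_text):
    """Return the source of the mount at '/', or None if no such entry exists."""
    source = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the mount source, field 1 the mount point.
        if len(fields) >= 2 and fields[1] == "/":
            # Later entries are mounted over earlier ones, so the last match wins.
            source = fields[0]
    return source


# Usage on a live system:
#   with open("/proc/self/mounts") as f:
#       print(root_source(f.read()))
```

The function takes the table as text rather than opening the file itself so that it can be exercised with sample data.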
On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> But really, device naming has ALWAYS been known to not be
> deterministic, which is why Pat and I did all the driver core work 20+
> years ago so that you have the ability to properly name your devices in
> a way that is deterministic. Using the kernel name like sda is NOT
> using that functionality, so while it has been nice to see that it has
> been stable for you for a while, you are playing with fire here and will
> get burned one day when the firmware in your devices decides to change
> response times.

I agree that using UUID is a better approach. However, it's worth
noting that the widely used IO monitoring tool 'iostat' faces
challenges when working with UUIDs. This indicates that there's a
significant amount of work ahead of us in this aspect.

> Any new code adds maintenance overhead and complexity, so you have to
> justify its existence especially when you are not going to be the one
> maintaining it :)

Understood.
On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> I agree that using UUID is a better approach. However, it's worth
> noting that the widely used IO monitoring tool 'iostat' faces
> challenges when working with UUIDs. This indicates that there's a
> significant amount of work ahead of us in this aspect.

That indicates that iostat needs to be fixed as this has been an option
that people rely on for 20+ years now. Or use a better tool :)

thanks,

greg k-h
On Fri, Dec 8, 2023 at 1:36 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> That indicates that iostat needs to be fixed as this has been an option
> that people rely on for 20+ years now. Or use a better tool :)

The issue arises when a disk contains multiple partitions, such as
/dev/sda1 and /dev/sda2. In this case, using 'iostat -j UUID' can only
display 'sda' since only its partitions possess UUIDs. Uncertain how
to address it yet.
On Fri, Dec 08, 2023 at 02:49:39PM +0800, Yafang Shao wrote:
> On Fri, Dec 8, 2023 at 1:36 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> > > On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > >
> > > > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > > >
> > > > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > > > >
> > > > > > > > Device names are never guaranteed to be stable, ALWAYS use persistent
> > > > > > > > names like a filesystem label or other ways. Look at /dev/disk/ for the
> > > > > > > > needed ways to do this properly.
> > > > > > >
> > > > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > > > >
> > > > > > Depends on your system. It can also be identified, in the proper way,
> > > > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > > > (note, fake uuid, use your own disk uuid please.)
> > > > > >
> > > > > > Why not do that? That's the most stable and recommended way of doing
> > > > > > things.
> > > > >
> > > > > Adapting to this change isn't straightforward, especially for a large
> > > > > fleet of servers. Our monitoring system needs to accommodate and
> > > > > adjust accordingly.
> > > >
> > > > Agreed, that can be rough. But as this is an issue that was caused by a
> > > > scsi core change, perhaps the scsi developers can describe why it's ok.
> > > >
> > > > But really, device naming has ALWAYS been known to not be
> > > > deterministic, which is why Pat and I did all the driver core work 20+
> > > > years ago so that you have the ability to properly name your devices in
> > > > a way that is deterministic. Using the kernel name like sda is NOT
> > > > using that functionality, so while it has been nice to see that it has
> > > > been stable for you for a while, you are playing with fire here and will
> > > > get burned one day when the firmware in your devices decides to change
> > > > response times.
> > >
> > > I agree that using UUID is a better approach. However, it's worth
> > > noting that the widely used IO monitoring tool 'iostat' faces
> > > challenges when working with UUIDs. This indicates that there's a
> > > significant amount of work ahead of us in this aspect.
> >
> > That indicates that iostat needs to be fixed as this has been an option
> > that people rely on for 20+ years now. Or use a better tool :)
>
> The issue arises when a disk contains multiple partitions, such as
> /dev/sda1 and /dev/sda2. In this case, using 'iostat -j UUID' can only
> display 'sda' since only its partitions possess UUIDs. Uncertain how
> to address it yet.

Then use one of the many other unique ids that are in /dev/disk/
today. You have loads of things to choose from:
	$ ls /dev/disk/
	by-diskseq  by-id  by-label  by-partlabel  by-partuuid  by-path  by-uuid

You have a plethora of choices here, use whatever works best for your
systems. This is a userspace decision to make, not a kernel one, as
this is a policy choice of yours.

good luck!

greg k-h
On Fri, Dec 8, 2023 at 3:15 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Fri, Dec 08, 2023 at 02:49:39PM +0800, Yafang Shao wrote:
> > On Fri, Dec 8, 2023 at 1:36 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> > > > On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > > > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > >
> > > > > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > > > > >
> > > > > > > > > Device names are never guaranteed to be stable, ALWAYS use persistent
> > > > > > > > > names like a filesystem label or other ways. Look at /dev/disk/ for the
> > > > > > > > > needed ways to do this properly.
> > > > > > > >
> > > > > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > > > > >
> > > > > > > Depends on your system. It can also be identified, in the proper way,
> > > > > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > > > > (note, fake uuid, use your own disk uuid please.)
> > > > > > >
> > > > > > > Why not do that? That's the most stable and recommended way of doing
> > > > > > > things.
> > > > > >
> > > > > > Adapting to this change isn't straightforward, especially for a large
> > > > > > fleet of servers. Our monitoring system needs to accommodate and
> > > > > > adjust accordingly.
> > > > >
> > > > > Agreed, that can be rough. But as this is an issue that was caused by a
> > > > > scsi core change, perhaps the scsi developers can describe why it's ok.
> > > > >
> > > > > But really, device naming has ALWAYS been known to not be
> > > > > deterministic, which is why Pat and I did all the driver core work 20+
> > > > > years ago so that you have the ability to properly name your devices in
> > > > > a way that is deterministic. Using the kernel name like sda is NOT
> > > > > using that functionality, so while it has been nice to see that it has
> > > > > been stable for you for a while, you are playing with fire here and will
> > > > > get burned one day when the firmware in your devices decides to change
> > > > > response times.
> > > >
> > > > I agree that using UUID is a better approach. However, it's worth
> > > > noting that the widely used IO monitoring tool 'iostat' faces
> > > > challenges when working with UUIDs. This indicates that there's a
> > > > significant amount of work ahead of us in this aspect.
> > >
> > > That indicates that iostat needs to be fixed as this has been an option
> > > that people rely on for 20+ years now. Or use a better tool :)
> >
> > The issue arises when a disk contains multiple partitions, such as
> > /dev/sda1 and /dev/sda2. In this case, using 'iostat -j UUID' can only
> > display 'sda' since only its partitions possess UUIDs. Uncertain how
> > to address it yet.
>
> Then use one of the many other unique ids that are in /dev/disk/
> today. You have loads of things to choose from:
>	$ ls /dev/disk/
>	by-diskseq  by-id  by-label  by-partlabel  by-partuuid  by-path  by-uuid
>
> You have a plethora of choices here, use whatever works best for your
> systems. This is a userspace decision to make, not a kernel one, as
> this is a policy choice of yours.
>

Indeed, there are alternative methods besides using UUIDs. This example
serves to highlight that UUIDs might not cover all scenarios, similar to
other IDs listed under /dev/disk/.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 65731b060e3f..9b1a12b24f65 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1144,6 +1144,16 @@
 			match the *.
 			Format: <driver_name1>,<driver_name2>...
 
+	driver_sync_probe=  [KNL]
+			List of driver names to be probed synchronously. *
+			matches with all driver names. If * is specified, the
+			rest of the listed driver names are those that will NOT
+			match the *.
+			Format: <driver_name1>,<driver_name2>...
+
+			Note that 'driver_sync_probe=' takes precedence over
+			'driver_async_probe=' if both parameters are set.
+
 	drm.edid_firmware=[<connector>:]<file>[,[<connector>:]<file>]
 			Broken monitors, graphic adapters, KVMs and EDIDless
 			panels may send no or incorrect EDID data sets.
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 0c3725c3eefa..f4d8f0b76b26 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -58,9 +58,11 @@ static atomic_t deferred_trigger_count = ATOMIC_INIT(0);
 static bool initcalls_done;
 
 /* Save the async probe drivers' name from kernel cmdline */
-#define ASYNC_DRV_NAMES_MAX_LEN	256
-static char async_probe_drv_names[ASYNC_DRV_NAMES_MAX_LEN];
+#define PROBE_DRV_NAMES_MAX_LEN	256
+static char async_probe_drv_names[PROBE_DRV_NAMES_MAX_LEN];
 static bool async_probe_default;
+static char sync_probe_drv_names[PROBE_DRV_NAMES_MAX_LEN];
+static bool sync_probe_default;
 
 /*
  * In some cases, like suspend to RAM or hibernation, It might be reasonable
@@ -843,30 +845,48 @@ static int driver_probe_device(struct device_driver *drv, struct device *dev)
 	return ret;
 }
 
-static inline bool cmdline_requested_async_probing(const char *drv_name)
+static inline bool
+cmdline_requested_probing(const char *drv_name, const char *drv_names, bool all_drv)
 {
-	bool async_drv;
+	bool probe_drv;
 
-	async_drv = parse_option_str(async_probe_drv_names, drv_name);
-
-	return (async_probe_default != async_drv);
+	probe_drv = parse_option_str(drv_names, drv_name);
+	return (all_drv != probe_drv);
 }
 
 /* The option format is "driver_async_probe=drv_name1,drv_name2,..." */
 static int __init save_async_options(char *buf)
 {
-	if (strlen(buf) >= ASYNC_DRV_NAMES_MAX_LEN)
+	if (strlen(buf) >= PROBE_DRV_NAMES_MAX_LEN)
 		pr_warn("Too long list of driver names for 'driver_async_probe'!\n");
 
-	strscpy(async_probe_drv_names, buf, ASYNC_DRV_NAMES_MAX_LEN);
+	strscpy(async_probe_drv_names, buf, PROBE_DRV_NAMES_MAX_LEN);
 	async_probe_default = parse_option_str(async_probe_drv_names, "*");
 
 	return 1;
 }
 __setup("driver_async_probe=", save_async_options);
 
+/* The option format is "driver_sync_probe=drv_name1,drv_name2,..."
+ * driver_sync_probe is prior to driver_async_probe if both of them are set.
+ */
+static int __init save_sync_options(char *buf)
+{
+	if (strlen(buf) >= PROBE_DRV_NAMES_MAX_LEN)
+		pr_warn("Too long list of driver names for 'driver_sync_probe'!\n");
+
+	strscpy(sync_probe_drv_names, buf, PROBE_DRV_NAMES_MAX_LEN);
+	sync_probe_default = parse_option_str(sync_probe_drv_names, "*");
+
+	return 1;
+}
+__setup("driver_sync_probe=", save_sync_options);
+
 static bool driver_allows_async_probing(struct device_driver *drv)
 {
+	if (cmdline_requested_probing(drv->name, sync_probe_drv_names, sync_probe_default))
+		return false;
+
 	switch (drv->probe_type) {
 	case PROBE_PREFER_ASYNCHRONOUS:
 		return true;
@@ -875,7 +895,8 @@ static bool driver_allows_async_probing(struct device_driver *drv)
 		return false;
 
 	default:
-		if (cmdline_requested_async_probing(drv->name))
+		if (cmdline_requested_probing(drv->name, async_probe_drv_names,
+					      async_probe_default))
 			return true;
 
 		if (module_requested_async_probing(drv->owner))
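
The way the two parameters interact in this patch can be modeled in userspace. The following is a minimal sketch, not kernel code: `list_has()` is a simplified stand-in for the kernel's `parse_option_str()`, the enum and function names are shortened for illustration, and the `module_requested_async_probing()` fallback is left out. Driver names such as `sd` and `nvme` are only examples.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for parse_option_str(): true if option is a whole entry in the
 * comma-separated list str. */
static bool list_has(const char *str, const char *option)
{
	size_t n = strlen(option);

	while (*str) {
		size_t entry = strcspn(str, ",");

		if (entry == n && !strncmp(str, option, n))
			return true;
		str += entry;
		if (*str == ',')
			str++;
	}
	return false;
}

/* Models cmdline_requested_probing(): when "*" is in the list, the named
 * drivers become the exceptions instead of the selection. */
static bool requested(const char *drv, const char *list)
{
	return list_has(list, "*") != list_has(list, drv);
}

/* Illustrative names; the kernel's enum probe_type uses longer ones. */
enum probe_type { PROBE_DEFAULT, PROBE_PREFER_ASYNC, PROBE_FORCE_SYNC };

/* Models driver_allows_async_probing() after the patch: the sync list is
 * checked first, so it wins over both the driver's own preference and
 * the async list. */
static bool allows_async(const char *drv, enum probe_type type,
			 const char *sync_list, const char *async_list)
{
	assert(drv && sync_list && async_list);

	if (requested(drv, sync_list))
		return false;

	switch (type) {
	case PROBE_PREFER_ASYNC:
		return true;
	case PROBE_FORCE_SYNC:
		return false;
	default:
		return requested(drv, async_list);
	}
}
```

Under this model, a driver that prefers asynchronous probing is forced back to synchronous probing as soon as its name appears in the sync list, and listing the same driver in both lists resolves in favor of the sync list.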