[5.15] dm init: add dm-mod.waitfor to wait for asynchronously probed block devices

Message ID 20230713055841.24815-1-mark-pk.tsai@mediatek.com
State New

Commit Message

Mark-PK Tsai (蔡沛剛) July 13, 2023, 5:58 a.m. UTC
  From: Peter Korsgaard <peter@korsgaard.com>

Just calling wait_for_device_probe() is not enough to ensure that
asynchronously probed block devices are available (E.G. mmc, usb), so
add a "dm-mod.waitfor=<device1>[,..,<deviceN>]" parameter to get
dm-init to explicitly wait for specific block devices before
initializing the tables with logic similar to the rootwait logic that
was introduced with commit  cc1ed7542c8c ("init: wait for
asynchronously scanned block devices").

E.G. with dm-verity on mmc using:
dm-mod.waitfor="PARTLABEL=hash-a,PARTLABEL=root-a"

[    0.671671] device-mapper: init: waiting for all devices to be available before creating mapped devices
[    0.671679] device-mapper: init: waiting for device PARTLABEL=hash-a ...
[    0.710695] mmc0: new HS200 MMC card at address 0001
[    0.711158] mmcblk0: mmc0:0001 004GA0 3.69 GiB
[    0.715954] mmcblk0boot0: mmc0:0001 004GA0 partition 1 2.00 MiB
[    0.722085] mmcblk0boot1: mmc0:0001 004GA0 partition 2 2.00 MiB
[    0.728093] mmcblk0rpmb: mmc0:0001 004GA0 partition 3 512 KiB, chardev (249:0)
[    0.738274]  mmcblk0: p1 p2 p3 p4 p5 p6 p7
[    0.751282] device-mapper: init: waiting for device PARTLABEL=root-a ...
[    0.751306] device-mapper: init: all devices available
[    0.751683] device-mapper: verity: sha256 using implementation "sha256-generic"
[    0.759344] device-mapper: ioctl: dm-0 (vroot) is ready
[    0.766540] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.

Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
---
 .../admin-guide/device-mapper/dm-init.rst     |  8 +++++++
 drivers/md/dm-init.c                          | 22 ++++++++++++++++++-
 2 files changed, 29 insertions(+), 1 deletion(-)
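
The dm-mod.waitfor value is a comma-separated list; in the kernel the splitting is done by module_param_array(). As a rough user-space illustration of how the example value above breaks into individual device names, here is a minimal sketch. The helper split_waitfor() and the MAX_WAITFOR constant are hypothetical names for this example (the limit mirrors DM_MAX_WAITFOR from the patch), and the code is not the kernel's parser:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WAITFOR 256	/* assumption: mirrors DM_MAX_WAITFOR in the patch */

/*
 * Split buf in place on commas and store a pointer to each piece in out[].
 * Returns the number of entries, or -1 if there are more than max entries.
 * Hypothetical helper for illustration only.
 */
static int split_waitfor(char *buf, char *out[], size_t max)
{
	size_t n = 0;
	char *p = buf;

	assert(buf != NULL && out != NULL);
	if (*p == '\0')
		return 0;	/* empty value: nothing to wait for */

	for (;;) {
		char *comma = strchr(p, ',');

		if (n == max)
			return -1;	/* more entries than the array can hold */
		out[n++] = p;
		if (!comma)
			break;		/* last entry reached */
		*comma = '\0';		/* terminate this entry */
		p = comma + 1;		/* continue after the comma */
	}
	return (int)n;
}
```

Calling split_waitfor() on a writable copy of "PARTLABEL=hash-a,PARTLABEL=root-a" yields two entries, which correspond to the two "waiting for device ..." lines in the log above.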
  

Comments

Greg KH July 16, 2023, 3:16 p.m. UTC | #1
On Thu, Jul 13, 2023 at 01:58:37PM +0800, Mark-PK Tsai wrote:
> From: Peter Korsgaard <peter@korsgaard.com>
> 
> Just calling wait_for_device_probe() is not enough to ensure that
> asynchronously probed block devices are available (E.G. mmc, usb), so
> add a "dm-mod.waitfor=<device1>[,..,<deviceN>]" parameter to get
> dm-init to explicitly wait for specific block devices before
> initializing the tables with logic similar to the rootwait logic that
> was introduced with commit  cc1ed7542c8c ("init: wait for
> asynchronously scanned block devices").
> 
> E.G. with dm-verity on mmc using:
> dm-mod.waitfor="PARTLABEL=hash-a,PARTLABEL=root-a"
> 
> [    0.671671] device-mapper: init: waiting for all devices to be available before creating mapped devices
> [    0.671679] device-mapper: init: waiting for device PARTLABEL=hash-a ...
> [    0.710695] mmc0: new HS200 MMC card at address 0001
> [    0.711158] mmcblk0: mmc0:0001 004GA0 3.69 GiB
> [    0.715954] mmcblk0boot0: mmc0:0001 004GA0 partition 1 2.00 MiB
> [    0.722085] mmcblk0boot1: mmc0:0001 004GA0 partition 2 2.00 MiB
> [    0.728093] mmcblk0rpmb: mmc0:0001 004GA0 partition 3 512 KiB, chardev (249:0)
> [    0.738274]  mmcblk0: p1 p2 p3 p4 p5 p6 p7
> [    0.751282] device-mapper: init: waiting for device PARTLABEL=root-a ...
> [    0.751306] device-mapper: init: all devices available
> [    0.751683] device-mapper: verity: sha256 using implementation "sha256-generic"
> [    0.759344] device-mapper: ioctl: dm-0 (vroot) is ready
> [    0.766540] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.
> 
> Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> Cc: stable@vger.kernel.org
> Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> ---
>  .../admin-guide/device-mapper/dm-init.rst     |  8 +++++++
>  drivers/md/dm-init.c                          | 22 ++++++++++++++++++-
>  2 files changed, 29 insertions(+), 1 deletion(-)

What is the git commit id of this change in Linus's tree?

thanks,

greg k-h
  
Mike Snitzer July 16, 2023, 3:36 p.m. UTC | #2
On Sun, Jul 16, 2023, 11:16 AM Greg KH <gregkh@linuxfoundation.org> wrote:

> On Thu, Jul 13, 2023 at 01:58:37PM +0800, Mark-PK Tsai wrote:
> > From: Peter Korsgaard <peter@korsgaard.com>
> > 
> > Just calling wait_for_device_probe() is not enough to ensure that
> > asynchronously probed block devices are available (E.G. mmc, usb), so
> > add a "dm-mod.waitfor=<device1>[,..,<deviceN>]" parameter to get
> > dm-init to explicitly wait for specific block devices before
> > initializing the tables with logic similar to the rootwait logic that
> > was introduced with commit  cc1ed7542c8c ("init: wait for
> > asynchronously scanned block devices").
> > 
> > E.G. with dm-verity on mmc using:
> > dm-mod.waitfor="PARTLABEL=hash-a,PARTLABEL=root-a"
> > 
> > [    0.671671] device-mapper: init: waiting for all devices to be available before creating mapped devices
> > [    0.671679] device-mapper: init: waiting for device PARTLABEL=hash-a ...
> > [    0.710695] mmc0: new HS200 MMC card at address 0001
> > [    0.711158] mmcblk0: mmc0:0001 004GA0 3.69 GiB
> > [    0.715954] mmcblk0boot0: mmc0:0001 004GA0 partition 1 2.00 MiB
> > [    0.722085] mmcblk0boot1: mmc0:0001 004GA0 partition 2 2.00 MiB
> > [    0.728093] mmcblk0rpmb: mmc0:0001 004GA0 partition 3 512 KiB, chardev (249:0)
> > [    0.738274]  mmcblk0: p1 p2 p3 p4 p5 p6 p7
> > [    0.751282] device-mapper: init: waiting for device PARTLABEL=root-a ...
> > [    0.751306] device-mapper: init: all devices available
> > [    0.751683] device-mapper: verity: sha256 using implementation "sha256-generic"
> > [    0.759344] device-mapper: ioctl: dm-0 (vroot) is ready
> > [    0.766540] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.
> > 
> > Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
> > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > ---
> >  .../admin-guide/device-mapper/dm-init.rst     |  8 +++++++
> >  drivers/md/dm-init.c                          | 22 ++++++++++++++++++-
> >  2 files changed, 29 insertions(+), 1 deletion(-)
>
> What is the git commit id of this change in Linus's tree?
>
> thanks,
>
> greg k-h
>
>

Hey Greg,

This change shouldn't be backported to stable@. It is a feature; if
Mark-PK feels they need it in older kernels, they need to carry the change
in their own tree. Or, at a minimum, they need to explain why this
change is warranted in stable@.

But to answer your original question the upstream commit is:

035641b01e72 dm init: add dm-mod.waitfor to wait for asynchronously probed block devices

Thanks,
Mike
  
Greg KH July 16, 2023, 3:43 p.m. UTC | #3
On Sun, Jul 16, 2023 at 11:36:36AM -0400, Mike Snitzer wrote:
> On Sun, Jul 16, 2023, 11:16 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> 
> > On Thu, Jul 13, 2023 at 01:58:37PM +0800, Mark-PK Tsai wrote:
> > > From: Peter Korsgaard <peter@korsgaard.com>
> > > 
> > > Just calling wait_for_device_probe() is not enough to ensure that
> > > asynchronously probed block devices are available (E.G. mmc, usb), so
> > > add a "dm-mod.waitfor=<device1>[,..,<deviceN>]" parameter to get
> > > dm-init to explicitly wait for specific block devices before
> > > initializing the tables with logic similar to the rootwait logic that
> > > was introduced with commit  cc1ed7542c8c ("init: wait for
> > > asynchronously scanned block devices").
> > > 
> > > E.G. with dm-verity on mmc using:
> > > dm-mod.waitfor="PARTLABEL=hash-a,PARTLABEL=root-a"
> > > 
> > > [    0.671671] device-mapper: init: waiting for all devices to be available before creating mapped devices
> > > [    0.671679] device-mapper: init: waiting for device PARTLABEL=hash-a ...
> > > [    0.710695] mmc0: new HS200 MMC card at address 0001
> > > [    0.711158] mmcblk0: mmc0:0001 004GA0 3.69 GiB
> > > [    0.715954] mmcblk0boot0: mmc0:0001 004GA0 partition 1 2.00 MiB
> > > [    0.722085] mmcblk0boot1: mmc0:0001 004GA0 partition 2 2.00 MiB
> > > [    0.728093] mmcblk0rpmb: mmc0:0001 004GA0 partition 3 512 KiB, chardev (249:0)
> > > [    0.738274]  mmcblk0: p1 p2 p3 p4 p5 p6 p7
> > > [    0.751282] device-mapper: init: waiting for device PARTLABEL=root-a ...
> > > [    0.751306] device-mapper: init: all devices available
> > > [    0.751683] device-mapper: verity: sha256 using implementation "sha256-generic"
> > > [    0.759344] device-mapper: ioctl: dm-0 (vroot) is ready
> > > [    0.766540] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.
> > > 
> > > Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
> > > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > ---
> > >  .../admin-guide/device-mapper/dm-init.rst     |  8 +++++++
> > >  drivers/md/dm-init.c                          | 22 ++++++++++++++++++-
> > >  2 files changed, 29 insertions(+), 1 deletion(-)
> >
> > What is the git commit id of this change in Linus's tree?
> >
> > thanks,
> >
> > greg k-h
> >
> >
> 
> Hey Greg,
> 
> This change shouldn't be backported to stable@. It is a feature; if
> Mark-PK feels they need it in older kernels, they need to carry the change
> in their own tree. Or, at a minimum, they need to explain why this
> change is warranted in stable@.

Ok, I'll drop it from my queue.

> But to answer your original question the upstream commit is:
> 
> 035641b01e72 dm init: add dm-mod.waitfor to wait for asynchronously probed block devices

Ah, showed up in 6.2, so we have to have a 6.1.y backport as well, I
can't take patches for only older kernels, sorry.

thanks,

greg k-h
  
Mark-PK Tsai (蔡沛剛) July 17, 2023, 1:57 a.m. UTC | #4
> On Sun, Jul 16, 2023, 11:16 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> 
> > On Thu, Jul 13, 2023 at 01:58:37PM +0800, Mark-PK Tsai wrote:
> > > From: Peter Korsgaard <peter@korsgaard.com>
> > > 
> > > Just calling wait_for_device_probe() is not enough to ensure that
> > > asynchronously probed block devices are available (E.G. mmc, usb), so
> > > add a "dm-mod.waitfor=<device1>[,..,<deviceN>]" parameter to get
> > > dm-init to explicitly wait for specific block devices before
> > > initializing the tables with logic similar to the rootwait logic that
> > > was introduced with commit  cc1ed7542c8c ("init: wait for
> > > asynchronously scanned block devices").
> > > 
> > > E.G. with dm-verity on mmc using:
> > > dm-mod.waitfor="PARTLABEL=hash-a,PARTLABEL=root-a"
> > > 
> > > [    0.671671] device-mapper: init: waiting for all devices to be available before creating mapped devices
> > > [    0.671679] device-mapper: init: waiting for device PARTLABEL=hash-a ...
> > > [    0.710695] mmc0: new HS200 MMC card at address 0001
> > > [    0.711158] mmcblk0: mmc0:0001 004GA0 3.69 GiB
> > > [    0.715954] mmcblk0boot0: mmc0:0001 004GA0 partition 1 2.00 MiB
> > > [    0.722085] mmcblk0boot1: mmc0:0001 004GA0 partition 2 2.00 MiB
> > > [    0.728093] mmcblk0rpmb: mmc0:0001 004GA0 partition 3 512 KiB, chardev (249:0)
> > > [    0.738274]  mmcblk0: p1 p2 p3 p4 p5 p6 p7
> > > [    0.751282] device-mapper: init: waiting for device PARTLABEL=root-a ...
> > > [    0.751306] device-mapper: init: all devices available
> > > [    0.751683] device-mapper: verity: sha256 using implementation "sha256-generic"
> > > [    0.759344] device-mapper: ioctl: dm-0 (vroot) is ready
> > > [    0.766540] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.
> > > 
> > > Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
> > > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > ---
> > >  .../admin-guide/device-mapper/dm-init.rst     |  8 +++++++
> > >  drivers/md/dm-init.c                          | 22 ++++++++++++++++++-
> > >  2 files changed, 29 insertions(+), 1 deletion(-)
> >
> > What is the git commit id of this change in Linus's tree?
> >
> > thanks,
> >
> > greg k-h
> >
> >
> 
> Hey Greg,
> 
> This change shouldn't be backported to stable@. It is a feature; if
> Mark-PK feels they need it in older kernels, they need to carry the change
> in their own tree. Or, at a minimum, they need to explain why this
> change is warranted in stable@.

Thanks for your comment.
The reason we think this should be backported to the stable kernel is
that it actually fixes a potential race condition when block device
probing is asynchronous in the stable kernel.
We'd also like to fix this upstream rather than just carry it in
our custom tree.

> 
> But to answer your original question the upstream commit is:
> 
> 035641b01e72 dm init: add dm-mod.waitfor to wait for asynchronously probed block devices
> 
> Thanks,
> Mike
  
Greg KH July 20, 2023, 5:57 p.m. UTC | #5
On Mon, Jul 17, 2023 at 09:57:28AM +0800, Mark-PK Tsai wrote:
> > On Sun, Jul 16, 2023, 11:16 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > 
> > > On Thu, Jul 13, 2023 at 01:58:37PM +0800, Mark-PK Tsai wrote:
> > > > From: Peter Korsgaard <peter@korsgaard.com>
> > > > 
> > > > Just calling wait_for_device_probe() is not enough to ensure that
> > > > asynchronously probed block devices are available (E.G. mmc, usb), so
> > > > add a "dm-mod.waitfor=<device1>[,..,<deviceN>]" parameter to get
> > > > dm-init to explicitly wait for specific block devices before
> > > > initializing the tables with logic similar to the rootwait logic that
> > > > was introduced with commit  cc1ed7542c8c ("init: wait for
> > > > asynchronously scanned block devices").
> > > > 
> > > > E.G. with dm-verity on mmc using:
> > > > dm-mod.waitfor="PARTLABEL=hash-a,PARTLABEL=root-a"
> > > > 
> > > > [    0.671671] device-mapper: init: waiting for all devices to be available before creating mapped devices
> > > > [    0.671679] device-mapper: init: waiting for device PARTLABEL=hash-a ...
> > > > [    0.710695] mmc0: new HS200 MMC card at address 0001
> > > > [    0.711158] mmcblk0: mmc0:0001 004GA0 3.69 GiB
> > > > [    0.715954] mmcblk0boot0: mmc0:0001 004GA0 partition 1 2.00 MiB
> > > > [    0.722085] mmcblk0boot1: mmc0:0001 004GA0 partition 2 2.00 MiB
> > > > [    0.728093] mmcblk0rpmb: mmc0:0001 004GA0 partition 3 512 KiB, chardev (249:0)
> > > > [    0.738274]  mmcblk0: p1 p2 p3 p4 p5 p6 p7
> > > > [    0.751282] device-mapper: init: waiting for device PARTLABEL=root-a ...
> > > > [    0.751306] device-mapper: init: all devices available
> > > > [    0.751683] device-mapper: verity: sha256 using implementation "sha256-generic"
> > > > [    0.759344] device-mapper: ioctl: dm-0 (vroot) is ready
> > > > [    0.766540] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.
> > > > 
> > > > Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
> > > > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > > ---
> > > >  .../admin-guide/device-mapper/dm-init.rst     |  8 +++++++
> > > >  drivers/md/dm-init.c                          | 22 ++++++++++++++++++-
> > > >  2 files changed, 29 insertions(+), 1 deletion(-)
> > >
> > > What is the git commit id of this change in Linus's tree?
> > >
> > > thanks,
> > >
> > > greg k-h
> > >
> > >
> > 
> > Hey Greg,
> > 
> > This change shouldn't be backported to stable@. It is a feature; if
> > Mark-PK feels they need it in older kernels, they need to carry the change
> > in their own tree. Or, at a minimum, they need to explain why this
> > change is warranted in stable@.
> 
> Thanks for your comment.
> The reason we think this should be backported to the stable kernel is
> that it actually fixes a potential race condition when block device
> probing is asynchronous in the stable kernel.
> We'd also like to fix this upstream rather than just carry it in
> our custom tree.

Potential race condition, is this actually able to be hit in real life?

thanks,

greg k-h
  
Mark-PK Tsai (蔡沛剛) July 21, 2023, 6:38 a.m. UTC | #6
> On Mon, Jul 17, 2023 at 09:57:28AM +0800, Mark-PK Tsai wrote:
> > > On Sun, Jul 16, 2023, 11:16 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > 
> > > > On Thu, Jul 13, 2023 at 01:58:37PM +0800, Mark-PK Tsai wrote:
> > > > > From: Peter Korsgaard <peter@korsgaard.com>
> > > > > 
> > > > > Just calling wait_for_device_probe() is not enough to ensure that
> > > > > asynchronously probed block devices are available (E.G. mmc, usb), so
> > > > > add a "dm-mod.waitfor=<device1>[,..,<deviceN>]" parameter to get
> > > > > dm-init to explicitly wait for specific block devices before
> > > > > initializing the tables with logic similar to the rootwait logic that
> > > > > was introduced with commit  cc1ed7542c8c ("init: wait for
> > > > > asynchronously scanned block devices").
> > > > > 
> > > > > E.G. with dm-verity on mmc using:
> > > > > dm-mod.waitfor="PARTLABEL=hash-a,PARTLABEL=root-a"
> > > > > 
> > > > > [    0.671671] device-mapper: init: waiting for all devices to be available before creating mapped devices
> > > > > [    0.671679] device-mapper: init: waiting for device PARTLABEL=hash-a ...
> > > > > [    0.710695] mmc0: new HS200 MMC card at address 0001
> > > > > [    0.711158] mmcblk0: mmc0:0001 004GA0 3.69 GiB
> > > > > [    0.715954] mmcblk0boot0: mmc0:0001 004GA0 partition 1 2.00 MiB
> > > > > [    0.722085] mmcblk0boot1: mmc0:0001 004GA0 partition 2 2.00 MiB
> > > > > [    0.728093] mmcblk0rpmb: mmc0:0001 004GA0 partition 3 512 KiB, chardev (249:0)
> > > > > [    0.738274]  mmcblk0: p1 p2 p3 p4 p5 p6 p7
> > > > > [    0.751282] device-mapper: init: waiting for device PARTLABEL=root-a ...
> > > > > [    0.751306] device-mapper: init: all devices available
> > > > > [    0.751683] device-mapper: verity: sha256 using implementation "sha256-generic"
> > > > > [    0.759344] device-mapper: ioctl: dm-0 (vroot) is ready
> > > > > [    0.766540] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.
> > > > > 
> > > > > Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
> > > > > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > > > > Cc: stable@vger.kernel.org
> > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > > > ---
> > > > >  .../admin-guide/device-mapper/dm-init.rst     |  8 +++++++
> > > > >  drivers/md/dm-init.c                          | 22 ++++++++++++++++++-
> > > > >  2 files changed, 29 insertions(+), 1 deletion(-)
> > > >
> > > > What is the git commit id of this change in Linus's tree?
> > > >
> > > > thanks,
> > > >
> > > > greg k-h
> > > >
> > > >
> > > 
> > > Hey Greg,
> > > 
> > > This change shouldn't be backported to stable@. It is a feature; if
> > > Mark-PK feels they need it in older kernels, they need to carry the change
> > > in their own tree. Or, at a minimum, they need to explain why this
> > > change is warranted in stable@.
> > 
> > Thanks for your comment.
> > The reason we think this should be backported to the stable kernel is
> > that it actually fixes a potential race condition when block device
> > probing is asynchronous in the stable kernel.
> > We'd also like to fix this upstream rather than just carry it in
> > our custom tree.
> 
> Potential race condition, is this actually able to be hit in real life?

Yes, it happened, and it can leave the kernel init process stuck in
the root wait loop.

Below is the log.
(I added a 20-second delay to the mtk_mci probe to reproduce it quickly.)

* Before applying this patch
[    0.368594][    T1] device-mapper: init: waiting for all devices to be available before creating mapped devices
[   21.673047][   T45] probe of 1c660000.mtk-mmc-fcie returned 0 after 21541020 usecs
[   21.673061][   T45] mtk_mci 1c660000.mtk-mmc-fcie: driver mtk_mci async attach completed: 0
[   21.680006][    T1] device-mapper: table: 254:0: verity: Data device lookup failed <--------------- start after mtk_mci probe done
[   21.680012][    T1] device-mapper: ioctl: error adding target to table <--------------------------- won't create /dev/dm-0
[   21.680067][   T67] mmc0: new HS400 Enhanced strobe MMC card at address 0001
[   21.680184][   T67] bus: 'mmc': __driver_probe_device: matched device mmc0:0001 with driver mmcblk
[   21.680192][   T67] bus: 'mmc': really_probe: probing driver mmcblk with device mmc0:0001
[   21.680500][   T67] mmcblk0: mmc0:0001 016G01 14.5 GiB 
[   21.683404][   T67]  mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 p20 p21 p22 p23 p24 p25 p26 p27 p28 p29
[   21.686152][   T67] mmcblk0boot0: mmc0:0001 016G01 8.00 MiB 
[   21.687166][   T67] mmcblk0boot1: mmc0:0001 016G01 8.00 MiB 
[   21.687955][   T67] mmcblk0rpmb: mmc0:0001 016G01 4.00 MiB, chardev (238:0)
[   21.687977][   T67] driver: 'mmcblk': driver_bound: bound to device 'mmc0:0001'
[   21.688004][   T67] bus: 'mmc': really_probe: bound device mmc0:0001 to driver mmcblk
[   21.688010][   T67] probe of mmc0:0001 returned 0 after 7819 usecs
[   21.688166][    T1] Waiting for root device /dev/dm-0...
[   41.023192][    T1] driver_probe_done: probe_count = 0
... can't exit from the root wait loop

* After applying this patch and adding dm-mod.waitfor="PARTLABEL=rootfs"
[    0.368417][    T1] device-mapper: init: waiting for all devices to be available before creating mapped devices
[   21.672749][   T45] probe of 1c660000.mtk-mmc-fcie returned 0 after 21540992 usecs
[   21.672767][   T45] mtk_mci 1c660000.mtk-mmc-fcie: driver mtk_mci async attach completed: 0
[   21.672774][    T1] device-mapper: init: waiting for device PARTLABEL=rootfs ...
[   21.672869][   T43] mtk_mci 1c660000.mtk-mmc-fcie: eMMC: HS400 5.1 200MHz
[   21.679743][   T43] mmc0: new HS400 Enhanced strobe MMC card at address 0001
[   21.679852][   T43] bus: 'mmc': __driver_probe_device: matched device mmc0:0001 with driver mmcblk
[   21.679858][   T43] bus: 'mmc': really_probe: probing driver mmcblk with device mmc0:0001
[   21.680204][   T43] mmcblk0: mmc0:0001 016G01 14.5 GiB 
[   21.682866][   T43]  mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 p20 p21 p22 p23 p24 p25 p26 p27 p28 p29
[   21.685579][   T43] mmcblk0boot0: mmc0:0001 016G01 8.00 MiB 
[   21.686631][   T43] mmcblk0boot1: mmc0:0001 016G01 8.00 MiB 
[   21.687533][   T43] mmcblk0rpmb: mmc0:0001 016G01 4.00 MiB, chardev (238:0)
[   21.687559][   T43] driver: 'mmcblk': driver_bound: bound to device 'mmc0:0001'
[   21.687585][   T43] bus: 'mmc': really_probe: bound device mmc0:0001 to driver mmcblk
[   21.687591][   T43] probe of mmc0:0001 returned 0 after 7732 usecs
[   21.687838][    T1] device-mapper: init: all devices available <---------------------------------- start after PARTLABEL=rootfs is ready
[   21.688155][    T1] device-mapper: verity: sha1 using implementation "sha1-generic"
[   21.688975][    T1] device-mapper: ioctl: dm-0 (dm-verity) is ready
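
The gap in the second log between "waiting for device PARTLABEL=rootfs ..." and "all devices available" comes down to a polling loop over the waitfor entries. The following is a minimal user-space sketch of that idea, not the kernel code. All names in it (lookup_fn, delay_fn, wait_for_all, fake_lookup, fake_delay) are made up for illustration; the lookup callback stands in for dm_get_dev_t() and the delay callback for msleep(5):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callbacks, not kernel API: lookup() returns nonzero once the
 * named device can be resolved; delay() is called between failed polls.
 */
typedef int (*lookup_fn)(const char *name);
typedef void (*delay_fn)(void);

/*
 * Poll each non-NULL entry in order until it resolves.
 * Returns the total number of failed polls across all entries.
 */
static unsigned long wait_for_all(const char *const names[], size_t n,
				  lookup_fn lookup, delay_fn delay)
{
	unsigned long failed_polls = 0;

	assert(lookup != NULL && delay != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!names[i])
			continue;	/* unused slot in the array */
		while (!lookup(names[i])) {
			failed_polls++;
			delay();
		}
	}
	return failed_polls;
}

/*
 * Demo stand-ins: the first three polls fail, as if the disk were still
 * being probed; after that every name resolves immediately.
 */
static int polls_until_ready = 3;

static int fake_lookup(const char *name)
{
	(void)name;
	if (polls_until_ready > 0) {
		polls_until_ready--;
		return 0;
	}
	return 1;
}

static void fake_delay(void)
{
	/* A real caller would sleep here; the demo returns at once. */
}
```

With these stand-ins, the first entry absorbs all the waiting and later entries resolve on their first poll, which matches the pattern in the commit message log where PARTLABEL=root-a becomes available right after PARTLABEL=hash-a.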

> 
> thanks,
> 
> greg k-h
  
Greg KH July 21, 2023, 7:02 a.m. UTC | #7
On Fri, Jul 21, 2023 at 02:38:45PM +0800, Mark-PK Tsai wrote:
> > On Mon, Jul 17, 2023 at 09:57:28AM +0800, Mark-PK Tsai wrote:
> > > > On Sun, Jul 16, 2023, 11:16 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > 
> > > > > On Thu, Jul 13, 2023 at 01:58:37PM +0800, Mark-PK Tsai wrote:
> > > > > > From: Peter Korsgaard <peter@korsgaard.com>
> > > > > > 
> > > > > > Just calling wait_for_device_probe() is not enough to ensure that
> > > > > > asynchronously probed block devices are available (E.G. mmc, usb), so
> > > > > > add a "dm-mod.waitfor=<device1>[,..,<deviceN>]" parameter to get
> > > > > > dm-init to explicitly wait for specific block devices before
> > > > > > initializing the tables with logic similar to the rootwait logic that
> > > > > > was introduced with commit  cc1ed7542c8c ("init: wait for
> > > > > > asynchronously scanned block devices").
> > > > > > 
> > > > > > E.G. with dm-verity on mmc using:
> > > > > > dm-mod.waitfor="PARTLABEL=hash-a,PARTLABEL=root-a"
> > > > > > 
> > > > > > [    0.671671] device-mapper: init: waiting for all devices to be available before creating mapped devices
> > > > > > [    0.671679] device-mapper: init: waiting for device PARTLABEL=hash-a ...
> > > > > > [    0.710695] mmc0: new HS200 MMC card at address 0001
> > > > > > [    0.711158] mmcblk0: mmc0:0001 004GA0 3.69 GiB
> > > > > > [    0.715954] mmcblk0boot0: mmc0:0001 004GA0 partition 1 2.00 MiB
> > > > > > [    0.722085] mmcblk0boot1: mmc0:0001 004GA0 partition 2 2.00 MiB
> > > > > > [    0.728093] mmcblk0rpmb: mmc0:0001 004GA0 partition 3 512 KiB, chardev (249:0)
> > > > > > [    0.738274]  mmcblk0: p1 p2 p3 p4 p5 p6 p7
> > > > > > [    0.751282] device-mapper: init: waiting for device PARTLABEL=root-a ...
> > > > > > [    0.751306] device-mapper: init: all devices available
> > > > > > [    0.751683] device-mapper: verity: sha256 using implementation "sha256-generic"
> > > > > > [    0.759344] device-mapper: ioctl: dm-0 (vroot) is ready
> > > > > > [    0.766540] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.
> > > > > > 
> > > > > > Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
> > > > > > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > > > > > Cc: stable@vger.kernel.org
> > > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > > > > ---
> > > > > >  .../admin-guide/device-mapper/dm-init.rst     |  8 +++++++
> > > > > >  drivers/md/dm-init.c                          | 22 ++++++++++++++++++-
> > > > > >  2 files changed, 29 insertions(+), 1 deletion(-)
> > > > >
> > > > > What is the git commit id of this change in Linus's tree?
> > > > >
> > > > > thanks,
> > > > >
> > > > > greg k-h
> > > > >
> > > > >
> > > > 
> > > > Hey Greg,
> > > > 
> > > > This change shouldn't be backported to stable@. It is a feature; if
> > > > Mark-PK feels they need it in older kernels, they need to carry the change
> > > > in their own tree. Or, at a minimum, they need to explain why this
> > > > change is warranted in stable@.
> > > 
> > > Thanks for your comment.
> > > The reason we think this should be backported to the stable kernel is
> > > that it actually fixes a potential race condition when block device
> > > probing is asynchronous in the stable kernel.
> > > We'd also like to fix this upstream rather than just carry it in
> > > our custom tree.
> > 
> > Potential race condition, is this actually able to be hit in real life?
> 
> Yes, it happened, and it can leave the kernel init process stuck in
> the root wait loop.
> 
> Below is the log.
> (I added a 20-second delay to the mtk_mci probe to reproduce it quickly.)
> 
> * Before applying this patch
> [    0.368594][    T1] device-mapper: init: waiting for all devices to be available before creating mapped devices
> [   21.673047][   T45] probe of 1c660000.mtk-mmc-fcie returned 0 after 21541020 usecs
> [   21.673061][   T45] mtk_mci 1c660000.mtk-mmc-fcie: driver mtk_mci async attach completed: 0
> [   21.680006][    T1] device-mapper: table: 254:0: verity: Data device lookup failed <--------------- start after mtk_mci probe done
> [   21.680012][    T1] device-mapper: ioctl: error adding target to table <--------------------------- won't create /dev/dm-0
> [   21.680067][   T67] mmc0: new HS400 Enhanced strobe MMC card at address 0001
> [   21.680184][   T67] bus: 'mmc': __driver_probe_device: matched device mmc0:0001 with driver mmcblk
> [   21.680192][   T67] bus: 'mmc': really_probe: probing driver mmcblk with device mmc0:0001
> [   21.680500][   T67] mmcblk0: mmc0:0001 016G01 14.5 GiB 
> [   21.683404][   T67]  mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 p20 p21 p22 p23 p24 p25 p26 p27 p28 p29
> [   21.686152][   T67] mmcblk0boot0: mmc0:0001 016G01 8.00 MiB 
> [   21.687166][   T67] mmcblk0boot1: mmc0:0001 016G01 8.00 MiB 
> [   21.687955][   T67] mmcblk0rpmb: mmc0:0001 016G01 4.00 MiB, chardev (238:0)
> [   21.687977][   T67] driver: 'mmcblk': driver_bound: bound to device 'mmc0:0001'
> [   21.688004][   T67] bus: 'mmc': really_probe: bound device mmc0:0001 to driver mmcblk
> [   21.688010][   T67] probe of mmc0:0001 returned 0 after 7819 usecs
> [   21.688166][    T1] Waiting for root device /dev/dm-0...
> [   41.023192][    T1] driver_probe_done: probe_count = 0
> ... can't exit from the root wait loop
> 
> * After applying this patch and adding dm-mod.waitfor="PARTLABEL=rootfs"
> [    0.368417][    T1] device-mapper: init: waiting for all devices to be available before creating mapped devices
> [   21.672749][   T45] probe of 1c660000.mtk-mmc-fcie returned 0 after 21540992 usecs
> [   21.672767][   T45] mtk_mci 1c660000.mtk-mmc-fcie: driver mtk_mci async attach completed: 0
> [   21.672774][    T1] device-mapper: init: waiting for device PARTLABEL=rootfs ...
> [   21.672869][   T43] mtk_mci 1c660000.mtk-mmc-fcie: eMMC: HS400 5.1 200MHz
> [   21.679743][   T43] mmc0: new HS400 Enhanced strobe MMC card at address 0001
> [   21.679852][   T43] bus: 'mmc': __driver_probe_device: matched device mmc0:0001 with driver mmcblk
> [   21.679858][   T43] bus: 'mmc': really_probe: probing driver mmcblk with device mmc0:0001
> [   21.680204][   T43] mmcblk0: mmc0:0001 016G01 14.5 GiB 
> [   21.682866][   T43]  mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 p20 p21 p22 p23 p24 p25 p26 p27 p28 p29
> [   21.685579][   T43] mmcblk0boot0: mmc0:0001 016G01 8.00 MiB 
> [   21.686631][   T43] mmcblk0boot1: mmc0:0001 016G01 8.00 MiB 
> [   21.687533][   T43] mmcblk0rpmb: mmc0:0001 016G01 4.00 MiB, chardev (238:0)
> [   21.687559][   T43] driver: 'mmcblk': driver_bound: bound to device 'mmc0:0001'
> [   21.687585][   T43] bus: 'mmc': really_probe: bound device mmc0:0001 to driver mmcblk
> [   21.687591][   T43] probe of mmc0:0001 returned 0 after 7732 usecs
> [   21.687838][    T1] device-mapper: init: all devices available <---------------------------------- start after PARTLABEL=rootfs is ready
> [   21.688155][    T1] device-mapper: verity: sha1 using implementation "sha1-generic"
> [   21.688975][    T1] device-mapper: ioctl: dm-0 (dm-verity) is ready
> 

Ok, seems sane, I'll queue this up for 6.1.y and 5.15.y now.

thanks,

greg k-h
  

Patch

diff --git a/Documentation/admin-guide/device-mapper/dm-init.rst b/Documentation/admin-guide/device-mapper/dm-init.rst
index e5242ff17e9b..981d6a907699 100644
--- a/Documentation/admin-guide/device-mapper/dm-init.rst
+++ b/Documentation/admin-guide/device-mapper/dm-init.rst
@@ -123,3 +123,11 @@  Other examples (per target):
     0 1638400 verity 1 8:1 8:2 4096 4096 204800 1 sha256
     fb1a5a0f00deb908d8b53cb270858975e76cf64105d412ce764225d53b8f3cfd
     51934789604d1b92399c52e7cb149d1b3a1b74bbbcb103b2a0aaacbed5c08584
+
+For setups using device-mapper on top of asynchronously probed block
+devices (MMC, USB, ..), it may be necessary to tell dm-init to
+explicitly wait for them to become available before setting up the
+device-mapper tables. This can be done with the "dm-mod.waitfor="
+module parameter, which takes a list of devices to wait for::
+
+  dm-mod.waitfor=<device1>[,..,<deviceN>]
diff --git a/drivers/md/dm-init.c b/drivers/md/dm-init.c
index b0c45c6ebe0b..dc4381d68313 100644
--- a/drivers/md/dm-init.c
+++ b/drivers/md/dm-init.c
@@ -8,6 +8,7 @@ 
  */
 
 #include <linux/ctype.h>
+#include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/device-mapper.h>
 #include <linux/init.h>
@@ -18,12 +19,17 @@ 
 #define DM_MAX_DEVICES 256
 #define DM_MAX_TARGETS 256
 #define DM_MAX_STR_SIZE 4096
+#define DM_MAX_WAITFOR 256
 
 static char *create;
 
+static char *waitfor[DM_MAX_WAITFOR];
+
 /*
  * Format: dm-mod.create=<name>,<uuid>,<minor>,<flags>,<table>[,<table>+][;<name>,<uuid>,<minor>,<flags>,<table>[,<table>+]+]
  * Table format: <start_sector> <num_sectors> <target_type> <target_args>
+ * Block devices to wait for to become available before setting up tables:
+ * dm-mod.waitfor=<device1>[,..,<deviceN>]
  *
  * See Documentation/admin-guide/device-mapper/dm-init.rst for dm-mod.create="..." format
  * details.
@@ -266,7 +272,7 @@  static int __init dm_init_init(void)
 	struct dm_device *dev;
 	LIST_HEAD(devices);
 	char *str;
-	int r;
+	int i, r;
 
 	if (!create)
 		return 0;
@@ -286,6 +292,17 @@  static int __init dm_init_init(void)
 	DMINFO("waiting for all devices to be available before creating mapped devices");
 	wait_for_device_probe();
 
+	for (i = 0; i < ARRAY_SIZE(waitfor); i++) {
+		if (waitfor[i]) {
+			DMINFO("waiting for device %s ...", waitfor[i]);
+			while (!dm_get_dev_t(waitfor[i]))
+				msleep(5);
+		}
+	}
+
+	if (waitfor[0])
+		DMINFO("all devices available");
+
 	list_for_each_entry(dev, &devices, list) {
 		if (dm_early_create(&dev->dmi, dev->table,
 				    dev->target_args_array))
@@ -301,3 +318,6 @@  late_initcall(dm_init_init);
 
 module_param(create, charp, 0);
 MODULE_PARM_DESC(create, "Create a mapped device in early boot");
+
+module_param_array(waitfor, charp, NULL, 0);
+MODULE_PARM_DESC(waitfor, "Devices to wait for before setting up tables");