soundwire: bus_type: Avoid lockdep assert in sdw_drv_probe()

Message ID 20221121162453.1834170-1-rf@opensource.cirrus.com
State New
Headers
Series soundwire: bus_type: Avoid lockdep assert in sdw_drv_probe() |

Commit Message

Richard Fitzgerald Nov. 21, 2022, 4:24 p.m. UTC
  Don't hold sdw_dev_lock while calling the peripheral driver
probe() and remove() callbacks.

Holding sdw_dev_lock around the probe() and remove() calls
causes a theoretical mutex inversion which lockdep will
assert on. The peripheral driver probe will probably register
a soundcard, which will take ALSA and ASoC locks. During
normal operation a runtime resume suspend can be triggered
while these locks are held and will then take sdw_dev_lock.

It's not necessary to hold sdw_dev_lock when calling the
probe() and remove(), it is only used to prevent the bus core
calling the driver callbacks if there isn't a driver or the
driver is removing.

If sdw_dev_lock is held while setting and clearing the
'probed' flag this is sufficient to guarantee the safety of
callback functions.

The potential race of a bus event happening while probe() is
executing is the same as the existing race of the bus event
handler taking the mutex first and processing the event
before probe() can run. In both cases the event has already
happened before the driver is probed and ready to accept
callbacks.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
---
 drivers/soundwire/bus_type.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)
  

Comments

Charles Keepax Nov. 23, 2022, 9:35 a.m. UTC | #1
On Mon, Nov 21, 2022 at 04:24:52PM +0000, Richard Fitzgerald wrote:
> Don't hold sdw_dev_lock while calling the peripheral driver
> probe() and remove() callbacks.
> 
> Holding sdw_dev_lock around the probe() and remove() calls
> causes a theoretical mutex inversion which lockdep will
> assert on. The peripheral driver probe will probably register
> a soundcard, which will take ALSA and ASoC locks. During
> normal operation a runtime resume suspend can be triggered
> while these locks are held and will then take sdw_dev_lock.
> 
> It's not necessary to hold sdw_dev_lock when calling the
> probe() and remove(), it is only used to prevent the bus core
> calling the driver callbacks if there isn't a driver or the
> driver is removing.
> 
> If sdw_dev_lock is held while setting and clearing the
> 'probed' flag this is sufficient to guarantee the safety of
> callback functions.
> 
> The potential race of a bus event happening while probe() is
> executing is the same as the existing race of the bus event
> handler taking the mutex first and processing the event
> before probe() can run. In both cases the event has already
> happened before the driver is probed and ready to accept
> callbacks.
> 
> Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
> ---

Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>

Thanks,
Charles
  
Pierre-Louis Bossart Nov. 28, 2022, 5:18 p.m. UTC | #2
On 11/21/22 10:24, Richard Fitzgerald wrote:
> Don't hold sdw_dev_lock while calling the peripheral driver
> probe() and remove() callbacks.
> 
> Holding sdw_dev_lock around the probe() and remove() calls
> causes a theoretical mutex inversion which lockdep will
> assert on. The peripheral driver probe will probably register
> a soundcard, which will take ALSA and ASoC locks. During

It's extremely unlikely that a peripheral driver would register a sound
card, this is what machine drivers do.

Which leads me to the question: is this a real problem?

Or did you mean 'register components', and if yes what would the problem
with lockdep be?

> normal operation a runtime resume suspend can be triggered
> while these locks are held and will then take sdw_dev_lock.
> 
> It's not necessary to hold sdw_dev_lock when calling the
> probe() and remove(), it is only used to prevent the bus core
> calling the driver callbacks if there isn't a driver or the
> driver is removing.


> If sdw_dev_lock is held while setting and clearing the
> 'probed' flag this is sufficient to guarantee the safety of
> callback functions.

not really, the 'probed' flag was kept for convenience. what this lock
really protects is the dereferencing of ops after the driver .remove
happens.

> The potential race of a bus event happening while probe() is
> executing is the same as the existing race of the bus event
> handler taking the mutex first and processing the event
> before probe() can run. In both cases the event has already
> happened before the driver is probed and ready to accept
> callbacks.

Sorry, I wasn't able to parse the first sentence in this paragraph. what
'existing race' are you referring to?

> 
> Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
> ---
>  drivers/soundwire/bus_type.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/soundwire/bus_type.c b/drivers/soundwire/bus_type.c
> index 04b3529f8929..963498db0fd2 100644
> --- a/drivers/soundwire/bus_type.c
> +++ b/drivers/soundwire/bus_type.c
> @@ -105,20 +105,19 @@ static int sdw_drv_probe(struct device *dev)
>  	if (ret)
>  		return ret;
>  
> -	mutex_lock(&slave->sdw_dev_lock);
> -
>  	ret = drv->probe(slave, id);
>  	if (ret) {
>  		name = drv->name;
>  		if (!name)
>  			name = drv->driver.name;
> -		mutex_unlock(&slave->sdw_dev_lock);
>  
>  		dev_err(dev, "Probe of %s failed: %d\n", name, ret);
>  		dev_pm_domain_detach(dev, false);
>  		return ret;
>  	}
>  
> +	mutex_lock(&slave->sdw_dev_lock);
> +
>  	/* device is probed so let's read the properties now */
>  	if (drv->ops && drv->ops->read_prop)
>  		drv->ops->read_prop(slave);
> @@ -167,14 +166,12 @@ static int sdw_drv_remove(struct device *dev)
>  	int ret = 0;
>  
>  	mutex_lock(&slave->sdw_dev_lock);
> -
>  	slave->probed = false;
> +	mutex_unlock(&slave->sdw_dev_lock);
>  
>  	if (drv->remove)
>  		ret = drv->remove(slave);
>  
> -	mutex_unlock(&slave->sdw_dev_lock);
> -
>  	dev_pm_domain_detach(dev, false);
>  
>  	return ret;
  
Richard Fitzgerald Nov. 29, 2022, 10:47 a.m. UTC | #3
On 28/11/2022 17:18, Pierre-Louis Bossart wrote:
> 
> 
> On 11/21/22 10:24, Richard Fitzgerald wrote:
>> Don't hold sdw_dev_lock while calling the peripheral driver
>> probe() and remove() callbacks.
>>
>> Holding sdw_dev_lock around the probe() and remove() calls
>> causes a theoretical mutex inversion which lockdep will
>> assert on. The peripheral driver probe will probably register
>> a soundcard, which will take ALSA and ASoC locks. During
> 
> It's extremely unlikely that a peripheral driver would register a sound
> card, this is what machine drivers do.
> 
> Which leads me to the question: is this a real problem?
> 

Yes, try turning on lockdep checking and you will get an assert.
During probe the existing code takes sdw_dev_lock and then calls the
codec driver probe, so you will get a mutex sequence like:

sdw_dev_lock -> controls_rw_sem -> pcm_mutex

but in normal operation the ALSA/ASoC code will take its mutexes first
and call runtime_resume which then takes the sdw_dev_lock, so you get

pcm_mutex -> sdw_dev_lock

and lockdep will assert on that opposite ordering.
The full assert is at the end of this email.

> Or did you mean 'register components', and if yes what would the problem
> with lockdep be?
> 

Argh. Yes, I meant snd_soc_register_component(). I'll re-send this with
the comment fixed.

>> normal operation a runtime resume suspend can be triggered
>> while these locks are held and will then take sdw_dev_lock.
>>
>> It's not necessary to hold sdw_dev_lock when calling the
>> probe() and remove(), it is only used to prevent the bus core
>> calling the driver callbacks if there isn't a driver or the
>> driver is removing.
> 
> 
>> If sdw_dev_lock is held while setting and clearing the
>> 'probed' flag this is sufficient to guarantee the safety of
>> callback functions.
> 
> not really, the 'probed' flag was kept for convenience. what this lock
> really protects is the dereferencing of ops after the driver .remove
> happens.
> 

Yes, I thought about removing it but that's a larger code change that's
not necessary for the fix. The point is that we don't need to hold the
mutex around the remove call, only around clearing 'probed' (or, if
probed was removed, NULLing the callback pointers)

>> The potential race of a bus event happening while probe() is
>> executing is the same as the existing race of the bus event
>> handler taking the mutex first and processing the event
>> before probe() can run. In both cases the event has already
>> happened before the driver is probed and ready to accept
>> callbacks.
> 
> Sorry, I wasn't able to parse the first sentence in this paragraph. what
> 'existing race' are you referring to?
> 

I will re-write that. The point was that without the lock the probe()
can run in parallel with a status update. But that is already a race
with the existing code because the status update could take the
sdw_dev_lock first. So I think that this change isn't introducing any
new races.

The lockdep assert looks like this:
[   46.098514] ======================================================
[   46.104736] WARNING: possible circular locking dependency detected
[   46.110961] 6.1.0-rc4-jamerson #1 Tainted: G            E
[   46.116842] ------------------------------------------------------
[   46.123063] mpg123/1130 is trying to acquire lock:
[   46.127883] ffff8b445031fb80 (&slave->sdw_dev_lock){+.+.}-{3:3}, at: 
sdw_update_slave_status+0x26/0x70
[   46.137225]
                but task is already holding lock:
[   46.143074] ffffffffc1455310 (&card->pcm_mutex){+.+.}-{3:3}, at: 
dpcm_fe_dai_open+0x49/0x830
[   46.151536]
                which lock already depends on the new lock.[   46.159732]
                the existing dependency chain (in reverse order) is:
[   46.167231]
                -> #4 (&card->pcm_mutex){+.+.}-{3:3}:
[   46.173428]        __mutex_lock+0x94/0x920
[   46.177542]        snd_soc_dpcm_runtime_update+0x2e/0x100
[   46.182958]        snd_soc_dapm_put_enum_double+0x1c2/0x200
[   46.188548]        snd_ctl_elem_write+0x10c/0x1d0
[   46.193268]        snd_ctl_ioctl+0x126/0x850
[   46.197556]        __x64_sys_ioctl+0x87/0xc0
[   46.201845]        do_syscall_64+0x38/0x90
[   46.205959]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   46.211553]
                -> #3 (&card->controls_rwsem){++++}-{3:3}:
[   46.218188]        down_write+0x2b/0xd0
[   46.222038]        snd_ctl_add_replace+0x39/0xb0
[   46.226672]        snd_soc_add_controls+0x53/0x80
[   46.231393]        soc_probe_component+0x1e4/0x2a0
[   46.236202]        snd_soc_bind_card+0x51a/0xc80
[   46.240836]        devm_snd_soc_register_card+0x43/0x90
[   46.246079]        mc_probe+0x982/0xfe0 [snd_soc_sof_sdw]
[   46.251500]        platform_probe+0x3c/0xa0
[   46.255700]        really_probe+0xde/0x390
[   46.259814]        __driver_probe_device+0x78/0x180
[   46.264710]        driver_probe_device+0x1e/0x90
[   46.269347]        __driver_attach+0x9f/0x1f0
[   46.273721]        bus_for_each_dev+0x78/0xc0
[   46.278098]        bus_add_driver+0x1ac/0x200
[   46.282473]        driver_register+0x8f/0xf0
[   46.286759]        do_one_initcall+0x58/0x310
[   46.291136]        do_init_module+0x4c/0x1f0
[   46.295422]        __do_sys_finit_module+0xb4/0x130
[   46.300321]        do_syscall_64+0x38/0x90
[   46.304434]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   46.310027]
                -> #2 (&card->mutex){+.+.}-{3:3}:
[   46.315883]        __mutex_lock+0x94/0x920
[   46.320000]        snd_soc_bind_card+0x3e/0xc80
[   46.324551]        devm_snd_soc_register_card+0x43/0x90
[   46.329798]        mc_probe+0x982/0xfe0 [snd_soc_sof_sdw]
[   46.335219]        platform_probe+0x3c/0xa0
[   46.339420]        really_probe+0xde/0x390
[   46.343532]        __driver_probe_device+0x78/0x180
[   46.348430]        driver_probe_device+0x1e/0x90
[   46.353065]        __driver_attach+0x9f/0x1f0
[   46.357437]        bus_for_each_dev+0x78/0xc0
[   46.361812]        bus_add_driver+0x1ac/0x200
[   46.366716]        driver_register+0x8f/0xf0
[   46.371528]        do_one_initcall+0x58/0x310
[   46.376424]        do_init_module+0x4c/0x1f0
[   46.381239]        __do_sys_finit_module+0xb4/0x130
[   46.386665]        do_syscall_64+0x38/0x90
[   46.391299]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   46.397416]
                -> #1 (client_mutex){+.+.}-{3:3}:
[   46.404307]        __mutex_lock+0x94/0x920
[   46.408941]        snd_soc_add_component+0x24/0x2c0
[   46.414345]        devm_snd_soc_register_component+0x54/0xa0
[   46.420522]        cs35l56_common_probe+0x280/0x370 [snd_soc_cs35l56]
[   46.427487]        cs35l56_sdw_probe+0xf4/0x170 [snd_soc_cs35l56_sdw]
[   46.434442]        sdw_drv_probe+0x80/0x1a0
[   46.439136]        really_probe+0xde/0x390
[   46.443738]        __driver_probe_device+0x78/0x180
[   46.449120]        driver_probe_device+0x1e/0x90
[   46.454247]        __driver_attach+0x9f/0x1f0
[   46.459106]        bus_for_each_dev+0x78/0xc0
[   46.463971]        bus_add_driver+0x1ac/0x200
[   46.468825]        driver_register+0x8f/0xf0
[   46.473592]        do_one_initcall+0x58/0x310
[   46.478441]        do_init_module+0x4c/0x1f0
[   46.483202]        __do_sys_finit_module+0xb4/0x130
[   46.488572]        do_syscall_64+0x38/0x90
[   46.493158]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   46.499229]
                -> #0 (&slave->sdw_dev_lock){+.+.}-{3:3}:
[   46.506737]        __lock_acquire+0x1121/0x1df0
[   46.511765]        lock_acquire+0xd5/0x300
[   46.516360]        __mutex_lock+0x94/0x920
[   46.520949]        sdw_update_slave_status+0x26/0x70
[   46.526409]        sdw_clear_slave_status+0xd8/0xe0
[   46.531783]        intel_resume_runtime+0x139/0x2a0
[   46.537155]        __rpm_callback+0x41/0x120
[   46.541919]        rpm_callback+0x5d/0x70
[   46.546422]        rpm_resume+0x531/0x7e0
[   46.550920]        __pm_runtime_resume+0x4a/0x80
[   46.556024]        snd_soc_pcm_component_pm_runtime_get+0x2f/0xc0
[   46.562611]        __soc_pcm_open+0x62/0x520
[   46.567375]        dpcm_be_dai_startup+0x116/0x210
[   46.572661]        dpcm_fe_dai_open+0xf7/0x830
[   46.577597]        snd_pcm_open_substream+0x54a/0x8b0
[   46.583145]        snd_pcm_open.part.0+0xdc/0x200
[   46.588341]        snd_pcm_playback_open+0x51/0x80
[   46.593625]        chrdev_open+0xc0/0x250
[   46.598129]        do_dentry_open+0x15f/0x430
[   46.602981]        path_openat+0x75e/0xa80
[   46.607575]        do_filp_open+0xb2/0x160
[   46.612162]        do_sys_openat2+0x9a/0x160
[   46.616922]        __x64_sys_openat+0x53/0xa0
[   46.621767]        do_syscall_64+0x38/0x90
[   46.626352]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   46.632414]
                other info that might help us debug this:[   46.641862] 
Chain exists of:
                  &slave->sdw_dev_lock --> &card->controls_rwsem --> 
&card->pcm_mutex[   46.655145]  Possible unsafe locking scenario:[   
46.662048]        CPU0                    CPU1
[   46.667080]        ----                    ----
[   46.672108]   lock(&card->pcm_mutex);
[   46.676267]                                lock(&card->controls_rwsem);
[   46.683382]                                lock(&card->pcm_mutex);
[   46.690063]   lock(&slave->sdw_dev_lock);
[   46.694574]
                 *** DEADLOCK ***[   46.701942] 2 locks held by mpg123/1130:
[   46.706356]  #0: ffff8b4457b22b90 (&pcm->open_mutex){+.+.}-{3:3}, at: 
snd_pcm_open.part.0+0xc9/0x200
[   46.715999]  #1: ffffffffc1455310 (&card->pcm_mutex){+.+.}-{3:3}, at: 
dpcm_fe_dai_open+0x49/0x830
[   46.725390]
                stack backtrace:
[   46.730752] CPU: 0 PID: 1130 Comm: mpg123 Tainted: G            E     
  6.1.0-rc4-jamerson #1
[   46.739703] Hardware name: AAEON UP-WHL01/UP-WHL01, BIOS UPW1AM19 
11/10/2020
[   46.747270] Call Trace:
[   46.750239]  <TASK>
[   46.752857]  dump_stack_lvl+0x56/0x73
[   46.757045]  check_noncircular+0x102/0x120
[   46.761664]  __lock_acquire+0x1121/0x1df0
[   46.766197]  lock_acquire+0xd5/0x300
[   46.770292]  ? sdw_update_slave_status+0x26/0x70
[   46.775432]  ? lock_is_held_type+0xe2/0x140
[   46.780143]  __mutex_lock+0x94/0x920
[   46.784241]  ? sdw_update_slave_status+0x26/0x70
[   46.789387]  ? find_held_lock+0x2b/0x80
[   46.793750]  ? sdw_update_slave_status+0x26/0x70
[   46.798894]  ? lock_release+0x147/0x2f0
[   46.803262]  ? lockdep_init_map_type+0x47/0x250
[   46.808315]  ? sdw_update_slave_status+0x26/0x70
[   46.813456]  sdw_update_slave_status+0x26/0x70
[   46.818422]  sdw_clear_slave_status+0xd8/0xe0
[   46.823302]  ? pm_generic_runtime_suspend+0x30/0x30
[   46.828706]  intel_resume_runtime+0x139/0x2a0
[   46.833583]  ? _raw_spin_unlock_irq+0x24/0x50
[   46.838462]  ? pm_generic_runtime_suspend+0x30/0x30
[   46.843866]  __rpm_callback+0x41/0x120
[   46.848142]  ? pm_generic_runtime_suspend+0x30/0x30
[   46.853550]  rpm_callback+0x5d/0x70
[   46.857568]  rpm_resume+0x531/0x7e0
[   46.861578]  ? _raw_spin_lock_irqsave+0x62/0x70
[   46.866634]  __pm_runtime_resume+0x4a/0x80
[   46.871258]  snd_soc_pcm_component_pm_runtime_get+0x2f/0xc0
[   46.877358]  __soc_pcm_open+0x62/0x520
[   46.881634]  ? dpcm_add_paths.isra.0+0x35d/0x4c0
[   46.886784]  dpcm_be_dai_startup+0x116/0x210
[   46.891592]  dpcm_fe_dai_open+0xf7/0x830
[   46.896046]  ? debug_mutex_init+0x33/0x50
[   46.900591]  snd_pcm_open_substream+0x54a/0x8b0
[   46.905658]  snd_pcm_open.part.0+0xdc/0x200
[   46.910376]  ? wake_up_q+0x90/0x90
[   46.914312]  snd_pcm_playback_open+0x51/0x80
[   46.919118]  chrdev_open+0xc0/0x250
[   46.923147]  ? cdev_device_add+0x90/0x90
[   46.927608]  do_dentry_open+0x15f/0x430
[   46.931976]  path_openat+0x75e/0xa80
[   46.936086]  do_filp_open+0xb2/0x160
[   46.940194]  ? lock_release+0x147/0x2f0
[   46.944563]  ? _raw_spin_unlock+0x29/0x50
[   46.949101]  do_sys_openat2+0x9a/0x160
[   46.953377]  __x64_sys_openat+0x53/0xa0
[   46.957733]  do_syscall_64+0x38/0x90
[   46.961829]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   46.967402] RIP: 0033:0x7fa6397ccd3b
[   46.971506] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 
00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 
05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
[   46.991413] RSP: 002b:00007fff838e8990 EFLAGS: 00000246 ORIG_RAX: 
0000000000000101
[   46.999580] RAX: ffffffffffffffda RBX: 0000000000080802 RCX: 
00007fa6397ccd3b
[   47.007311] RDX: 0000000000080802 RSI: 00007fff838e8b50 RDI: 
00000000ffffff9c
[   47.015047] RBP: 00007fff838e8b50 R08: 0000000000000000 R09: 
0000000000000011
[   47.022787] R10: 0000000000000000 R11: 0000000000000246 R12: 
0000000000080802
[   47.030539] R13: 0000000000000004 R14: 0000000000000000 R15: 
00007fff838e8b50
[   47.038289]  </TASK>
  
Pierre-Louis Bossart Nov. 29, 2022, 3:44 p.m. UTC | #4
>>> Don't hold sdw_dev_lock while calling the peripheral driver
>>> probe() and remove() callbacks.
>>>
>>> Holding sdw_dev_lock around the probe() and remove() calls
>>> causes a theoretical mutex inversion which lockdep will
>>> assert on. The peripheral driver probe will probably register
>>> a soundcard, which will take ALSA and ASoC locks. During
>>
>> It's extremely unlikely that a peripheral driver would register a sound
>> card, this is what machine drivers do.
>>
>> Which leads me to the question: is this a real problem?
>>
> 
> Yes, try turning on lockdep checking and you will get an assert.
> During probe the existing code takes sdw_dev_lock and then calls the
> codec driver probe, so you will get a mutex sequence like:
> 
> sdw_dev_lock -> controls_rw_sem -> pcm_mutex
> 
> but in normal operation the ALSA/ASoC code will take its mutexes first
> and call runtime_resume which then takes the sdw_dev_lock, so you get
> 
> pcm_mutex -> sdw_dev_lock
> 
> and lockdep will assert on that opposite ordering.
> The full assert is at the end of this email.

Humm, you lost me with the reference to runtime_resume. I don't fully
understand how it's possible to invoke pm_runtime during probe.
pm_runtime should only enabled during the codec update_status() which
can only be done once the probe completes.

I am fine with the changes that you are suggesting, the introduction of
the sdw_dev_lock was probably too conservative and it'd be fine to only
protect what is required.

However we do have lockdep enabled

CONFIG_LOCKDEP_SUPPORT=y
CONFIG_LOCKDEP=y
CONFIG_LOCKDEP_BITS=15
CONFIG_LOCKDEP_CHAINS_BITS=16
CONFIG_LOCKDEP_STACK_TRACE_BITS=19
CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12
# CONFIG_DEBUG_LOCKDEP is not set

and we've never seen this, so I wonder if there is a codec driver
sequence that is not expected and causes this problem to pop-up. It
could also be that we missed something in the config, do you have
anything additional in your .config?

>> Or did you mean 'register components', and if yes what would the problem
>> with lockdep be?
>>
> 
> Argh. Yes, I meant snd_soc_register_component(). I'll re-send this with
> the comment fixed.

ok

>>> normal operation a runtime resume suspend can be triggered
>>> while these locks are held and will then take sdw_dev_lock.

here I am lost as well, was the problem on resume or suspend?

>>> It's not necessary to hold sdw_dev_lock when calling the
>>> probe() and remove(), it is only used to prevent the bus core
>>> calling the driver callbacks if there isn't a driver or the
>>> driver is removing.
>>
>>
>>> If sdw_dev_lock is held while setting and clearing the
>>> 'probed' flag this is sufficient to guarantee the safety of
>>> callback functions.
>>
>> not really, the 'probed' flag was kept for convenience. what this lock
>> really protects is the dereferencing of ops after the driver .remove
>> happens.
>>
> 
> Yes, I thought about removing it but that's a larger code change that's
> not necessary for the fix. The point is that we don't need to hold the
> mutex around the remove call, only around clearing 'probed' (or, if
> probed was removed, NULLing the callback pointers)

Agree.

>>> The potential race of a bus event happening while probe() is
>>> executing is the same as the existing race of the bus event
>>> handler taking the mutex first and processing the event
>>> before probe() can run. In both cases the event has already
>>> happened before the driver is probed and ready to accept
>>> callbacks.
>>
>> Sorry, I wasn't able to parse the first sentence in this paragraph. what
>> 'existing race' are you referring to?
>>
> 
> I will re-write that. The point was that without the lock the probe()
> can run in parallel with a status update. But that is already a race
> with the existing code because the status update could take the
> sdw_dev_lock first. So I think that this change isn't introducing any
> new races.

Still not following. The requirement is that probe() and
update_slave_status() are serialized, without assuming any order/dependency.


> The lockdep assert looks like this:
> [   46.098514] ======================================================
> [   46.104736] WARNING: possible circular locking dependency detected
> [   46.110961] 6.1.0-rc4-jamerson #1 Tainted: G            E
> [   46.116842] ------------------------------------------------------
> [   46.123063] mpg123/1130 is trying to acquire lock:
> [   46.127883] ffff8b445031fb80 (&slave->sdw_dev_lock){+.+.}-{3:3}, at:
> sdw_update_slave_status+0x26/0x70
> [   46.137225]
>                but task is already holding lock:
> [   46.143074] ffffffffc1455310 (&card->pcm_mutex){+.+.}-{3:3}, at:
> dpcm_fe_dai_open+0x49/0x830
> [   46.151536]
>                which lock already depends on the new lock.[   46.159732]
>                the existing dependency chain (in reverse order) is:
> [   46.167231]
>                -> #4 (&card->pcm_mutex){+.+.}-{3:3}:
> [   46.173428]        __mutex_lock+0x94/0x920
> [   46.177542]        snd_soc_dpcm_runtime_update+0x2e/0x100
> [   46.182958]        snd_soc_dapm_put_enum_double+0x1c2/0x200
> [   46.188548]        snd_ctl_elem_write+0x10c/0x1d0
> [   46.193268]        snd_ctl_ioctl+0x126/0x850
> [   46.197556]        __x64_sys_ioctl+0x87/0xc0
> [   46.201845]        do_syscall_64+0x38/0x90
> [   46.205959]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   46.211553]
>                -> #3 (&card->controls_rwsem){++++}-{3:3}:
> [   46.218188]        down_write+0x2b/0xd0
> [   46.222038]        snd_ctl_add_replace+0x39/0xb0
> [   46.226672]        snd_soc_add_controls+0x53/0x80
> [   46.231393]        soc_probe_component+0x1e4/0x2a0
> [   46.236202]        snd_soc_bind_card+0x51a/0xc80
> [   46.240836]        devm_snd_soc_register_card+0x43/0x90
> [   46.246079]        mc_probe+0x982/0xfe0 [snd_soc_sof_sdw]
> [   46.251500]        platform_probe+0x3c/0xa0
> [   46.255700]        really_probe+0xde/0x390
> [   46.259814]        __driver_probe_device+0x78/0x180
> [   46.264710]        driver_probe_device+0x1e/0x90
> [   46.269347]        __driver_attach+0x9f/0x1f0
> [   46.273721]        bus_for_each_dev+0x78/0xc0
> [   46.278098]        bus_add_driver+0x1ac/0x200
> [   46.282473]        driver_register+0x8f/0xf0
> [   46.286759]        do_one_initcall+0x58/0x310
> [   46.291136]        do_init_module+0x4c/0x1f0
> [   46.295422]        __do_sys_finit_module+0xb4/0x130
> [   46.300321]        do_syscall_64+0x38/0x90
> [   46.304434]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   46.310027]
>                -> #2 (&card->mutex){+.+.}-{3:3}:
> [   46.315883]        __mutex_lock+0x94/0x920
> [   46.320000]        snd_soc_bind_card+0x3e/0xc80
> [   46.324551]        devm_snd_soc_register_card+0x43/0x90
> [   46.329798]        mc_probe+0x982/0xfe0 [snd_soc_sof_sdw]
> [   46.335219]        platform_probe+0x3c/0xa0
> [   46.339420]        really_probe+0xde/0x390
> [   46.343532]        __driver_probe_device+0x78/0x180
> [   46.348430]        driver_probe_device+0x1e/0x90
> [   46.353065]        __driver_attach+0x9f/0x1f0
> [   46.357437]        bus_for_each_dev+0x78/0xc0
> [   46.361812]        bus_add_driver+0x1ac/0x200
> [   46.366716]        driver_register+0x8f/0xf0
> [   46.371528]        do_one_initcall+0x58/0x310
> [   46.376424]        do_init_module+0x4c/0x1f0
> [   46.381239]        __do_sys_finit_module+0xb4/0x130
> [   46.386665]        do_syscall_64+0x38/0x90
> [   46.391299]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   46.397416]
>                -> #1 (client_mutex){+.+.}-{3:3}:
> [   46.404307]        __mutex_lock+0x94/0x920
> [   46.408941]        snd_soc_add_component+0x24/0x2c0
> [   46.414345]        devm_snd_soc_register_component+0x54/0xa0
> [   46.420522]        cs35l56_common_probe+0x280/0x370 [snd_soc_cs35l56]
> [   46.427487]        cs35l56_sdw_probe+0xf4/0x170 [snd_soc_cs35l56_sdw]
> [   46.434442]        sdw_drv_probe+0x80/0x1a0
> [   46.439136]        really_probe+0xde/0x390
> [   46.443738]        __driver_probe_device+0x78/0x180
> [   46.449120]        driver_probe_device+0x1e/0x90
> [   46.454247]        __driver_attach+0x9f/0x1f0
> [   46.459106]        bus_for_each_dev+0x78/0xc0
> [   46.463971]        bus_add_driver+0x1ac/0x200
> [   46.468825]        driver_register+0x8f/0xf0
> [   46.473592]        do_one_initcall+0x58/0x310
> [   46.478441]        do_init_module+0x4c/0x1f0
> [   46.483202]        __do_sys_finit_module+0xb4/0x130
> [   46.488572]        do_syscall_64+0x38/0x90
> [   46.493158]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   46.499229]
>                -> #0 (&slave->sdw_dev_lock){+.+.}-{3:3}:
> [   46.506737]        __lock_acquire+0x1121/0x1df0
> [   46.511765]        lock_acquire+0xd5/0x300
> [   46.516360]        __mutex_lock+0x94/0x920
> [   46.520949]        sdw_update_slave_status+0x26/0x70
> [   46.526409]        sdw_clear_slave_status+0xd8/0xe0
> [   46.531783]        intel_resume_runtime+0x139/0x2a0
> [   46.537155]        __rpm_callback+0x41/0x120
> [   46.541919]        rpm_callback+0x5d/0x70
> [   46.546422]        rpm_resume+0x531/0x7e0
> [   46.550920]        __pm_runtime_resume+0x4a/0x80
> [   46.556024]        snd_soc_pcm_component_pm_runtime_get+0x2f/0xc0
> [   46.562611]        __soc_pcm_open+0x62/0x520
> [   46.567375]        dpcm_be_dai_startup+0x116/0x210
> [   46.572661]        dpcm_fe_dai_open+0xf7/0x830
> [   46.577597]        snd_pcm_open_substream+0x54a/0x8b0
> [   46.583145]        snd_pcm_open.part.0+0xdc/0x200
> [   46.588341]        snd_pcm_playback_open+0x51/0x80
> [   46.593625]        chrdev_open+0xc0/0x250
> [   46.598129]        do_dentry_open+0x15f/0x430
> [   46.602981]        path_openat+0x75e/0xa80
> [   46.607575]        do_filp_open+0xb2/0x160
> [   46.612162]        do_sys_openat2+0x9a/0x160
> [   46.616922]        __x64_sys_openat+0x53/0xa0
> [   46.621767]        do_syscall_64+0x38/0x90
> [   46.626352]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   46.632414]
>                other info that might help us debug this:[   46.641862]
> Chain exists of:
>                  &slave->sdw_dev_lock --> &card->controls_rwsem -->
> &card->pcm_mutex[   46.655145]  Possible unsafe locking scenario:[  
> 46.662048]        CPU0                    CPU1
> [   46.667080]        ----                    ----
> [   46.672108]   lock(&card->pcm_mutex);
> [   46.676267]                                lock(&card->controls_rwsem);
> [   46.683382]                                lock(&card->pcm_mutex);
> [   46.690063]   lock(&slave->sdw_dev_lock);
> [   46.694574]
>                 *** DEADLOCK ***[   46.701942] 2 locks held by mpg123/1130:
> [   46.706356]  #0: ffff8b4457b22b90 (&pcm->open_mutex){+.+.}-{3:3}, at:
> snd_pcm_open.part.0+0xc9/0x200
> [   46.715999]  #1: ffffffffc1455310 (&card->pcm_mutex){+.+.}-{3:3}, at:
> dpcm_fe_dai_open+0x49/0x830
> [   46.725390]
>                stack backtrace:
> [   46.730752] CPU: 0 PID: 1130 Comm: mpg123 Tainted: G            E    
>  6.1.0-rc4-jamerson #1
> [   46.739703] Hardware name: AAEON UP-WHL01/UP-WHL01, BIOS UPW1AM19
> 11/10/2020
> [   46.747270] Call Trace:
> [   46.750239]  <TASK>
> [   46.752857]  dump_stack_lvl+0x56/0x73
> [   46.757045]  check_noncircular+0x102/0x120
> [   46.761664]  __lock_acquire+0x1121/0x1df0
> [   46.766197]  lock_acquire+0xd5/0x300
> [   46.770292]  ? sdw_update_slave_status+0x26/0x70
> [   46.775432]  ? lock_is_held_type+0xe2/0x140
> [   46.780143]  __mutex_lock+0x94/0x920
> [   46.784241]  ? sdw_update_slave_status+0x26/0x70
> [   46.789387]  ? find_held_lock+0x2b/0x80
> [   46.793750]  ? sdw_update_slave_status+0x26/0x70
> [   46.798894]  ? lock_release+0x147/0x2f0
> [   46.803262]  ? lockdep_init_map_type+0x47/0x250
> [   46.808315]  ? sdw_update_slave_status+0x26/0x70
> [   46.813456]  sdw_update_slave_status+0x26/0x70
> [   46.818422]  sdw_clear_slave_status+0xd8/0xe0
> [   46.823302]  ? pm_generic_runtime_suspend+0x30/0x30
> [   46.828706]  intel_resume_runtime+0x139/0x2a0
> [   46.833583]  ? _raw_spin_unlock_irq+0x24/0x50
> [   46.838462]  ? pm_generic_runtime_suspend+0x30/0x30
> [   46.843866]  __rpm_callback+0x41/0x120
> [   46.848142]  ? pm_generic_runtime_suspend+0x30/0x30
> [   46.853550]  rpm_callback+0x5d/0x70
> [   46.857568]  rpm_resume+0x531/0x7e0
> [   46.861578]  ? _raw_spin_lock_irqsave+0x62/0x70
> [   46.866634]  __pm_runtime_resume+0x4a/0x80
> [   46.871258]  snd_soc_pcm_component_pm_runtime_get+0x2f/0xc0
> [   46.877358]  __soc_pcm_open+0x62/0x520
> [   46.881634]  ? dpcm_add_paths.isra.0+0x35d/0x4c0
> [   46.886784]  dpcm_be_dai_startup+0x116/0x210
> [   46.891592]  dpcm_fe_dai_open+0xf7/0x830
> [   46.896046]  ? debug_mutex_init+0x33/0x50
> [   46.900591]  snd_pcm_open_substream+0x54a/0x8b0
> [   46.905658]  snd_pcm_open.part.0+0xdc/0x200
> [   46.910376]  ? wake_up_q+0x90/0x90
> [   46.914312]  snd_pcm_playback_open+0x51/0x80
> [   46.919118]  chrdev_open+0xc0/0x250
> [   46.923147]  ? cdev_device_add+0x90/0x90
> [   46.927608]  do_dentry_open+0x15f/0x430
> [   46.931976]  path_openat+0x75e/0xa80
> [   46.936086]  do_filp_open+0xb2/0x160
> [   46.940194]  ? lock_release+0x147/0x2f0
> [   46.944563]  ? _raw_spin_unlock+0x29/0x50
> [   46.949101]  do_sys_openat2+0x9a/0x160
> [   46.953377]  __x64_sys_openat+0x53/0xa0
> [   46.957733]  do_syscall_64+0x38/0x90
> [   46.961829]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   46.967402] RIP: 0033:0x7fa6397ccd3b
> [   46.971506] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18
> 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f
> 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
> [   46.991413] RSP: 002b:00007fff838e8990 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000101
> [   46.999580] RAX: ffffffffffffffda RBX: 0000000000080802 RCX:
> 00007fa6397ccd3b
> [   47.007311] RDX: 0000000000080802 RSI: 00007fff838e8b50 RDI:
> 00000000ffffff9c
> [   47.015047] RBP: 00007fff838e8b50 R08: 0000000000000000 R09:
> 0000000000000011
> [   47.022787] R10: 0000000000000000 R11: 0000000000000246 R12:
> 0000000000080802
> [   47.030539] R13: 0000000000000004 R14: 0000000000000000 R15:
> 00007fff838e8b50
> [   47.038289]  </TASK>
  
Richard Fitzgerald Dec. 1, 2022, 2:32 p.m. UTC | #5
On 29/11/2022 15:44, Pierre-Louis Bossart wrote:
> 
> 
>>>> Don't hold sdw_dev_lock while calling the peripheral driver
>>>> probe() and remove() callbacks.
>>>>
>>>> Holding sdw_dev_lock around the probe() and remove() calls
>>>> causes a theoretical mutex inversion which lockdep will
>>>> assert on. The peripheral driver probe will probably register
>>>> a soundcard, which will take ALSA and ASoC locks. During
>>>
>>> It's extremely unlikely that a peripheral driver would register a sound
>>> card, this is what machine drivers do.
>>>
>>> Which leads me to the question: is this a real problem?
>>>
>>
>> Yes, try turning on lockdep checking and you will get an assert.
>> During probe the existing code takes sdw_dev_lock and then calls the
>> codec driver probe, so you will get a mutex sequence like:
>>
>> sdw_dev_lock -> controls_rw_sem -> pcm_mutex
>>
>> but in normal operation the ALSA/ASoC code will take its mutexes first
>> and call runtime_resume which then takes the sdw_dev_lock, so you get
>>
>> pcm_mutex -> sdw_dev_lock
>>
>> and lockdep will assert on that opposite ordering.
>> The full assert is at the end of this email.
> 
> Humm, you lost me with the reference to runtime_resume. I don't fully
> understand how it's possible to invoke pm_runtime during probe.
> pm_runtime should only enabled during the codec update_status() which
> can only be done once the probe completes.
> 
> I am fine with the changes that you are suggesting, the introduction of
> the sdw_dev_lock was probably too conservative and it'd be fine to only
> protect what is required.
> 
> However we do have lockdep enabled
> 

I wonder whether this is because the Cirrus devices use full DP prepare,
so there will be a DP prepare interrupt during the attempt to prepare
the dailink. The lockdep assert was when sdw_update_slave_status() tried
to take sdw_dev_lock.

If the Realtek codecs only use Soundwire interrupts for jack detect you
probably won't see a sdw_dev_lock inside a pcm_mutex.
  
Richard Fitzgerald Dec. 1, 2022, 4:55 p.m. UTC | #6
On 01/12/2022 14:32, Richard Fitzgerald wrote:
> On 29/11/2022 15:44, Pierre-Louis Bossart wrote:
>>
>>
>>>>> Don't hold sdw_dev_lock while calling the peripheral driver
>>>>> probe() and remove() callbacks.
>>>>>
>>>>> Holding sdw_dev_lock around the probe() and remove() calls
>>>>> causes a theoretical mutex inversion which lockdep will
>>>>> assert on. The peripheral driver probe will probably register
>>>>> a soundcard, which will take ALSA and ASoC locks. During
>>>>
>>>> It's extremely unlikely that a peripheral driver would register a sound
>>>> card, this is what machine drivers do.
>>>>
>>>> Which leads me to the question: is this a real problem?
>>>>
>>>
>>> Yes, try turning on lockdep checking and you will get an assert.
>>> During probe the existing code takes sdw_dev_lock and then calls the
>>> codec driver probe, so you will get a mutex sequence like:
>>>
>>> sdw_dev_lock -> controls_rw_sem -> pcm_mutex
>>>
>>> but in normal operation the ALSA/ASoC code will take its mutexes first
>>> and call runtime_resume which then takes the sdw_dev_lock, so you get
>>>
>>> pcm_mutex -> sdw_dev_lock
>>>
>>> and lockdep will assert on that opposite ordering.
>>> The full assert is at the end of this email.
>>
>> Humm, you lost me with the reference to runtime_resume. I don't fully
>> understand how it's possible to invoke pm_runtime during probe.
>> pm_runtime should only enabled during the codec update_status() which
>> can only be done once the probe completes.
>>
>> I am fine with the changes that you are suggesting, the introduction of
>> the sdw_dev_lock was probably too conservative and it'd be fine to only
>> protect what is required.
>>
>> However we do have lockdep enabled
>>
> 
> I wonder whether this is because the Cirrus devices use full DP prepare,
> so there will be a DP prepare interrupt during the attempt to prepare
> the dailink. The lockdep assert was when sdw_update_slave_status() tried
> to take sdw_dev_lock.
> 
> If the Realtek codecs only use Soundwire interrupts for jack detect you
> probably won't see a sdw_dev_lock inside a pcm_mutex.

Ok, I think I understand this now.
lockdep tries to prove locking correctness without requiring that a bad
lock order actually happened. This makes sense, because some deadlocks
might be extremely difficult to reproduce and lockdep wouldn't help much
if you had to reproduce it to get an assert.

There's a lot of detail in Documentation/locking/lockdep-design.rst, but
in summary it is looking for a relationship between three locks that is
theoretically incompatible. So if it sees:
a) L1->L2
b) L2->L3
c) L3->L1
this will assert because the relationship of L1 and L2 on L3 is opposite
to the relationship of L1 to L2. IOW lockdep is assuming the possibility
of L2->L3->L1. You could look at that another way and say that lockdep
is telling you that even though you didn't intend L2->L3->L1 could ever 
happen there is a risk lurking in the code.

What we've seen is three normal call chains that give these orderings:

a) pcm_mutex -> sdw_dev_lock (when starting up a playback)
b) controls_rwsem -> pcm_mutex (a write to a DAPM MUX control
      re-evaluates DAPM power states of the DPCM links)
c) sdw_dev_lock -> controls_rwsem (when snd_soc_register_component()
      adds ALSA controls)

controls_rwsem is the "L3" here, the lock that has a relationship to two
other locks that is incompatible to those two locks' relationship to
each other in (a).

The theoretical case sdw_dev_lock->controls_rwsem->pcm_mutex should be
impossible, but it's always nicer to fix the problem that try to make
it a special case "safe to ignore".

The only places sdw_dev_lock can be taken before the ALSA/ASoC locks is
during probe() and remove(). At other times it would only be taken as a
result of an ALSA/ASoC activity so the ALSA/ASoC locks would be taken
first.
  

Patch

diff --git a/drivers/soundwire/bus_type.c b/drivers/soundwire/bus_type.c
index 04b3529f8929..963498db0fd2 100644
--- a/drivers/soundwire/bus_type.c
+++ b/drivers/soundwire/bus_type.c
@@ -105,20 +105,19 @@  static int sdw_drv_probe(struct device *dev)
 	if (ret)
 		return ret;
 
-	mutex_lock(&slave->sdw_dev_lock);
-
 	ret = drv->probe(slave, id);
 	if (ret) {
 		name = drv->name;
 		if (!name)
 			name = drv->driver.name;
-		mutex_unlock(&slave->sdw_dev_lock);
 
 		dev_err(dev, "Probe of %s failed: %d\n", name, ret);
 		dev_pm_domain_detach(dev, false);
 		return ret;
 	}
 
+	mutex_lock(&slave->sdw_dev_lock);
+
 	/* device is probed so let's read the properties now */
 	if (drv->ops && drv->ops->read_prop)
 		drv->ops->read_prop(slave);
@@ -167,14 +166,12 @@  static int sdw_drv_remove(struct device *dev)
 	int ret = 0;
 
 	mutex_lock(&slave->sdw_dev_lock);
-
 	slave->probed = false;
+	mutex_unlock(&slave->sdw_dev_lock);
 
 	if (drv->remove)
 		ret = drv->remove(slave);
 
-	mutex_unlock(&slave->sdw_dev_lock);
-
 	dev_pm_domain_detach(dev, false);
 
 	return ret;