[0/1] autofs: fix memory leak of waitqueues in autofs_catatonic_mode

Message ID 20230211195950.452364-1-pchelkin@ispras.ru

Message

Fedor Pchelkin Feb. 11, 2023, 7:59 p.m. UTC
  Syzkaller reports the leak [1]. It is reproducible.

The following patch fixes the leak. It was proposed by Takeshi Misawa and
tested by Syzbot.

In other places of the code the waitqueue is freed when its wait_ctr
becomes zero (see autofs_wait_release). So I think it is not actually
supposed that inside autofs_catatonic_mode wait_ctr cannot be decreased to
zero. Please correct me if I'm wrong.

Also, looking at the discussion [2] of the '[PATCH] autofs4: use wake_up()
instead of wake_up_interruptible', shouldn't wake_up_interruptible()
inside autofs_catatonic_mode() be replaced with wake_up()?

[1] https://syzkaller.appspot.com/bug?id=a9412f636e2d733130f8def7975897d0b57f6e37
[2] https://www.spinics.net/lists/autofs/msg01875.html
  

Comments

Ian Kent Feb. 13, 2023, 1:25 a.m. UTC | #1
On 12/2/23 03:59, Fedor Pchelkin wrote:
> Syzkaller reports the leak [1]. It is reproducible.
>
> The following patch fixes the leak. It was proposed by Takeshi Misawa and
> tested by Syzbot.
>
> In other places of the code the waitqueue is freed when its wait_ctr
> becomes zero (see autofs_wait_release). So I think it is not actually
> supposed that inside autofs_catatonic_mode wait_ctr cannot be decreased to
> zero. Please correct me if I'm wrong.

Clearly there's a problem here but I'll need to think about what's going on
a bit more myself.


>
> Also, looking at the discussion [2] of the '[PATCH] autofs4: use wake_up()
> instead of wake_up_interruptible', shouldn't wake_up_interruptible()
> inside autofs_catatonic_mode() be replaced with wake_up()?

Yes, I think so but that also deserves a bit of thought.


Ian
  
Ian Kent Feb. 13, 2023, 4:27 a.m. UTC | #2
On 12/2/23 03:59, Fedor Pchelkin wrote:
> Syzkaller reports the leak [1]. It is reproducible.
>
> The following patch fixes the leak. It was proposed by Takeshi Misawa and
> tested by Syzbot.
>
> In other places of the code the waitqueue is freed when its wait_ctr
> becomes zero (see autofs_wait_release). So I think it is not actually
> supposed that inside autofs_catatonic_mode wait_ctr cannot be decreased to
> zero. Please correct me if I'm wrong.

This is a bit hard to read but I think you're saying there's an assumption
that wait_ctr can't become zero in autofs_catatonic_mode().


That's correct, the case of a waiting process getting sent a signal is
not accounted for and this can (as you observed) lead to the wait not
being freed, not even at umount.
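
As a rough illustration of the case described above, here is a small
user-space model in C. It is a sketch under assumptions, not the kernel
code: struct model_wait and the model_* helpers are hypothetical names, and
the starting count of 2 (one reference for the blocked process, one for the
pending notification) is assumed for the example. It shows a signalled
waiter dropping its reference first, after which a catatonic path that only
decrements leaves the entry at zero references without freeing it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a wait entry; not the kernel struct. */
struct model_wait {
	int wait_ctr;	/* number of holders still referencing the entry */
	bool freed;	/* recorded instead of calling free(), so it can be inspected */
};

/* Drop one reference and "free" the entry when the last holder lets go. */
static void model_put(struct model_wait *wq)
{
	if (--wq->wait_ctr == 0)
		wq->freed = true;
}

/* Catatonic path as discussed in the thread: decrement only, never free. */
static void model_catatonic_leaky(struct model_wait *wq)
{
	wq->wait_ctr--;
}

/* Catatonic path with the proposed behaviour: free on reaching zero. */
static void model_catatonic_fixed(struct model_wait *wq)
{
	model_put(wq);
}

/*
 * Scenario: the waiter is signalled and leaves first, then the mount is
 * made catatonic. Returns true if the entry ends up leaked, that is, it
 * has no holders left but was never freed.
 */
static bool model_signal_then_catatonic(bool fixed)
{
	/* Assumed starting count: one for the waiter, one for the notification. */
	struct model_wait wq = { .wait_ctr = 2, .freed = false };

	model_put(&wq);			/* signalled waiter drops its reference */
	if (fixed)
		model_catatonic_fixed(&wq);
	else
		model_catatonic_leaky(&wq);

	return wq.wait_ctr == 0 && !wq.freed;
}
```

Calling model_signal_then_catatonic(false) reports a leak, while
model_signal_then_catatonic(true) does not, because the last decrement also
frees the entry.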


I think the change here should be sufficient to resolve the leak and
I can't think of any cases where this could cause a further problem.


>
> Also, looking at the discussion [2] of the '[PATCH] autofs4: use wake_up()
> instead of wake_up_interruptible', shouldn't wake_up_interruptible()
> inside autofs_catatonic_mode() be replaced with wake_up()?

This does imply that [2] should have been applied to autofs_catatonic_mode()
as well. I'm still trying to grok if that change would cause side effects
for the change here but I think not.


Ian
  
Ian Kent Feb. 13, 2023, 4:37 a.m. UTC | #3
On 13/2/23 12:27, Ian Kent wrote:
> On 12/2/23 03:59, Fedor Pchelkin wrote:
>> Syzkaller reports the leak [1]. It is reproducible.
>>
>> The following patch fixes the leak. It was proposed by Takeshi Misawa 
>> and
>> tested by Syzbot.
>>
>> In other places of the code the waitqueue is freed when its wait_ctr
>> becomes zero (see autofs_wait_release). So I think it is not actually
>> supposed that inside autofs_catatonic_mode wait_ctr cannot be 
>> decreased to
>> zero. Please correct me if I'm wrong.
>
> This is a bit hard to read but I think you're saying there's an assumption
>
> that wait_ctr can't become zero in autofs_catatonic_mode().
>
>
> That's correct, the case of a waiting process getting sent a signal is
>
> not accounted for and this can (as you observed) lead to the wait not
>
> being freed and also not being freed at umount.
>
>
> I think the change here should be sufficient to resolve the leak and
>
> I can't think of any cases where this could cause a further problem.
>
>
>>
>> Also, looking at the discussion [2] of the '[PATCH] autofs4: use 
>> wake_up()
>> instead of wake_up_interruptible', shouldn't wake_up_interruptible()
>> inside autofs_catatonic_mode() be replaced with wake_up()?
>
> This does imply that [2] should have been applied to 
> autofs_catatonic_mode()
>
> as well, I'm still trying to grok if that change would cause side effects
>
> for the change here but I think not.

I was going to Ack the patch but I'm wondering if we should wait a little
while and perhaps (probably) include the wake up call change as well.

In any case we need Al to accept it (cc'd).
Hopefully Al will offer his opinion on the changes too.


>
>
> Ian
>
  
Fedor Pchelkin March 10, 2023, 5:56 p.m. UTC | #4
On Mon, Feb 13, 2023 at 12:37:16PM +0800, Ian Kent wrote:
> 
> I was going to Ack the patch but I'm wondering if we should wait a little
> 
> while and perhaps (probably) include the wake up call change as well.
>

Hmm, those would be separate patches?

An interesting thing is that the code itself supposes the wake up calls
from autofs_wait_release() and autofs_catatonic_mode() to be related in
some way (see autofs_wait fragment):

	/*
	 * wq->name.name is NULL iff the lock is already released
	 * or the mount has been made catatonic.
	 */
	wait_event_killable(wq->queue, wq->name.name == NULL);
	status = wq->status;

It seems 'the lock is already released' refers to autofs_wait_release()
as there is no alternative except the call to catatonic function where
wq->name.name is NULL. So apparently the wake up calls should be the same
(although I don't know if autofs_catatonic_mode has some different
behaviour in such case, but probably it doesn't differ here).
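
To make the wake up question concrete, here is a minimal user-space sketch
in C of how a wake-up mask is matched against the state of a sleeper in
wait_event_killable(). It is a model under assumptions: the bit values are
illustrative and the M_* and model_* names are hypothetical. The
relationships between the masks are meant to mirror the kernel's
definitions (a killable sleeper is TASK_WAKEKILL | TASK_UNINTERRUPTIBLE,
wake_up() uses TASK_NORMAL, wake_up_interruptible() uses
TASK_INTERRUPTIBLE).

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the kernel defines its own in <linux/sched.h>. */
#define M_TASK_INTERRUPTIBLE	0x0001u
#define M_TASK_UNINTERRUPTIBLE	0x0002u
#define M_TASK_WAKEKILL		0x0100u
#define M_TASK_KILLABLE		(M_TASK_WAKEKILL | M_TASK_UNINTERRUPTIBLE)
#define M_TASK_NORMAL		(M_TASK_INTERRUPTIBLE | M_TASK_UNINTERRUPTIBLE)

/* A wake-up reaches a sleeper only if its mask overlaps the sleeper's state. */
static bool model_would_wake(unsigned int sleeper_state, unsigned int wake_mask)
{
	return (sleeper_state & wake_mask) != 0;
}

/* Models wake_up(): targets both interruptible and uninterruptible sleepers. */
static bool model_wake_up(unsigned int sleeper_state)
{
	return model_would_wake(sleeper_state, M_TASK_NORMAL);
}

/* Models wake_up_interruptible(): targets interruptible sleepers only. */
static bool model_wake_up_interruptible(unsigned int sleeper_state)
{
	return model_would_wake(sleeper_state, M_TASK_INTERRUPTIBLE);
}
```

In this model a killable sleeper is reached by model_wake_up() but not by
model_wake_up_interruptible(), which is the mismatch the question above is
about.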

It's also strange that autofs_kill_sb() calls autofs_catatonic_mode() and
currently it just decrements the wait_ctr's and it is not clear to me
where the waitqueues are eventually freed in such case. Only if
autofs_wait_release() or autofs_wait() are called? I'm not sure whether
they are definitely called after that or not.

[1] https://www.spinics.net/lists/autofs/msg01878.html
> 
> In any case we need Al to accept it (cc'd).
> 
> Hopefully Al will offer his opinion on the changes too.
> 

It would be very nice if Al could make it clearer.

At the moment I think that the leak issue should be fixed with the
currently discussed patch and the wake up call issue should be fixed like
in [1], but perhaps I'm missing something.
  
Ian Kent March 11, 2023, 7:01 a.m. UTC | #5
On 11/3/23 01:56, Fedor Pchelkin wrote:
> On Mon, Feb 13, 2023 at 12:37:16PM +0800, Ian Kent wrote:
>> I was going to Ack the patch but I'm wondering if we should wait a little
>>
>> while and perhaps (probably) include the wake up call change as well.
>>
> Hmm, those would be separate patches?
>
> An interesting thing is that the code itself supposes the wake up calls
> from autofs_wait_release() and autofs_catatonic_mode() to be related in
> some way (see autofs_wait fragment):
>
> 	/*
> 	 * wq->name.name is NULL iff the lock is already released
> 	 * or the mount has been made catatonic.
> 	 */
> 	wait_event_killable(wq->queue, wq->name.name == NULL);
> 	status = wq->status;
>
> It seems 'the lock is already released' refers to autofs_wait_release()
> as there is no alternative except the call to catatonic function where
> wq->name.name is NULL. So apparently the wake up calls should be the same
> (although I don't know if autofs_catatonic_mode has some different
> behaviour in such case, but probably it doesn't differ here).

I think that, because there are processes waiting, they will always go
via the tail of autofs_wait() so the wait will be freed at that point.


Alternatively, autofs_wait_release() will be called from the user space
daemon to tell the kernel it's done with the current notification.


I think there was an order of execution problem at some point between
autofs_wait() and autofs_wait_release(), hence the code there. The same
may be the case for autofs_catatonic_mode(), which is what the patch
implies.


These mount points can be left mounted after the user space daemon
exits, with the processes still blocked. Umounting the mount should then
trigger the freeing of the name, or the mounts may be set catatonic by the
daemon at exit, again freeing the name. In both cases the processes are
unblocked and go on to free the wait.


So I didn't think there was a memory leak here but Syzkaller says
there is.


>
> It's also strange that autofs_kill_sb() calls autofs_catatonic_mode() and
> currently it just decrements the wait_ctr's and it is not clear to me
> where the waitqueues are eventually freed in such case. Only if
> autofs_wait_release() or autofs_wait() are called? I'm not sure whether
> they are definitely called after that or not.
>
> [1] https://www.spinics.net/lists/autofs/msg01878.html
>> In any case we need Al to accept it (cc'd).
>>
>> Hopefully Al will offer his opinion on the changes too.
>>
> It would be very nice if Al could make it clearer.
>
> At the moment I think that the leak issue should be fixed with the
> currently discussed patch and the wake up call issue should be fixed like
> in [1], but perhaps I'm missing something.

The question I have is, is it possible a process waiting on the wait
queue gets unblocked after the wait is freed in autofs_catatonic_mode?
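
One way to reason about this question is to model the two possible
orderings of the waiter's exit and the catatonic path. The following C
sketch is a user-space model, not kernel code, and all names in it are
hypothetical. It assumes that a blocked process holds its own reference
until it reaches the tail of autofs_wait(), that the catatonic path drops
one reference and frees at zero, and that there is a single waiter. It says
nothing about the real locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a wait entry; not the kernel struct. */
struct model_wait {
	int wait_ctr;		/* references still held */
	bool freed;		/* recorded instead of calling free() */
	bool used_after_free;	/* set if anyone touches a freed entry */
};

/* Drop one reference, noting any access to an already freed entry. */
static void model_put(struct model_wait *wq)
{
	if (wq->freed) {
		wq->used_after_free = true;
		return;
	}
	if (--wq->wait_ctr == 0)
		wq->freed = true;
}

/* Assumed catatonic behaviour with the patch: drop a reference, free at zero. */
static void model_catatonic(struct model_wait *wq)
{
	model_put(wq);
}

/* Assumed waiter tail: the waiter still holds its own reference when it runs. */
static void model_waiter_tail(struct model_wait *wq)
{
	model_put(wq);
}

/* Returns true if the entry is freed and never touched after being freed. */
static bool model_order_is_safe(bool waiter_first)
{
	/* Assumed starting count: one for the waiter, one for the notification. */
	struct model_wait wq = { .wait_ctr = 2 };

	if (waiter_first) {
		model_waiter_tail(&wq);
		model_catatonic(&wq);
	} else {
		model_catatonic(&wq);
		model_waiter_tail(&wq);
	}

	return wq.freed && !wq.used_after_free && wq.wait_ctr == 0;
}
```

Under these assumptions both orderings end with the last holder freeing the
entry, so the catatonic path can only free a wait whose waiter has already
left. Whether the real code upholds the same invariant is exactly what the
question asks.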


Ian