Message ID | 20230211195950.452364-1-pchelkin@ispras.ru |
---|---|
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp1637279wrn; Sat, 11 Feb 2023 12:06:50 -0800 (PST) X-Google-Smtp-Source: AK7set/y99mMC8pSxmNN4tagrWMsJTlqkm+zP6uSI81uhtE/tBliwgPoTVVobIj/kBVjLwpToPyw X-Received: by 2002:a17:90a:1a02:b0:230:9802:fbaf with SMTP id 2-20020a17090a1a0200b002309802fbafmr22191450pjk.35.1676146009793; Sat, 11 Feb 2023 12:06:49 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1676146009; cv=none; d=google.com; s=arc-20160816; b=WGMA7iCcbPOjfZyJWCuyYH9KrHZDTfIU6oVFCmp7su88FPwwgatjc0pfwW4UZPMWBh CLa3kt/58nqSTL1XJgBdjXvZ7qLQBiGR/MA2Eaac0JyjQcW936nJn0SABnL6pL5jPp7A FpZ+GUFso69X8P/bwZ4To4WM8nr8F8zlt6oY+jhzAEpEY5IAkhRGT1vtdQTHzRhI2Q+s hKO1eKpNRrT9VbmVIPTV9K5SU43XPBVYJtLMAD0OhtwKhjQ+Zi//Oh6R9DuV8kzKksFJ 1uVz852wkoRkjWmjESxfXZD1OZhyew7vOT4Ah5miVhAC+ctcgpGy4aMjXxcyAIf6rjyT JPtQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature:dkim-filter; bh=9zZWgSw2dv+vNDtDe9Rvl8F7Fv/omJW33B6s3CwSepw=; b=dbg5Ui29S/78faisaEgApeVWmAMcQcl/P7e45ljXh4YBdhcSj9kXZ0H8cAZDNlZDBi H1Ol+YYhv30Tam7Qf2xHT43m06LKIWpSMoocJeNcuFG/33QchqOlKRqNB1eUXWxRYiTf FCyfTG8TElmcwxCPIwMySEQa3twDwn+hSTbAgSMiGrBqLBGdEotrf4vtTS9wGRyrV39K wSdnuGOy4L+0Itus2iDFmO5kLNHmpGRE0+gg5kT1yqZklS4SmPuWZkfSiCZtiUz+g14j myTixiyNhoCDTWj1TQBhBvt+38Y04HkUhfxlSxuFJBpWojwQHqYKVPRbs+vNjxNDeNcH MB4w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ispras.ru header.s=default header.b="UV8cC8i/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ispras.ru Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id nl18-20020a17090b385200b00229732d501fsi12590819pjb.155.2023.02.11.12.06.37; Sat, 11 Feb 2023 12:06:49 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ispras.ru header.s=default header.b="UV8cC8i/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ispras.ru Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229599AbjBKUAT (ORCPT <rfc822;olliecheer@gmail.com> + 99 others); Sat, 11 Feb 2023 15:00:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229447AbjBKUAQ (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Sat, 11 Feb 2023 15:00:16 -0500 Received: from mail.ispras.ru (mail.ispras.ru [83.149.199.84]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7196016327; Sat, 11 Feb 2023 12:00:14 -0800 (PST) Received: from fpc.. (unknown [46.242.14.200]) by mail.ispras.ru (Postfix) with ESMTPSA id F329840737BC; Sat, 11 Feb 2023 20:00:06 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 mail.ispras.ru F329840737BC DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ispras.ru; s=default; t=1676145607; bh=9zZWgSw2dv+vNDtDe9Rvl8F7Fv/omJW33B6s3CwSepw=; h=From:To:Cc:Subject:Date:From; b=UV8cC8i/VJR6dOw5qrWQVjrvZobF4sAHOKO77Dx2Olqw71vl+31G7UsLhS1nkWpfW 58CMu34LCak4dnoqrykA+37uVW1/2/CCWuw5liMg7ovorVKf7PD1Qt4rqtzhfgDTa+ TthFD0JyphXqEbDSU4MvCRhHQmYy9m44AKjWDDzw= From: Fedor Pchelkin <pchelkin@ispras.ru> To: Ian Kent <raven@themaw.net> Cc: Fedor Pchelkin <pchelkin@ispras.ru>, Matthew Wilcox <willy@infradead.org>, Andrei Vagin <avagin@gmail.com>, Takeshi Misawa <jeliantsurux@gmail.com>, autofs@vger.kernel.org, linux-kernel@vger.kernel.org, Alexey Khoroshilov <khoroshilov@ispras.ru>, lvc-project@linuxtesting.org Subject: [PATCH 0/1] autofs: fix memory leak of waitqueues in autofs_catatonic_mode Date: Sat, 11 Feb 2023 22:59:49 +0300 Message-Id: <20230211195950.452364-1-pchelkin@ispras.ru> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1757566478607141682?= X-GMAIL-MSGID: =?utf-8?q?1757566478607141682?= |
Series |
autofs: fix memory leak of waitqueues in autofs_catatonic_mode
|
|
Message
Fedor Pchelkin
Feb. 11, 2023, 7:59 p.m. UTC
Syzkaller reports the leak [1]. It is reproducible. The following patch fixes the leak. It was proposed by Takeshi Misawa and tested by Syzbot. In other places of the code the waitqueue is freed when its wait_ctr becomes zero (see autofs_wait_release). So I think it is not actually supposed that inside autofs_catatonic_mode wait_ctr cannot be decreased to zero. Please correct me if I'm wrong. Also, looking at the discussion [2] of the '[PATCH] autofs4: use wake_up() instead of wake_up_interruptible', shouldn't wake_up_interruptible() inside autofs_catatonic_mode() be replaced with wake_up()? [1] https://syzkaller.appspot.com/bug?id=a9412f636e2d733130f8def7975897d0b57f6e37 [2] https://www.spinics.net/lists/autofs/msg01875.html
Comments
On 12/2/23 03:59, Fedor Pchelkin wrote: > Syzkaller reports the leak [1]. It is reproducible. > > The following patch fixes the leak. It was proposed by Takeshi Misawa and > tested by Syzbot. > > In other places of the code the waitqueue is freed when its wait_ctr > becomes zero (see autofs_wait_release). So I think it is not actually > supposed that inside autofs_catatonic_mode wait_ctr cannot be decreased to > zero. Please correct me if I'm wrong. Clearly there's a problem here but I'll need to think about what's going a bit more myself. > > Also, looking at the discussion [2] of the '[PATCH] autofs4: use wake_up() > instead of wake_up_interruptible', shouldn't wake_up_interruptible() > inside autofs_catatonic_mode() be replaced with wake_up()? Yes, I think so but that also deserves a bit of thought. Ian
On 12/2/23 03:59, Fedor Pchelkin wrote: > Syzkaller reports the leak [1]. It is reproducible. > > The following patch fixes the leak. It was proposed by Takeshi Misawa and > tested by Syzbot. > > In other places of the code the waitqueue is freed when its wait_ctr > becomes zero (see autofs_wait_release). So I think it is not actually > supposed that inside autofs_catatonic_mode wait_ctr cannot be decreased to > zero. Please correct me if I'm wrong. This is a bit had to read but I think your saying there's an assumption that wait_ctr can't become zero in autofs_catatonic_mode(). That's correct, the case of a waiting process getting sent a signal is not accounted for and this can (as you observed) lead to the wait not being freed and also not being freed at umount. I think the change here should be sufficient to resolve the leak and I can't think of any cases where this could cause a further problem. > > Also, looking at the discussion [2] of the '[PATCH] autofs4: use wake_up() > instead of wake_up_interruptible', shouldn't wake_up_interruptible() > inside autofs_catatonic_mode() be replaced with wake_up()? This does imply that [2] should have been applied to autofs_catatonic_mode() as well, I'm still trying to grok if that change would cause side effects for the change here but I think not. Ian
On 13/2/23 12:27, Ian Kent wrote: > On 12/2/23 03:59, Fedor Pchelkin wrote: >> Syzkaller reports the leak [1]. It is reproducible. >> >> The following patch fixes the leak. It was proposed by Takeshi Misawa >> and >> tested by Syzbot. >> >> In other places of the code the waitqueue is freed when its wait_ctr >> becomes zero (see autofs_wait_release). So I think it is not actually >> supposed that inside autofs_catatonic_mode wait_ctr cannot be >> decreased to >> zero. Please correct me if I'm wrong. > > This is a bit had to read but I think your saying there's an assumption > > that wait_ctr can't become zero in autofs_catatonic_mode(). > > > That's correct, the case of a waiting process getting sent a signal is > > not accounted for and this can (as you observed) lead to the wait not > > being freed and also not being freed at umount. > > > I think the change here should be sufficient to resolve the leak and > > I can't think of any cases where this could cause a further problem. > > >> >> Also, looking at the discussion [2] of the '[PATCH] autofs4: use >> wake_up() >> instead of wake_up_interruptible', shouldn't wake_up_interruptible() >> inside autofs_catatonic_mode() be replaced with wake_up()? > > This does imply that [2] should have been applied to > autofs_catatonic_mode() > > as well, I'm still trying to grok if that change would cause side effects > > for the change here but I think not. I was going to Ack the patch but I wondering if we should wait a little while and perhaps (probably) include the wake up call change as well. In any case we need Al to accept it (cc'd). Hopefully Al will offer his opinion on the changes too. > > > Ian >
On Mon, Feb 13, 2023 at 12:37:16PM +0800, Ian Kent wrote: > > I was going to Ack the patch but I wondering if we should wait a little > > while and perhaps (probably) include the wake up call change as well. > Hmm, those would be separate patches? An interesting thing is that the code itself supposes the wake up calls from autofs_wait_release() and autofs_catatonic_mode() to be related in some way (see autofs_wait fragment): /* * wq->name.name is NULL iff the lock is already released * or the mount has been made catatonic. */ wait_event_killable(wq->queue, wq->name.name == NULL); status = wq->status; It seems 'the lock is already released' refers to autofs_wait_release() as there is no alternative except the call to catatonic function where wq->name.name is NULL. So apparently the wake up calls should be the same (although I don't know if autofs_catatonic_mode has some different behaviour in such case, but probably it doesn't differ here). It's also strange that autofs_kill_sb() calls autofs_catatonic_mode() and currently it just decrements the wait_ctr's and it is not clear to me where the waitqueues are eventually freed in such case. Only if autofs_wait_release() or autofs_wait() are called? I'm not sure whether they are definitely called after that or not. [1] https://www.spinics.net/lists/autofs/msg01878.html > > In any case we need Al to accept it (cc'd). > > Hopefully Al will offer his opinion on the changes too. > It would be very nice if probably Al would make it more clear. At the moment I think that the leak issue should be fixed with the currenly discussed patch and the wake up call issue should be fixed like in [1], but perhaps I'm missing something.
On 11/3/23 01:56, Fedor Pchelkin wrote: > On Mon, Feb 13, 2023 at 12:37:16PM +0800, Ian Kent wrote: >> I was going to Ack the patch but I wondering if we should wait a little >> >> while and perhaps (probably) include the wake up call change as well. >> > Hmm, those would be separate patches? > > An interesting thing is that the code itself supposes the wake up calls > from autofs_wait_release() and autofs_catatonic_mode() to be related in > some way (see autofs_wait fragment): > > /* > * wq->name.name is NULL iff the lock is already released > * or the mount has been made catatonic. > */ > wait_event_killable(wq->queue, wq->name.name == NULL); > status = wq->status; > > It seems 'the lock is already released' refers to autofs_wait_release() > as there is no alternative except the call to catatonic function where > wq->name.name is NULL. So apparently the wake up calls should be the same > (although I don't know if autofs_catatonic_mode has some different > behaviour in such case, but probably it doesn't differ here). I think that, because there are processes waiting, they will always go via the tail of autofs_wait() so the wait will be freed at that point. Alternately autofs_wait_release() will be called from user space daemon to tell the kernel it's done with the current notification. I think there was an order of execution problem at some point between autofs_wait() and autofs_wait_release() hence the code there. The same may be the case for autofs_catatonic_mode() which is what the patch implies. These mount points can be left mounted after the user space daemon exits with the processes still blocked so umounting the mount should trigger the freeing of the name or they may be set catatonic by the daemon at exit, again freeing the name, and in both cases unblocking the processes to free the wait. So I didn't think there was a memory leak here but SyZkaller says there is. > > It's also strange that autofs_kill_sb() calls autofs_catatonic_mode() and > currently it just decrements the wait_ctr's and it is not clear to me > where the waitqueues are eventually freed in such case. Only if > autofs_wait_release() or autofs_wait() are called? I'm not sure whether > they are definitely called after that or not. > > [1] https://www.spinics.net/lists/autofs/msg01878.html >> In any case we need Al to accept it (cc'd). >> >> Hopefully Al will offer his opinion on the changes too. >> > It would be very nice if probably Al would make it more clear. > > At the moment I think that the leak issue should be fixed with the > currenly discussed patch and the wake up call issue should be fixed like > in [1], but perhaps I'm missing something. The question I have is, is it possible a process waiting on the wait queue gets unblocked after the wait is freed in autofs_catatonic_mode? Ian