mwifiex: Fix use-after-free bug due to race condition between main thread thread and timer thread
Message ID | 20230218075956.1563118-1-zyytlz.wz@163.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp271832wrn; Sat, 18 Feb 2023 00:07:49 -0800 (PST) X-Google-Smtp-Source: AK7set/iIaaB5O9iGbZBMljoBB0DgbTDkZqytcoftQyk7wejQN6GTFqK4lT1hu18ndje9oKImjo/ X-Received: by 2002:a17:907:9614:b0:8b1:7e1f:91c2 with SMTP id gb20-20020a170907961400b008b17e1f91c2mr5526605ejc.74.1676707669167; Sat, 18 Feb 2023 00:07:49 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1676707669; cv=none; d=google.com; s=arc-20160816; b=O13KQfJb0JZeW6mYrhnu8fktZ7rTwV4BtVJuDGOdoLzoHqD5NUi0GS8LWkZbqosdV7 SRu4fTwbBUcsEl2QxqPXYVEvayhxWRfeVZ3Uq+79cpOFC4Z8YIZM31onuKolNNUueWhf 9PWD2KL67xc2TIzrnBi9WriNw03fu3oFgTPNSv5L71WoQs4rHRtPqFR7qkbY8l3JJLz4 CQ7DGCX/hjgPvAHcx7NghHKekEUvboYrqD3WQ9OXAoDfdOw58Jau4c+PAwV5X2dvAWxJ 4HSTcRwUt964WLTCTCWp3gPNjHE9GhVQcaG82ujAkt5XggA/kO6Lp0p7brsmZfG7g6f2 FdLA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=mSlY79GAIPqWM6x0QcLyyzMg/OV9+44uQhIVEfrs9mk=; b=xvJojGFojU11QkUkqEbr4CQwqNtbmbdN3eO/oli906YuUTELvxW5y6kb93gQPfHQZT ckw2xqCjzt3qEUjqnr6BDU0tDE+gz0HFUXzlu02u/y+WgxqjDFoNttFcL4N7FTG8Ho2Q pfiohm94Di/1whc4R2DOoXCtnnmnRB6v277aZiaQptGoLIOQ510uqgstjb4xrafSLW10 5CvcI5erHaoJ7u3w3K3C66BsMflHjmtrbRn0F/XhCLmQ3kc5Mie3ZGTPsFUSchgpQhfS 2k6C3vfwqTU6T9wftrzabFJTOSlKhdi/bjtMx63atX3hyGCNIiGqaA+1zOLaFYSfOAFZ Q+IQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@163.com header.s=s110527 header.b=Bnb2TMzO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=163.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f22-20020a170906739600b0088d2a412a3dsi8539389ejl.773.2023.02.18.00.07.24; Sat, 18 Feb 2023 00:07:49 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@163.com header.s=s110527 header.b=Bnb2TMzO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=163.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229811AbjBRIAl (ORCPT <rfc822;daweilics@gmail.com> + 99 others); Sat, 18 Feb 2023 03:00:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47514 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229541AbjBRIAk (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Sat, 18 Feb 2023 03:00:40 -0500 Received: from m12.mail.163.com (m12.mail.163.com [220.181.12.214]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 9AB224DE2C; Sat, 18 Feb 2023 00:00:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=163.com; s=s110527; h=From:Subject:Date:Message-Id:MIME-Version; bh=mSlY7 9GAIPqWM6x0QcLyyzMg/OV9+44uQhIVEfrs9mk=; b=Bnb2TMzOvDXgckSvXKh83 CbSRZ2OVRYKEiVux4Lw0HdUcarmHuEGjVZJYoNBBBdrfFp7w3BkWL4FlXyR4V3xB OXDwhZCVQatzckOqZ3Agj4O6+i4KfJ0XRZEQ4CC9bmQSR0KqdX4Nl++QCGkxRb/B gBGm+3cKliG938gAGorafk= Received: from leanderwang-LC2.localdomain (unknown [111.206.145.21]) by zwqz-smtp-mta-g0-4 (Coremail) with SMTP id _____wCnGEx9hfBjeoLtAA--.39355S2; Sat, 18 Feb 2023 15:59:57 +0800 (CST) From: Zheng Wang <zyytlz.wz@163.com> To: ganapathi017@gmail.com Cc: alex000young@gmail.com, amitkarwar@gmail.com, sharvari.harisangam@nxp.com, huxinming820@gmail.com, kvalo@kernel.org, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, linux-wireless@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Zheng Wang <zyytlz.wz@163.com> Subject: [PATCH] mwifiex: Fix use-after-free bug due to race condition between main thread thread and timer thread Date: Sat, 18 Feb 2023 15:59:56 +0800 Message-Id: <20230218075956.1563118-1-zyytlz.wz@163.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-CM-TRANSID: _____wCnGEx9hfBjeoLtAA--.39355S2 X-Coremail-Antispam: 1Uf129KBjvJXoWxXF4UuryrtFy5CF18Xr45GFg_yoW5Gr47pa nxA3s7uw4Iqr4qkw4kJFy0yFWjg3WrKrWjkrWkAwn5Gr4rJas3ZrW5KFy0gr15XF4vga43 Ar1qq343Z3WkXFDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDUYxBIdaVFxhVjvjDU0xZFpf9x0ziIJPiUUUUU= X-Originating-IP: [111.206.145.21] X-CM-SenderInfo: h2113zf2oz6qqrwthudrp/xtbBzgAaU2I0XXgkYAAAsw X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1758155420779371067?= X-GMAIL-MSGID: =?utf-8?q?1758155420779371067?= |
Series |
mwifiex: Fix use-after-free bug due to race condition between main thread thread and timer thread
|
|
Commit Message
Zheng Wang
Feb. 18, 2023, 7:59 a.m. UTC
This is a potential race condition by executing the following order.
In summary, the adapter could be freed in timer function and be used after
that. The race condition needs 10s window which could be extended by the
paper : https://www.usenix.org/system/files/sec21-lee-yoochan.pdf
And the function in wakeup_timer_fn may have the same problem.
I dont't really know how to fix that, so I just removed the reset call,
which is totally wrong. If you know anything abouth the fix,
plz free to let me know.
Note that, this bug is found by static analysis, it could be wrong. We
could discuss that before writing the fix.
CPU0 CPU1
mwifiex_sdio_probe
mwifiex_add_card
mwifiex_init_hw_fw
request_firmware_nowait
mwifiex_fw_dpc
_mwifiex_fw_dpc
mwifiex_init_fw
mwifiex_main_process
mwifiex_exec_next_cmd
mwifiex_dnld_cmd_to_fw
mod_timer(&adapter->cmd_timer,..)
mwifiex_cmd_timeout_func
if_ops.card_reset(adapter)
mwifiex_sdio_card_reset
schedule_work(&card->work)
mwifiex_sdio_work
mwifiex_sdio_card_reset_work
mwifiex_reinit_sw
_mwifiex_fw_dpc
mwifiex_free_adapter
mwifiex_unregister
kfree(adapter) //free adapter
mwifiex_get_priv
// Use adapter
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
---
drivers/net/wireless/marvell/mwifiex/cmdevt.c | 2 --
drivers/net/wireless/marvell/mwifiex/init.c | 2 --
2 files changed, 4 deletions(-)
Comments
On Sat, Feb 18, 2023 at 03:59:56PM +0800, Zheng Wang wrote: > Note that, this bug is found by static analysis, it could be wrong. We > could discuss that before writing the fix. Yeah, please don't accept this patch. It deserves an "RFC" in the title at best. Sure, it's an identified race condition, but the cure here (deleting all possible recovery from firmware crashes) is worse than the disease. There's no real attempt at analyzing the race or providing solutions, so there's not much to discuss yet. Brian
Brian Norris <briannorris@chromium.org> 于2023年2月22日周三 11:31写道: > > On Sat, Feb 18, 2023 at 03:59:56PM +0800, Zheng Wang wrote: > > Note that, this bug is found by static analysis, it could be wrong. We > > could discuss that before writing the fix. > > Yeah, please don't accept this patch. It deserves an "RFC" in the title > at best. Sure, it's an identified race condition, but the cure here > (deleting all possible recovery from firmware crashes) is worse than the > disease. > Hello Brain, Thanks for your reply. I do need add "RFC" in the title. Sorry about that. The invoking chain is as descriped in the original mail. There is some place which may confuse you. In mwifiex_exec_next_cmd function, the adapter is got from cmd_node->priv->adapter. You might think if this is the same adapter in outer function. In mwifiex_register function, it inits the priv_arrary with adapter->priv[i]->adapter = adapter, Then use mwifiex_init_lock_list to init the list. When it fetch adapter like calling mwifiex_cfg80211_get_tx_power, it travers the array to find the target priv. So all the adapter paramter is pointed to the same adapter. The UAF of it is vulnerable. In summary, after the firmware is downloaded, it will start the timer function, which is known as mwifiex_cmd_timeout_func. The if_ops.card_reset function pointer is assigned with mwifiex_sdio_card_reset, which will schedule_work the card->work. It finally pass the check becauese card->work_flags has MWIFIEX_IFACE_WORK_CARD_RESET. Finnaly, in _mwifiex_fw_dp, if init is failed, it will call mwifiex_free_adapter and free the adapter. > There's no real attempt at analyzing the race or providing solutions, so > there's not much to discuss yet. Yes, I did't figure out a good plan to fix. As I say, it free the adapter in the error handling path. Could you please provide some advice about the fix? Best regards, Zheng
On Wed, Feb 22, 2023 at 12:17:21PM +0800, Zheng Hacker wrote:
> Could you please provide some advice about the fix?
This entire driver's locking patterns (or lack
thereof) need rewritten. This driver was probably written by someone
that doesn't really understand concurrent programming. It really only
works because the bulk of normal operation is sequentialized into the
main loop (mwifiex_main_process()). Any time you get outside that,
you're likely to find bugs.
But now that I've looked a little further, I'm not confident you pointed
out a real bug. How does mwifiex_sdio_card_reset_work() get past
mwifiex_shutdown_sw() -> wait_for_completion(adapter->fw_done) ? That
should ensure that _mwifiex_fw_dpc() is finished, and so we can't hit
the race you point out.
Note to self: ignore most "static analysis" reports of race conditions,
unless they have thorough analysis or a runtime reproduction.
Brian
Hello Brain, Thanks for your detailed review. Sorry we missed something. As you say, We are lacking some consideration, the pointed race path could not happen. But after diving deep into the code, we found another path. Here is the possible path. In summary, if the execution of CPU1 is stuck by fw_done, the cpu0 will continue executing and free the adapter gets into error-path. The cpu1 did not notice that and UAF happened. If there is something else we didn't know, please feel free to let us know. CPU0 CPU1 mwifiex_sdio_probe mwifiex_add_card mwifiex_init_hw_fw request_firmware_nowait mwifiex_fw_dpc _mwifiex_fw_dpc mwifiex_init_fw mwifiex_main_process mwifiex_exec_next_cmd mwifiex_dnld_cmd_to_fw mod_timer(&adapter->cmd_timer,..) mwifiex_cmd_timeout_func if_ops.card_reset(adapter) mwifiex_sdio_card_reset [*] stuck in mwifiex_shutdown_sw return 0 in mwifiex_dnld_cmd_to_fw return 0 in mwifiex_exec_next_cmd return 0 in mwifiex_main_process return -EINPROGRESS in mwifiex_init_fw get into error path, mwifiex_free_adapter // free adapter complete_all(fw_done); [*]Go on Use adapter->fw_done Best regards, Zheng Brian Norris <briannorris@chromium.org> 于2023年2月23日周四 05:20写道: > > On Wed, Feb 22, 2023 at 12:17:21PM +0800, Zheng Hacker wrote: > > Could you please provide some advice about the fix? > > This entire driver's locking patterns (or lack > thereof) need rewritten. This driver was probably written by someone > that doesn't really understand concurrent programming. It really only > works because the bulk of normal operation is sequentialized into the > main loop (mwifiex_main_process()). Any time you get outside that, > you're likely to find bugs. > > But now that I've looked a little further, I'm not confident you pointed > out a real bug. How does mwifiex_sdio_card_reset_work() get past > mwifiex_shutdown_sw() -> wait_for_completion(adapter->fw_done) ? That > should ensure that _mwifiex_fw_dpc() is finished, and so we can't hit > the race you point out. > > Note to self: ignore most "static analysis" reports of race conditions, > unless they have thorough analysis or a runtime reproduction. > > Brian
This email is broken for the statement is too long, Here is the newest email. Hello Brain, Thanks for your detailed review. Sorry we missed something. As you say, We are lacking some consideration, the pointed race path could not happen. But after diving deep into the code, we found another path. Here is the possible path. In summary, if the execution of CPU1 is stuck by fw_done, the cpu0 will continue executing and free the adapter gets into error-path. The cpu1 did not notice that and UAF happened. If there is something else we didn't know, please feel free to let us know. CPU0 CPU1 mwifiex_sdio_probe mwifiex_add_card mwifiex_init_hw_fw request_firmware_nowait mwifiex_fw_dpc _mwifiex_fw_dpc mwifiex_init_fw mwifiex_main_process mwifiex_exec_next_cmd mwifiex_dnld_cmd_to_fw mod_timer(&adapter->cmd_timer,..) mwifiex_cmd_timeout_func if_ops.card_reset(adapter) mwifiex_sdio_card_reset [*] stuck in mwifiex_shutdown_sw retn 0 in mwifiex_dnld_cmd_to_fw retn 0 in mwifiex_exec_next_cmd retn 0 in mwifiex_main_process retn -EINPROGRESS in mwifiex_init_fw mwifiex_free_adapter when in error // free adapter complete_all(fw_done); [*]Go on Use adapter->fw_done Best regards, Zheng Zheng Hacker <hackerzheng666@gmail.com> 于2023年2月24日周五 13:37写道: > > Hello Brain, > > Thanks for your detailed review. Sorry we missed something. As you say, We are > lacking some consideration, the pointed race path could not happen. But after > diving deep into the code, we found another path. > > Here is the possible path. In summary, if the execution of CPU1 is > stuck by fw_done, > the cpu0 will continue executing and free the adapter gets into > error-path. The cpu1 > did not notice that and UAF happened. > > If there is something else we didn't know, please feel free to let us know. > > CPU0 CPU1 > mwifiex_sdio_probe > mwifiex_add_card > mwifiex_init_hw_fw > request_firmware_nowait > mwifiex_fw_dpc > _mwifiex_fw_dpc > mwifiex_init_fw > mwifiex_main_process > mwifiex_exec_next_cmd > mwifiex_dnld_cmd_to_fw > mod_timer(&adapter->cmd_timer,..) > mwifiex_cmd_timeout_func > > if_ops.card_reset(adapter) > > mwifiex_sdio_card_reset > [*] stuck > in mwifiex_shutdown_sw > return 0 in mwifiex_dnld_cmd_to_fw > return 0 in mwifiex_exec_next_cmd > return 0 in mwifiex_main_process > return -EINPROGRESS in mwifiex_init_fw > get into error path, mwifiex_free_adapter > // free adapter > complete_all(fw_done); > [*]Go on > Use > adapter->fw_done > > > Best regards, > Zheng > > > Brian Norris <briannorris@chromium.org> 于2023年2月23日周四 05:20写道: > > > > On Wed, Feb 22, 2023 at 12:17:21PM +0800, Zheng Hacker wrote: > > > Could you please provide some advice about the fix? > > > > This entire driver's locking patterns (or lack > > thereof) need rewritten. This driver was probably written by someone > > that doesn't really understand concurrent programming. It really only > > works because the bulk of normal operation is sequentialized into the > > main loop (mwifiex_main_process()). Any time you get outside that, > > you're likely to find bugs. > > > > But now that I've looked a little further, I'm not confident you pointed > > out a real bug. How does mwifiex_sdio_card_reset_work() get past > > mwifiex_shutdown_sw() -> wait_for_completion(adapter->fw_done) ? That > > should ensure that _mwifiex_fw_dpc() is finished, and so we can't hit > > the race you point out. > > > > Note to self: ignore most "static analysis" reports of race conditions, > > unless they have thorough analysis or a runtime reproduction. > > > > Brian
On Fri, Feb 24, 2023 at 02:17:59PM +0800, Zheng Hacker wrote: > This email is broken for the statement is too long, Here is the newest email. It still wraps a bit weird, but it's good enough I suppose. > retn -EINPROGRESS in mwifiex_init_fw > mwifiex_free_adapter when in error These two statements don't connect. _mwifiex_fw_dpc() only treats -1 as a true error; -EINPROGRESS is treated as success, such that we continue to wait for the command response. Now, we might hang here if that response doesn't come, but that's a different problem... I'm sure there are true bugs in here somewhere, but I've spent enough time reading your incorrect reports and don't plan to spend more. (If you're lucky, maybe you can pique my curiosity again, but don't count on it.) Regards, Brian
Brian Norris <briannorris@chromium.org> 于2023年2月25日周六 05:39写道: > > On Fri, Feb 24, 2023 at 02:17:59PM +0800, Zheng Hacker wrote: > > This email is broken for the statement is too long, Here is the newest email. > > It still wraps a bit weird, but it's good enough I suppose. > > > retn -EINPROGRESS in mwifiex_init_fw > > mwifiex_free_adapter when in error > > These two statements don't connect. _mwifiex_fw_dpc() only treats -1 as > a true error; -EINPROGRESS is treated as success, such that we continue > to wait for the command response. Now, we might hang here if that > response doesn't come, but that's a different problem... > Hi Brain, Sorry for my unclear description. As you say, they don't have any connection. What I really want to express is after mwifiex_init_fw return -EINPROGRESS to its invoker, which is _mwifiex_fw_dpc. It will pass the check of return value, as the following code. ```cpp ret = mwifiex_init_fw(adapter); if (ret == -1) { goto err_init_fw; } else if (!ret) { adapter->hw_status = MWIFIEX_HW_STATUS_READY; goto done; } ``` it continues executing in _mwifiex_fw_dpc. Then in some unexpected situation,, it'll get into error path like mwifiex_init_channel_scan_gap return non-zero code if there is no more memory to use. It'll then get into err_init_chan_scan label and call mwifiex_free_adapte finally. The other thread may USE it afterwards. ```cpp if (mwifiex_init_channel_scan_gap(adapter)) { mwifiex_dbg(adapter, ERROR, "could not init channel stats table\n"); goto err_init_chan_scan; } ``` > I'm sure there are true bugs in here somewhere, but I've spent enough > time reading your incorrect reports and don't plan to spend more. (If > you're lucky, maybe you can pique my curiosity again, but don't count on > it.) > BUT after reviewing the code carefully, I found this might not happen due to some exclusive condition. So yes, I also think there is still some problem here but I'm kind of busy nowadays. I promise to attach a clearer report next time. So sorry to bother you so many times. And appreciate for your precious time spending on this report. Best regards, Zheng
diff --git a/drivers/net/wireless/marvell/mwifiex/cmdevt.c b/drivers/net/wireless/marvell/mwifiex/cmdevt.c index d3339d67e7a0..688dd451aba9 100644 --- a/drivers/net/wireless/marvell/mwifiex/cmdevt.c +++ b/drivers/net/wireless/marvell/mwifiex/cmdevt.c @@ -1016,8 +1016,6 @@ mwifiex_cmd_timeout_func(struct timer_list *t) if (adapter->if_ops.device_dump) adapter->if_ops.device_dump(adapter); - if (adapter->if_ops.card_reset) - adapter->if_ops.card_reset(adapter); } void diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c index 7dddb4b5dea1..ff2d447c1de3 100644 --- a/drivers/net/wireless/marvell/mwifiex/init.c +++ b/drivers/net/wireless/marvell/mwifiex/init.c @@ -47,8 +47,6 @@ static void wakeup_timer_fn(struct timer_list *t) adapter->hw_status = MWIFIEX_HW_STATUS_RESET; mwifiex_cancel_all_pending_cmd(adapter); - if (adapter->if_ops.card_reset) - adapter->if_ops.card_reset(adapter); } static void fw_dump_work(struct work_struct *work)