Message ID | 20230224115310.kernel.v2.1.If0578b001c1f12567f2ebcac5856507f1adee745@changeid |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp1114858wrd; Fri, 24 Feb 2023 11:58:54 -0800 (PST) X-Google-Smtp-Source: AK7set/ScW44kslWquHUenc2VuIx8oK0gmuWDyxpiwQvLY878tjB8RdWSgcZzvBD5ztGxvfAWZXP X-Received: by 2002:a62:1ccb:0:b0:5a9:d5c7:199a with SMTP id c194-20020a621ccb000000b005a9d5c7199amr14596562pfc.8.1677268733972; Fri, 24 Feb 2023 11:58:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677268733; cv=none; d=google.com; s=arc-20160816; b=scVy/RfQt6oXl7oUl9aJv+a5hp8uvj7tAfTPqIa4Sq6NeJXkeDl/0kh+2rL4MHS1th lPhVAczDTbUmtVRC97VJWAUO2kPe/o2HKS1dEUT4386isA2vI1MH/7/7lLBi+gtUlXLi qUG5kd15nxrItFJV/p1qJqdjQKvaEVKnTFuvjz8pB59BC/mPDBlGxnZKFJkCEBowZXri UwOKtLyx6bXotnbyrnC1k0ysDWmCfB3mVcYx5ghlk151pcMeWobcjsD65c9smNJ/FevK 3bUFq/NS6/owEui5yrVqaH21TYV8NFeYLIcnzcE05BxAfYHk+RAUILirNg1e9+KcXMzq bPQA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=lkRn+6AoxobFYVvnDPTVGQq+gGNHibmmav9mWkGQcVk=; b=Rmk/r+ER4PXdiIcZR8C1nkqQWYOFvcZqIqm0t6+Jexjqa2mROpb3PGB27tGLEVCyoz HEFU+dZfd/JxgsVtMIvblRcGqqeownf5KOO7dHQJ/GrUdvdUssoDQ1xmeokW5iaAr9R0 QU5yv1ABOTz/8yPoY0eZKOolt1WPlTbqBatpCUBQTUYLo+GX4RNLgdIMz3HqpJ9yxqD8 Snfh8MPz/RMcMUaCiID3jT6CxBKGGxVFtICD7F1r/qNKdPCKZbySAFX4qizkIC0dKn9l nWGERIalPtqcVxS45WFUszvgSuxF8vxxy51drse7bpp37g/xZoTSTosCC6UdXBB0XRBa 8ddQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=ILZLCrgZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f18-20020a056a0022d200b0056cb8f6f027si17509294pfj.357.2023.02.24.11.58.41; Fri, 24 Feb 2023 11:58:53 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=ILZLCrgZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229661AbjBXTxV (ORCPT <rfc822;jeff.pang.chn@gmail.com> + 99 others); Fri, 24 Feb 2023 14:53:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44142 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229659AbjBXTxS (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 24 Feb 2023 14:53:18 -0500 Received: from mail-pf1-x449.google.com (mail-pf1-x449.google.com [IPv6:2607:f8b0:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE84B31E02 for <linux-kernel@vger.kernel.org>; Fri, 24 Feb 2023 11:53:17 -0800 (PST) Received: by mail-pf1-x449.google.com with SMTP id 202-20020a6218d3000000b005a89856900eso59186pfy.14 for <linux-kernel@vger.kernel.org>; Fri, 24 Feb 2023 11:53:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=lkRn+6AoxobFYVvnDPTVGQq+gGNHibmmav9mWkGQcVk=; b=ILZLCrgZS1VLizZWWAK+C1MzdNr6RID8mWa3hzE+XcEW2p6iM/BvO/yRxHWiOSEEgy gYSwauIfvjWlABo2DKzMfpSLuGdnPjEuKIUgTlKzhKRh+odB1SLOlZV+yDufYDWEhxUy WsDH90DLicnkiGnUWJwiT9OWnpMSKDl7RfsPud9BUtPeR+Ks+VTMRRYFBdUOI8SB5jhY NBaABFh0QiuVy14AjA9bAdJQRChJ7oHSMOTA4UYQCdBLtb4FybP1jZsLwElPtf9IXhLF w3I2g4LJ5yfkWDEc5tNZPkNydfd6+LFhQ/rnKnA/m9qv5RR6lMdOwz5AZm9QDj0iv95E BdGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=lkRn+6AoxobFYVvnDPTVGQq+gGNHibmmav9mWkGQcVk=; b=VolZcSHY2OWK+5BoTWX1wmkNMed0u3jP65Q+b/0Hri/0lbqGJdAYTHgK3YJd1/SsBZ IHJQPIBXA0bcXg4M6KiXp7iTWXeZooAU4nOVlczW2k8ll0gEq4QVmlFG0tKhhMy5lFp3 BI4mT3hNvPuMkAzyzC2SU9/rdCMknzWRCCfdLLtE5bf+3+/rEJfaogcNRZEysJAEJ9Xl 8QeDRaBZA06aSAQL9xPAmJilMN9qHqGMnuylsu032GnlWqGGsGKDnGxfxyvv+SaGN445 mLx3hCaPP8axsRFXRCLJd5YzUKMzLoTZNgsDROYRk+aNYcl5A6Mr80fiipXvIsp8ZAgw kWeg== X-Gm-Message-State: AO0yUKVm9wSgMOLCK4v3PANzJo95NJwaTj+p/B6n89YOQRWHCYIjdwPs wOwmFp0vd7IBtJ1yL8Ts14htWhgMgpkA X-Received: from jiangzp-glinux-dev.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:4c52]) (user=jiangzp job=sendgmr) by 2002:a17:903:2348:b0:199:6e3:187a with SMTP id c8-20020a170903234800b0019906e3187amr443395plh.6.1677268397322; Fri, 24 Feb 2023 11:53:17 -0800 (PST) Date: Fri, 24 Feb 2023 11:53:13 -0800 In-Reply-To: <20230224195313.1877313-1-jiangzp@google.com> Mime-Version: 1.0 References: <20230224195313.1877313-1-jiangzp@google.com> X-Mailer: git-send-email 2.39.2.722.g9855ee24e9-goog Message-ID: <20230224115310.kernel.v2.1.If0578b001c1f12567f2ebcac5856507f1adee745@changeid> Subject: [kernel PATCH v2 1/1] Bluetooth: hci_sync: clear workqueue before clear mgmt cmd From: Zhengping Jiang <jiangzp@google.com> To: linux-bluetooth@vger.kernel.org, marcel@holtmann.org, luiz.dentz@gmail.com Cc: mmandlik@google.com, chromeos-bluetooth-upstreaming@chromium.org, Zhengping Jiang <jiangzp@google.com>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Johan Hedberg <johan.hedberg@gmail.com>, Paolo Abeni <pabeni@redhat.com>, linux-kernel@vger.kernel.org, netdev@vger.kernel.org Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1758743740438790070?= X-GMAIL-MSGID: =?utf-8?q?1758743740438790070?= |
Series |
Clear workqueue to avoid use-after-free
|
|
Commit Message
Zhengping Jiang
Feb. 24, 2023, 7:53 p.m. UTC
Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid
racing conditions which cause use-after-free.
When powering off the adapter, the mgmt cmd list will be cleared. If a
work is queued in the cmd_sync_work queue at the same time, it will
cause the risk of use-after-free, as the cmd pointer is not checked
before use.
Signed-off-by: Zhengping Jiang <jiangzp@google.com>
---
Changes in v2:
- Add function to clear the queue without stop the timer
Changes in v1:
- Clear cmd_sync_work queue before clearing the mgmt cmd list
net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
Comments
Hi Zhengping, On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > racing conditions which cause use-after-free. > > When powering off the adapter, the mgmt cmd list will be cleared. If a > work is queued in the cmd_sync_work queue at the same time, it will > cause the risk of use-after-free, as the cmd pointer is not checked > before use. > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > --- > > Changes in v2: > - Add function to clear the queue without stop the timer > > Changes in v1: > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > 1 file changed, 20 insertions(+), 1 deletion(-) > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > index 117eedb6f709..b70365dfff0c 100644 > --- a/net/bluetooth/hci_sync.c > +++ b/net/bluetooth/hci_sync.c > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > } > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > +{ > + struct hci_cmd_sync_work_entry *entry, *tmp; > + > + mutex_lock(&hdev->cmd_sync_work_lock); > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > + if (entry->destroy) { > + hci_req_sync_lock(hdev); > + entry->destroy(hdev, entry->data, -ECANCELED); > + hci_req_sync_unlock(hdev); > + } > + list_del(&entry->list); > + kfree(entry); > + } > + mutex_unlock(&hdev->cmd_sync_work_lock); > +} > + > void hci_cmd_sync_clear(struct hci_dev *hdev) > { > struct hci_cmd_sync_work_entry *entry, *tmp; > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > - hci_dev_test_flag(hdev, HCI_MGMT)) > + hci_dev_test_flag(hdev, HCI_MGMT)) { > + hci_pend_cmd_sync_clear(hdev); Any particular reason why you are not using hci_cmd_sync_clear instead? We also may want to move the clearing logic to hci_dev_close_sync since it should be equivalent to hci_request_cancel_all. > __mgmt_power_off(hdev); > + } > > hci_inquiry_cache_flush(hdev); > hci_pend_le_actions_clear(hdev); > -- > 2.39.2.722.g9855ee24e9-goog >
Hi Luiz, > Any particular reason why you are not using hci_cmd_sync_clear > instead? That is a good question and we used hci_cmd_sync_clear in the first version, but it will clear the queue and also close the timer. As a result, when the adapter is turned on again, the timer will not schedule any new jobs. So the option is to use hci_cmd_sync_clear and re-initiate the queue or to write a new function which only clears the queue. > We also may want to move the clearing logic to > hci_dev_close_sync since it should be equivalent to > hci_request_cancel_all. I actually have a question here. I saw "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought it should force clearing the cmd_sync queue. But it seems cannot prevent the use-after-free situation. Any suggestions to improve the solution? Thanks, Zhengping On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote: > > Hi Zhengping, > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > racing conditions which cause use-after-free. > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > work is queued in the cmd_sync_work queue at the same time, it will > > cause the risk of use-after-free, as the cmd pointer is not checked > > before use. > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > --- > > > > Changes in v2: > > - Add function to clear the queue without stop the timer > > > > Changes in v1: > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > index 117eedb6f709..b70365dfff0c 100644 > > --- a/net/bluetooth/hci_sync.c > > +++ b/net/bluetooth/hci_sync.c > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > } > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > +{ > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > + > > + mutex_lock(&hdev->cmd_sync_work_lock); > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > + if (entry->destroy) { > > + hci_req_sync_lock(hdev); > > + entry->destroy(hdev, entry->data, -ECANCELED); > > + hci_req_sync_unlock(hdev); > > + } > > + list_del(&entry->list); > > + kfree(entry); > > + } > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > +} > > + > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > { > > struct hci_cmd_sync_work_entry *entry, *tmp; > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > + hci_pend_cmd_sync_clear(hdev); > > Any particular reason why you are not using hci_cmd_sync_clear > instead? We also may want to move the clearing logic to > hci_dev_close_sync since it should be equivalent to > hci_request_cancel_all. > > > __mgmt_power_off(hdev); > > + } > > > > hci_inquiry_cache_flush(hdev); > > hci_pend_le_actions_clear(hdev); > > -- > > 2.39.2.722.g9855ee24e9-goog > > > > > -- > Luiz Augusto von Dentz
Hi Luiz, I have a question. Given that each command in the cmd_sync queue should clean up the memory in a callback function. I was wondering if the call to cmd_complete_rsp in __mgmt_power_off function is still necessary? Will this always risk a race condition that cmd has been released when the complete callback or _sync function is run? Thanks, Zhengping On Fri, Feb 24, 2023 at 2:37 PM Zhengping Jiang <jiangzp@google.com> wrote: > > Hi Luiz, > > > Any particular reason why you are not using hci_cmd_sync_clear > > instead? > > That is a good question and we used hci_cmd_sync_clear in the first > version, but it will clear the queue and also close the timer. As a > result, when the adapter is turned on again, the timer will not > schedule any new jobs. So the option is to use hci_cmd_sync_clear and > re-initiate the queue or to write a new function which only clears the > queue. > > > We also may want to move the clearing logic to > > hci_dev_close_sync since it should be equivalent to > > hci_request_cancel_all. > > I actually have a question here. I saw > "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought > it should force clearing the cmd_sync queue. But it seems cannot > prevent the use-after-free situation. > > Any suggestions to improve the solution? > > Thanks, > Zhengping > > > On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz > <luiz.dentz@gmail.com> wrote: > > > > Hi Zhengping, > > > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > > racing conditions which cause use-after-free. > > > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > > work is queued in the cmd_sync_work queue at the same time, it will > > > cause the risk of use-after-free, as the cmd pointer is not checked > > > before use. > > > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > > --- > > > > > > Changes in v2: > > > - Add function to clear the queue without stop the timer > > > > > > Changes in v1: > > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > > index 117eedb6f709..b70365dfff0c 100644 > > > --- a/net/bluetooth/hci_sync.c > > > +++ b/net/bluetooth/hci_sync.c > > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > > } > > > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > > +{ > > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > > + > > > + mutex_lock(&hdev->cmd_sync_work_lock); > > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > > + if (entry->destroy) { > > > + hci_req_sync_lock(hdev); > > > + entry->destroy(hdev, entry->data, -ECANCELED); > > > + hci_req_sync_unlock(hdev); > > > + } > > > + list_del(&entry->list); > > > + kfree(entry); > > > + } > > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > > +} > > > + > > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > > { > > > struct hci_cmd_sync_work_entry *entry, *tmp; > > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > > + hci_pend_cmd_sync_clear(hdev); > > > > Any particular reason why you are not using hci_cmd_sync_clear > > instead? We also may want to move the clearing logic to > > hci_dev_close_sync since it should be equivalent to > > hci_request_cancel_all. > > > > > __mgmt_power_off(hdev); > > > + } > > > > > > hci_inquiry_cache_flush(hdev); > > > hci_pend_le_actions_clear(hdev); > > > -- > > > 2.39.2.722.g9855ee24e9-goog > > > > > > > > > -- > > Luiz Augusto von Dentz
Hi Zhengping, On Sun, Feb 26, 2023 at 11:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > Hi Luiz, > > I have a question. Given that each command in the cmd_sync queue > should clean up the memory in a callback function. I was wondering if > the call to cmd_complete_rsp in __mgmt_power_off function is still > necessary? Will this always risk a race condition that cmd has been > released when the complete callback or _sync function is run? Not sure I follow you here, do you have a stack trace when the user after free occurs? > Thanks, > Zhengping > > On Fri, Feb 24, 2023 at 2:37 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > Hi Luiz, > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > instead? > > > > That is a good question and we used hci_cmd_sync_clear in the first > > version, but it will clear the queue and also close the timer. As a > > result, when the adapter is turned on again, the timer will not > > schedule any new jobs. So the option is to use hci_cmd_sync_clear and > > re-initiate the queue or to write a new function which only clears the > > queue. > > > > > We also may want to move the clearing logic to > > > hci_dev_close_sync since it should be equivalent to > > > hci_request_cancel_all. > > > > I actually have a question here. I saw > > "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought > > it should force clearing the cmd_sync queue. But it seems cannot > > prevent the use-after-free situation. > > > > Any suggestions to improve the solution? > > > > Thanks, > > Zhengping > > > > > > On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz > > <luiz.dentz@gmail.com> wrote: > > > > > > Hi Zhengping, > > > > > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > > > racing conditions which cause use-after-free. > > > > > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > > > work is queued in the cmd_sync_work queue at the same time, it will > > > > cause the risk of use-after-free, as the cmd pointer is not checked > > > > before use. > > > > > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > > > --- > > > > > > > > Changes in v2: > > > > - Add function to clear the queue without stop the timer > > > > > > > > Changes in v1: > > > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > > > index 117eedb6f709..b70365dfff0c 100644 > > > > --- a/net/bluetooth/hci_sync.c > > > > +++ b/net/bluetooth/hci_sync.c > > > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > > > } > > > > > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > > > +{ > > > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > > > + > > > > + mutex_lock(&hdev->cmd_sync_work_lock); > > > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > > > + if (entry->destroy) { > > > > + hci_req_sync_lock(hdev); > > > > + entry->destroy(hdev, entry->data, -ECANCELED); > > > > + hci_req_sync_unlock(hdev); > > > > + } > > > > + list_del(&entry->list); > > > > + kfree(entry); > > > > + } > > > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > > > +} > > > > + > > > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > > > { > > > > struct hci_cmd_sync_work_entry *entry, *tmp; > > > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > > > + hci_pend_cmd_sync_clear(hdev); > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > instead? We also may want to move the clearing logic to > > > hci_dev_close_sync since it should be equivalent to > > > hci_request_cancel_all. > > > > > > > __mgmt_power_off(hdev); > > > > + } > > > > > > > > hci_inquiry_cache_flush(hdev); > > > > hci_pend_le_actions_clear(hdev); > > > > -- > > > > 2.39.2.722.g9855ee24e9-goog > > > > > > > > > > > > > -- > > > Luiz Augusto von Dentz
Hi Luiz, Sure. Hope this helps. Here is one log from the user. [ 53.368740] ================================================================== [ 53.376167] BUG: KASAN: use-after-free in set_ssp_sync+0x44/0x154 [bluetooth] [ 53.384303] Read of size 8 at addr ffffff80b7ee0318 by task kworker/u17:0/190 [ 53.396342] CPU: 7 PID: 190 Comm: kworker/u17:0 Tainted: G W 5.15.59-lockdep #1 29eed131ef0afd42bc369a6a0ca1c69d3653699b [ 53.408868] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD platform (rev5+) (DT) [ 53.417095] Workqueue: hci0 hci_cmd_sync_work [bluetooth] [ 53.422780] Call trace: [ 53.425310] dump_backtrace+0x0/0x424 [ 53.429108] show_stack+0x20/0x2c [ 53.432534] dump_stack_lvl+0x84/0xb4 [ 53.436514] print_address_description+0x30/0x2fc [ 53.441369] kasan_report+0x15c/0x19c [ 53.445975] __asan_report_load8_noabort+0x44/0x50 [ 53.450910] set_ssp_sync+0x44/0x154 [bluetooth 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] [ 53.460136] hci_cmd_sync_work+0x1c8/0x2c8 [bluetooth 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] [ 53.472214] process_one_work+0x59c/0xa88 [ 53.476990] worker_thread+0x81c/0xd18 [ 53.480854] kthread+0x2d4/0x3d8 [ 53.484272] ret_from_fork+0x10/0x20 [ 53.489733] Allocated by task 1162: [ 53.493336] kasan_save_stack+0x38/0x68 [ 53.498115] __kasan_kmalloc+0xb4/0xd0 [ 53.501993] kmem_cache_alloc_trace+0x29c/0x374 [ 53.506661] mgmt_pending_new+0x74/0x200 [bluetooth] [ 53.511905] mgmt_pending_add+0x28/0xec [bluetooth] [ 53.517059] set_ssp+0x2d8/0x5b0 [bluetooth] [ 53.521575] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] [ 53.526538] hci_sock_sendmsg+0x28c/0x95c [bluetooth] [ 53.531850] sock_sendmsg+0xb4/0xd8 [ 53.535454] sock_write_iter+0x1c0/0x2d0 [ 53.539494] do_iter_readv_writev+0x350/0x4e0 [ 53.543980] do_iter_write+0xf0/0x2e4 [ 53.547747] vfs_writev+0xd0/0x13c [ 53.551254] do_writev+0xe8/0x1fc [ 53.554672] __arm64_sys_writev+0x84/0x98 [ 53.558805] invoke_syscall+0x78/0x20c [ 53.562665] el0_svc_common+0x12c/0x2f0 [ 53.566618] do_el0_svc+0x94/0x13c [ 53.570125] el0_svc+0x5c/0x108 [ 53.573374] el0t_64_sync_handler+0x78/0x108 [ 53.577773] el0t_64_sync+0x1a4/0x1a8 [ 53.583089] Freed by task 3207: [ 53.586325] kasan_save_stack+0x38/0x68 [ 53.590282] kasan_set_track+0x28/0x3c [ 53.594153] kasan_set_free_info+0x28/0x4c [ 53.598369] ____kasan_slab_free+0x138/0x17c [ 53.602767] __kasan_slab_free+0x18/0x28 [ 53.606803] slab_free_freelist_hook+0x188/0x260 [ 53.611559] kfree+0x138/0x29c [ 53.614708] mgmt_pending_free+0xac/0xdc [bluetooth] [ 53.619948] mgmt_pending_remove+0xd8/0xf0 [bluetooth] [ 53.625357] cmd_complete_rsp+0xc8/0x178 [bluetooth] [ 53.630586] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] [ 53.636076] __mgmt_power_off+0x114/0x26c [bluetooth] [ 53.641390] hci_dev_close_sync+0x314/0x814 [bluetooth] [ 53.646882] hci_dev_do_close+0x3c/0x7c [bluetooth] [ 53.652017] hci_dev_close+0xa4/0x15c [bluetooth] [ 53.656980] hci_sock_ioctl+0x298/0x444 [bluetooth] [ 53.662117] sock_do_ioctl+0xd0/0x1e8 [ 53.665900] sock_ioctl+0x4fc/0x72c [ 53.669500] __arm64_sys_ioctl+0x118/0x154 [ 53.673726] invoke_syscall+0x78/0x20c [ 53.677587] el0_svc_common+0x12c/0x2f0 [ 53.681533] do_el0_svc+0x94/0x13c [ 53.685043] el0_svc+0x5c/0x108 [ 53.688278] el0t_64_sync_handler+0x78/0x108 [ 53.692677] el0t_64_sync+0x1a4/0x1a8 [ 53.697988] Last potentially related work creation: [ 53.703009] kasan_save_stack+0x38/0x68 [ 53.706962] kasan_record_aux_stack+0x104/0x130 [ 53.711622] __call_rcu+0x14c/0x860 [ 53.715212] call_rcu+0x18/0x24 [ 53.718448] sk_filter_uncharge+0xc0/0x120 [ 53.722667] __sk_destruct+0xb4/0x4a8 [ 53.726435] sk_destruct+0x78/0xa0 [ 53.729941] __sk_free+0x190/0x270 [ 53.733453] sk_free+0x54/0x8c [ 53.736603] deferred_put_nlk_sk+0x1d4/0x20c [ 53.741000] rcu_do_batch+0x3e8/0xd08 [ 53.744772] nocb_cb_wait+0xc8/0xa3c [ 53.748453] rcu_nocb_cb_kthread+0x48/0x134 [ 53.752768] kthread+0x2d4/0x3d8 [ 53.756098] ret_from_fork+0x10/0x20 This is another one at a different function but with the same signature. [ 43.363512] ================================================================== [ 43.370966] BUG: KASAN: use-after-free in mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth] [ 43.379813] Read of size 8 at addr ffffff8096c28d18 by task kworker/u17:0/192 [ 43.387158] [ 43.388705] CPU: 6 PID: 192 Comm: kworker/u17:0 Tainted: G W 5.15.59-lockdep #1 59f35e3dfc07f6688b084869895c7a39892c891a localhost ~ # [ 43.410184] Workqueue: hci0 hci_cmd_sync_work [bluetooth] [ 43.418887] Call trace: [ 43.422407] dump_backtrace+0x0/0x424 [ 43.426191] show_stack+0x20/0x2c [ 43.429608] dump_stack_lvl+0x84/0xb4 [ 43.433395] print_address_description+0x30/0x2fc [ 43.438243] kasan_report+0x15c/0x19c [ 43.442070] __asan_report_load8_noabort+0x44/0x50 hciconfig hci0 up [ 43.447009] mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth 8dae3a82177133cfa9626e7322b3b0c8f665102d] [ 43.458568] hci_cmd_sync_work+0x1bc/0x2bc [bluetooth 8dae3a82177133cfa9626e7322b3b0c8f665102d] [ 43.467656] process_one_work+0x59c/0xa88 [ 43.472530] worker_thread+0x81c/0xd18 [ 43.476410] kthread+0x2d4/0x3d8 localhost ~ # [ 43.479753] ret_from_fork+0x10/0x20 [ 43.486588] [ 43.488156] Allocated by task 1118: [ 43.491751] kasan_save_stack+0x38/0x68 [ 43.495709] __kasan_kmalloc+0xb4/0xd0 [ 43.499577] kmem_cache_alloc_trace+0x29c/0x374 [ 43.504238] mgmt_pending_new+0x74/0x200 [bluetooth] sleep 2[ 43.509509] mgmt_pending_add+0x28/0xec [bluetooth] [ 43.515244] remove_adv_monitor+0xf8/0x174 [bluetooth] [ 43.521533] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] [ 43.526527] hci_sock_sendmsg+0x28c/0x95c [bluetooth] [ 43.531873] sock_sendmsg+0xb4/0xd8 [ 43.535472] sock_write_iter+0x1c0/0x2d0 [ 43.539519] do_iter_readv_writev+0x350/0x4e0 [ 43.544012] do_iter_write+0xf0/0x2e4 [ 43.547788] vfs_writev+0xd0/0x13c [ 43.551295] do_writev+0xe8/0x1fc [ 43.554710] __arm64_sys_writev+0x84/0x98 [ 43.558838] invoke_syscall+0x78/0x20c [ 43.562709] el0_svc_common+0x12c/0x2f0 [ 43.566654] do_el0_svc+0x94/0x13c [ 43.570155] el0_svc+0x5c/0x108 [ 43.573391] el0t_64_sync_handler+0x78/0x108 [ 43.577785] el0t_64_sync+0x1a4/0x1a8 [ 43.581564] [ 43.583115] Freed by task 3217: [ 43.586356] kasan_save_stack+0x38/0x68 [ 43.590314] kasan_set_track+0x28/0x3c [ 43.594180] kasan_set_free_info+0x28/0x4c [ 43.598396] ____kasan_slab_free+0x138/0x17c [ 43.602794] __kasan_slab_free+0x18/0x28 [ 43.606838] slab_free_freelist_hook+0x188/0x260 [ 43.611591] kfree+0x138/0x29c [ 43.614741] mgmt_pending_free+0xac/0xdc [bluetooth] [ 43.620003] mgmt_pending_remove+0xd8/0xf0 [bluetooth] [ 43.625434] cmd_complete_rsp+0xc8/0x178 [bluetooth] [ 43.630686] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] [ 43.636198] __mgmt_power_off+0x114/0x26c [bluetooth] [ 43.641532] hci_dev_close_sync+0x2ec/0x7ec [bluetooth] [ 43.647049] hci_dev_do_close+0x3c/0x7c [bluetooth] [ 43.652209] hci_dev_close+0xac/0x164 [bluetooth] [ 43.657190] hci_sock_ioctl+0x298/0x444 [bluetooth] [ 43.662353] sock_do_ioctl+0xd0/0x1e8 [ 43.666134] sock_ioctl+0x4fc/0x72c [ 43.669736] __arm64_sys_ioctl+0x118/0x154 [ 43.673961] invoke_syscall+0x78/0x20c [ 43.677820] el0_svc_common+0x12c/0x2f0 [ 43.681770] do_el0_svc+0x94/0x13c [ 43.685278] el0_svc+0x5c/0x108 [ 43.688514] el0t_64_sync_handler+0x78/0x108 [ 43.692913] el0t_64_sync+0x1a4/0x1a8 Thanks, Zhengping On Mon, Feb 27, 2023 at 3:41 PM Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote: > > Hi Zhengping, > > On Sun, Feb 26, 2023 at 11:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > Hi Luiz, > > > > I have a question. Given that each command in the cmd_sync queue > > should clean up the memory in a callback function. I was wondering if > > the call to cmd_complete_rsp in __mgmt_power_off function is still > > necessary? Will this always risk a race condition that cmd has been > > released when the complete callback or _sync function is run? > > Not sure I follow you here, do you have a stack trace when the user > after free occurs? > > > Thanks, > > Zhengping > > > > On Fri, Feb 24, 2023 at 2:37 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > Hi Luiz, > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > instead? > > > > > > That is a good question and we used hci_cmd_sync_clear in the first > > > version, but it will clear the queue and also close the timer. As a > > > result, when the adapter is turned on again, the timer will not > > > schedule any new jobs. So the option is to use hci_cmd_sync_clear and > > > re-initiate the queue or to write a new function which only clears the > > > queue. > > > > > > > We also may want to move the clearing logic to > > > > hci_dev_close_sync since it should be equivalent to > > > > hci_request_cancel_all. > > > > > > I actually have a question here. I saw > > > "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought > > > it should force clearing the cmd_sync queue. But it seems cannot > > > prevent the use-after-free situation. > > > > > > Any suggestions to improve the solution? > > > > > > Thanks, > > > Zhengping > > > > > > > > > On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > Hi Zhengping, > > > > > > > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > > > > racing conditions which cause use-after-free. > > > > > > > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > > > > work is queued in the cmd_sync_work queue at the same time, it will > > > > > cause the risk of use-after-free, as the cmd pointer is not checked > > > > > before use. > > > > > > > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > > > > --- > > > > > > > > > > Changes in v2: > > > > > - Add function to clear the queue without stop the timer > > > > > > > > > > Changes in v1: > > > > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > > > > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > > > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > > > > index 117eedb6f709..b70365dfff0c 100644 > > > > > --- a/net/bluetooth/hci_sync.c > > > > > +++ b/net/bluetooth/hci_sync.c > > > > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > > > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > > > > } > > > > > > > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > > > > +{ > > > > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > + > > > > > + mutex_lock(&hdev->cmd_sync_work_lock); > > > > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > > > > + if (entry->destroy) { > > > > > + hci_req_sync_lock(hdev); > > > > > + entry->destroy(hdev, entry->data, -ECANCELED); > > > > > + hci_req_sync_unlock(hdev); > > > > > + } > > > > > + list_del(&entry->list); > > > > > + kfree(entry); > > > > > + } > > > > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > > > > +} > > > > > + > > > > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > > > > { > > > > > struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > > > > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > > > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > > > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > > > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > > > > + hci_pend_cmd_sync_clear(hdev); > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > instead? We also may want to move the clearing logic to > > > > hci_dev_close_sync since it should be equivalent to > > > > hci_request_cancel_all. > > > > > > > > > __mgmt_power_off(hdev); > > > > > + } > > > > > > > > > > hci_inquiry_cache_flush(hdev); > > > > > hci_pend_le_actions_clear(hdev); > > > > > -- > > > > > 2.39.2.722.g9855ee24e9-goog > > > > > > > > > > > > > > > > > -- > > > > Luiz Augusto von Dentz > > > > -- > Luiz Augusto von Dentz
Hi Zhengping, On Mon, Feb 27, 2023 at 3:58 PM Zhengping Jiang <jiangzp@google.com> wrote: > > Hi Luiz, > > Sure. Hope this helps. > Here is one log from the user. > > [ 53.368740] ================================================================== > [ 53.376167] BUG: KASAN: use-after-free in set_ssp_sync+0x44/0x154 [bluetooth] > [ 53.384303] Read of size 8 at addr ffffff80b7ee0318 by task kworker/u17:0/190 > [ 53.396342] CPU: 7 PID: 190 Comm: kworker/u17:0 Tainted: G W > 5.15.59-lockdep #1 29eed131ef0afd42bc369a6a0ca1c69d3653699b > [ 53.408868] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD > platform (rev5+) (DT) > [ 53.417095] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > [ 53.422780] Call trace: > [ 53.425310] dump_backtrace+0x0/0x424 > [ 53.429108] show_stack+0x20/0x2c > > [ 53.432534] dump_stack_lvl+0x84/0xb4 > [ 53.436514] print_address_description+0x30/0x2fc > [ 53.441369] kasan_report+0x15c/0x19c > [ 53.445975] __asan_report_load8_noabort+0x44/0x50 > [ 53.450910] set_ssp_sync+0x44/0x154 [bluetooth > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > [ 53.460136] hci_cmd_sync_work+0x1c8/0x2c8 [bluetooth > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > [ 53.472214] process_one_work+0x59c/0xa88 > [ 53.476990] worker_thread+0x81c/0xd18 > [ 53.480854] kthread+0x2d4/0x3d8 > [ 53.484272] ret_from_fork+0x10/0x20 > > [ 53.489733] Allocated by task 1162: > [ 53.493336] kasan_save_stack+0x38/0x68 > [ 53.498115] __kasan_kmalloc+0xb4/0xd0 > [ 53.501993] kmem_cache_alloc_trace+0x29c/0x374 > [ 53.506661] mgmt_pending_new+0x74/0x200 [bluetooth] > [ 53.511905] mgmt_pending_add+0x28/0xec [bluetooth] > [ 53.517059] set_ssp+0x2d8/0x5b0 [bluetooth] > [ 53.521575] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > [ 53.526538] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > [ 53.531850] sock_sendmsg+0xb4/0xd8 > [ 53.535454] sock_write_iter+0x1c0/0x2d0 > [ 53.539494] do_iter_readv_writev+0x350/0x4e0 > [ 53.543980] do_iter_write+0xf0/0x2e4 > [ 53.547747] vfs_writev+0xd0/0x13c > [ 53.551254] do_writev+0xe8/0x1fc > [ 53.554672] __arm64_sys_writev+0x84/0x98 > [ 53.558805] invoke_syscall+0x78/0x20c > [ 53.562665] el0_svc_common+0x12c/0x2f0 > [ 53.566618] do_el0_svc+0x94/0x13c > [ 53.570125] el0_svc+0x5c/0x108 > [ 53.573374] el0t_64_sync_handler+0x78/0x108 > [ 53.577773] el0t_64_sync+0x1a4/0x1a8 > > [ 53.583089] Freed by task 3207: > [ 53.586325] kasan_save_stack+0x38/0x68 > [ 53.590282] kasan_set_track+0x28/0x3c > [ 53.594153] kasan_set_free_info+0x28/0x4c > [ 53.598369] ____kasan_slab_free+0x138/0x17c > [ 53.602767] __kasan_slab_free+0x18/0x28 > [ 53.606803] slab_free_freelist_hook+0x188/0x260 > [ 53.611559] kfree+0x138/0x29c > [ 53.614708] mgmt_pending_free+0xac/0xdc [bluetooth] > [ 53.619948] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > [ 53.625357] cmd_complete_rsp+0xc8/0x178 [bluetooth] > [ 53.630586] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > [ 53.636076] __mgmt_power_off+0x114/0x26c [bluetooth] > [ 53.641390] hci_dev_close_sync+0x314/0x814 [bluetooth] > [ 53.646882] hci_dev_do_close+0x3c/0x7c [bluetooth] > [ 53.652017] hci_dev_close+0xa4/0x15c [bluetooth] > [ 53.656980] hci_sock_ioctl+0x298/0x444 [bluetooth] > [ 53.662117] sock_do_ioctl+0xd0/0x1e8 > [ 53.665900] sock_ioctl+0x4fc/0x72c > [ 53.669500] __arm64_sys_ioctl+0x118/0x154 > [ 53.673726] invoke_syscall+0x78/0x20c > [ 53.677587] el0_svc_common+0x12c/0x2f0 > [ 53.681533] do_el0_svc+0x94/0x13c > [ 53.685043] el0_svc+0x5c/0x108 > [ 53.688278] el0t_64_sync_handler+0x78/0x108 > [ 53.692677] el0t_64_sync+0x1a4/0x1a8 > > [ 53.697988] Last potentially related work creation: > [ 53.703009] kasan_save_stack+0x38/0x68 > [ 53.706962] kasan_record_aux_stack+0x104/0x130 > [ 53.711622] __call_rcu+0x14c/0x860 > [ 53.715212] call_rcu+0x18/0x24 > [ 53.718448] sk_filter_uncharge+0xc0/0x120 > [ 53.722667] __sk_destruct+0xb4/0x4a8 > [ 53.726435] sk_destruct+0x78/0xa0 > [ 53.729941] __sk_free+0x190/0x270 > [ 53.733453] sk_free+0x54/0x8c > [ 53.736603] deferred_put_nlk_sk+0x1d4/0x20c > [ 53.741000] rcu_do_batch+0x3e8/0xd08 > [ 53.744772] nocb_cb_wait+0xc8/0xa3c > [ 53.748453] rcu_nocb_cb_kthread+0x48/0x134 > [ 53.752768] kthread+0x2d4/0x3d8 > [ 53.756098] ret_from_fork+0x10/0x20 > > This is another one at a different function but with the same signature. > > [ 43.363512] ================================================================== > [ 43.370966] BUG: KASAN: use-after-free in > mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth] > [ 43.379813] Read of size 8 at addr ffffff8096c28d18 by task kworker/u17:0/192 > [ 43.387158] > [ 43.388705] CPU: 6 PID: 192 Comm: kworker/u17:0 Tainted: G W > 5.15.59-lockdep #1 59f35e3dfc07f6688b084869895c7a39892c891a > localhost ~ # [ 43.410184] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > [ 43.418887] Call trace: > [ 43.422407] dump_backtrace+0x0/0x424 > [ 43.426191] show_stack+0x20/0x2c > [ 43.429608] dump_stack_lvl+0x84/0xb4 > [ 43.433395] print_address_description+0x30/0x2fc > [ 43.438243] kasan_report+0x15c/0x19c > [ 43.442070] __asan_report_load8_noabort+0x44/0x50 > hciconfig hci0 up > [ 43.447009] mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > [ 43.458568] hci_cmd_sync_work+0x1bc/0x2bc [bluetooth > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > [ 43.467656] process_one_work+0x59c/0xa88 > [ 43.472530] worker_thread+0x81c/0xd18 > [ 43.476410] kthread+0x2d4/0x3d8 > localhost ~ # [ 43.479753] ret_from_fork+0x10/0x20 > [ 43.486588] > [ 43.488156] Allocated by task 1118: > [ 43.491751] kasan_save_stack+0x38/0x68 > [ 43.495709] __kasan_kmalloc+0xb4/0xd0 > [ 43.499577] kmem_cache_alloc_trace+0x29c/0x374 > [ 43.504238] mgmt_pending_new+0x74/0x200 [bluetooth] > sleep 2[ 43.509509] mgmt_pending_add+0x28/0xec [bluetooth] > > [ 43.515244] remove_adv_monitor+0xf8/0x174 [bluetooth] > [ 43.521533] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > [ 43.526527] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > [ 43.531873] sock_sendmsg+0xb4/0xd8 > [ 43.535472] sock_write_iter+0x1c0/0x2d0 > [ 43.539519] do_iter_readv_writev+0x350/0x4e0 > [ 43.544012] do_iter_write+0xf0/0x2e4 > [ 43.547788] vfs_writev+0xd0/0x13c > [ 43.551295] do_writev+0xe8/0x1fc > [ 43.554710] __arm64_sys_writev+0x84/0x98 > [ 43.558838] invoke_syscall+0x78/0x20c > [ 43.562709] el0_svc_common+0x12c/0x2f0 > [ 43.566654] do_el0_svc+0x94/0x13c > [ 43.570155] el0_svc+0x5c/0x108 > [ 43.573391] el0t_64_sync_handler+0x78/0x108 > [ 43.577785] el0t_64_sync+0x1a4/0x1a8 > [ 43.581564] > [ 43.583115] Freed by task 3217: > [ 43.586356] kasan_save_stack+0x38/0x68 > [ 43.590314] kasan_set_track+0x28/0x3c > [ 43.594180] kasan_set_free_info+0x28/0x4c > [ 43.598396] ____kasan_slab_free+0x138/0x17c > [ 43.602794] __kasan_slab_free+0x18/0x28 > [ 43.606838] slab_free_freelist_hook+0x188/0x260 > [ 43.611591] kfree+0x138/0x29c > [ 43.614741] mgmt_pending_free+0xac/0xdc [bluetooth] > [ 43.620003] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > [ 43.625434] cmd_complete_rsp+0xc8/0x178 [bluetooth] > [ 43.630686] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > [ 43.636198] __mgmt_power_off+0x114/0x26c [bluetooth] > [ 43.641532] hci_dev_close_sync+0x2ec/0x7ec [bluetooth] > [ 43.647049] hci_dev_do_close+0x3c/0x7c [bluetooth] > [ 43.652209] hci_dev_close+0xac/0x164 [bluetooth] > [ 43.657190] hci_sock_ioctl+0x298/0x444 [bluetooth] > [ 43.662353] sock_do_ioctl+0xd0/0x1e8 > [ 43.666134] sock_ioctl+0x4fc/0x72c > [ 43.669736] __arm64_sys_ioctl+0x118/0x154 > [ 43.673961] invoke_syscall+0x78/0x20c > [ 43.677820] el0_svc_common+0x12c/0x2f0 > [ 43.681770] do_el0_svc+0x94/0x13c > [ 43.685278] el0_svc+0x5c/0x108 > [ 43.688514] el0t_64_sync_handler+0x78/0x108 > [ 43.692913] el0t_64_sync+0x1a4/0x1a8 > > Thanks, > Zhengping Ok, how about we do something like the following: https://gist.github.com/Vudentz/365d664275e4d2e2af157e47f0502f50 The actual real culprit seem to be __mgmt_power_off does cleanup mgmt_pending but that is still accessible via cmd_sync_work_list, this is probably how hci_request was designed but in case of cmd_sync we normally have the data as part of cmd_sync_work_list. > On Mon, Feb 27, 2023 at 3:41 PM Luiz Augusto von Dentz > <luiz.dentz@gmail.com> wrote: > > > > Hi Zhengping, > > > > On Sun, Feb 26, 2023 at 11:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > Hi Luiz, > > > > > > I have a question. Given that each command in the cmd_sync queue > > > should clean up the memory in a callback function. I was wondering if > > > the call to cmd_complete_rsp in __mgmt_power_off function is still > > > necessary? Will this always risk a race condition that cmd has been > > > released when the complete callback or _sync function is run? > > > > Not sure I follow you here, do you have a stack trace when the user > > after free occurs? > > > > > Thanks, > > > Zhengping > > > > > > On Fri, Feb 24, 2023 at 2:37 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > Hi Luiz, > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > instead? > > > > > > > > That is a good question and we used hci_cmd_sync_clear in the first > > > > version, but it will clear the queue and also close the timer. As a > > > > result, when the adapter is turned on again, the timer will not > > > > schedule any new jobs. So the option is to use hci_cmd_sync_clear and > > > > re-initiate the queue or to write a new function which only clears the > > > > queue. > > > > > > > > > We also may want to move the clearing logic to > > > > > hci_dev_close_sync since it should be equivalent to > > > > > hci_request_cancel_all. > > > > > > > > I actually have a question here. I saw > > > > "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought > > > > it should force clearing the cmd_sync queue. But it seems cannot > > > > prevent the use-after-free situation. > > > > > > > > Any suggestions to improve the solution? > > > > > > > > Thanks, > > > > Zhengping > > > > > > > > > > > > On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > Hi Zhengping, > > > > > > > > > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > > > > > racing conditions which cause use-after-free. > > > > > > > > > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > > > > > work is queued in the cmd_sync_work queue at the same time, it will > > > > > > cause the risk of use-after-free, as the cmd pointer is not checked > > > > > > before use. > > > > > > > > > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > > > > > --- > > > > > > > > > > > > Changes in v2: > > > > > > - Add function to clear the queue without stop the timer > > > > > > > > > > > > Changes in v1: > > > > > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > > > > > > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > > > > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > > > > > index 117eedb6f709..b70365dfff0c 100644 > > > > > > --- a/net/bluetooth/hci_sync.c > > > > > > +++ b/net/bluetooth/hci_sync.c > > > > > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > > > > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > > > > > } > > > > > > > > > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > > > > > +{ > > > > > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > + > > > > > > + mutex_lock(&hdev->cmd_sync_work_lock); > > > > > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > > > > > + if (entry->destroy) { > > > > > > + hci_req_sync_lock(hdev); > > > > > > + entry->destroy(hdev, entry->data, -ECANCELED); > > > > > > + hci_req_sync_unlock(hdev); > > > > > > + } > > > > > > + list_del(&entry->list); > > > > > > + kfree(entry); > > > > > > + } > > > > > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > > > > > +} > > > > > > + > > > > > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > > > > > { > > > > > > struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > > > > > > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > > > > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > > > > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > > > > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > > > > > + hci_pend_cmd_sync_clear(hdev); > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > instead? We also may want to move the clearing logic to > > > > > hci_dev_close_sync since it should be equivalent to > > > > > hci_request_cancel_all. > > > > > > > > > > > __mgmt_power_off(hdev); > > > > > > + } > > > > > > > > > > > > hci_inquiry_cache_flush(hdev); > > > > > > hci_pend_le_actions_clear(hdev); > > > > > > -- > > > > > > 2.39.2.722.g9855ee24e9-goog > > > > > > > > > > > > > > > > > > > > > -- > > > > > Luiz Augusto von Dentz > > > > > > > > -- > > Luiz Augusto von Dentz
Hi Luiz, This looks good to me. I still have a question. Does this prevent a job scheduled between "hci_cmd_sync_work_list_clear(hdev);" and "__mgmt_power_off(hdev);"? Otherwise, the chance for a race condition is still there. Maybe using cancel_work_sync and re-init the workqueue timer is the right thing to do? Thanks, Zhengping On Tue, Feb 28, 2023 at 1:11 PM Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote: > > Hi Zhengping, > > On Mon, Feb 27, 2023 at 3:58 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > Hi Luiz, > > > > Sure. Hope this helps. > > Here is one log from the user. > > > > [ 53.368740] ================================================================== > > [ 53.376167] BUG: KASAN: use-after-free in set_ssp_sync+0x44/0x154 [bluetooth] > > [ 53.384303] Read of size 8 at addr ffffff80b7ee0318 by task kworker/u17:0/190 > > [ 53.396342] CPU: 7 PID: 190 Comm: kworker/u17:0 Tainted: G W > > 5.15.59-lockdep #1 29eed131ef0afd42bc369a6a0ca1c69d3653699b > > [ 53.408868] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD > > platform (rev5+) (DT) > > [ 53.417095] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > [ 53.422780] Call trace: > > [ 53.425310] dump_backtrace+0x0/0x424 > > [ 53.429108] show_stack+0x20/0x2c > > > > [ 53.432534] dump_stack_lvl+0x84/0xb4 > > [ 53.436514] print_address_description+0x30/0x2fc > > [ 53.441369] kasan_report+0x15c/0x19c > > [ 53.445975] __asan_report_load8_noabort+0x44/0x50 > > [ 53.450910] set_ssp_sync+0x44/0x154 [bluetooth > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > [ 53.460136] hci_cmd_sync_work+0x1c8/0x2c8 [bluetooth > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > [ 53.472214] process_one_work+0x59c/0xa88 > > [ 53.476990] worker_thread+0x81c/0xd18 > > [ 53.480854] kthread+0x2d4/0x3d8 > > [ 53.484272] ret_from_fork+0x10/0x20 > > > > [ 53.489733] Allocated by task 1162: > > [ 53.493336] kasan_save_stack+0x38/0x68 > > [ 53.498115] __kasan_kmalloc+0xb4/0xd0 > > [ 53.501993] kmem_cache_alloc_trace+0x29c/0x374 > > [ 53.506661] mgmt_pending_new+0x74/0x200 [bluetooth] > > [ 53.511905] mgmt_pending_add+0x28/0xec [bluetooth] > > [ 53.517059] set_ssp+0x2d8/0x5b0 [bluetooth] > > [ 53.521575] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > [ 53.526538] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > [ 53.531850] sock_sendmsg+0xb4/0xd8 > > [ 53.535454] sock_write_iter+0x1c0/0x2d0 > > [ 53.539494] do_iter_readv_writev+0x350/0x4e0 > > [ 53.543980] do_iter_write+0xf0/0x2e4 > > [ 53.547747] vfs_writev+0xd0/0x13c > > [ 53.551254] do_writev+0xe8/0x1fc > > [ 53.554672] __arm64_sys_writev+0x84/0x98 > > [ 53.558805] invoke_syscall+0x78/0x20c > > [ 53.562665] el0_svc_common+0x12c/0x2f0 > > [ 53.566618] do_el0_svc+0x94/0x13c > > [ 53.570125] el0_svc+0x5c/0x108 > > [ 53.573374] el0t_64_sync_handler+0x78/0x108 > > [ 53.577773] el0t_64_sync+0x1a4/0x1a8 > > > > [ 53.583089] Freed by task 3207: > > [ 53.586325] kasan_save_stack+0x38/0x68 > > [ 53.590282] kasan_set_track+0x28/0x3c > > [ 53.594153] kasan_set_free_info+0x28/0x4c > > [ 53.598369] ____kasan_slab_free+0x138/0x17c > > [ 53.602767] __kasan_slab_free+0x18/0x28 > > [ 53.606803] slab_free_freelist_hook+0x188/0x260 > > [ 53.611559] kfree+0x138/0x29c > > [ 53.614708] mgmt_pending_free+0xac/0xdc [bluetooth] > > [ 53.619948] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > [ 53.625357] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > [ 53.630586] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > [ 53.636076] __mgmt_power_off+0x114/0x26c [bluetooth] > > [ 53.641390] hci_dev_close_sync+0x314/0x814 [bluetooth] > > [ 53.646882] hci_dev_do_close+0x3c/0x7c [bluetooth] > > [ 53.652017] hci_dev_close+0xa4/0x15c [bluetooth] > > [ 53.656980] hci_sock_ioctl+0x298/0x444 [bluetooth] > > [ 53.662117] sock_do_ioctl+0xd0/0x1e8 > > [ 53.665900] sock_ioctl+0x4fc/0x72c > > [ 53.669500] __arm64_sys_ioctl+0x118/0x154 > > [ 53.673726] invoke_syscall+0x78/0x20c > > [ 53.677587] el0_svc_common+0x12c/0x2f0 > > [ 53.681533] do_el0_svc+0x94/0x13c > > [ 53.685043] el0_svc+0x5c/0x108 > > [ 53.688278] el0t_64_sync_handler+0x78/0x108 > > [ 53.692677] el0t_64_sync+0x1a4/0x1a8 > > > > [ 53.697988] Last potentially related work creation: > > [ 53.703009] kasan_save_stack+0x38/0x68 > > [ 53.706962] kasan_record_aux_stack+0x104/0x130 > > [ 53.711622] __call_rcu+0x14c/0x860 > > [ 53.715212] call_rcu+0x18/0x24 > > [ 53.718448] sk_filter_uncharge+0xc0/0x120 > > [ 53.722667] __sk_destruct+0xb4/0x4a8 > > [ 53.726435] sk_destruct+0x78/0xa0 > > [ 53.729941] __sk_free+0x190/0x270 > > [ 53.733453] sk_free+0x54/0x8c > > [ 53.736603] deferred_put_nlk_sk+0x1d4/0x20c > > [ 53.741000] rcu_do_batch+0x3e8/0xd08 > > [ 53.744772] nocb_cb_wait+0xc8/0xa3c > > [ 53.748453] rcu_nocb_cb_kthread+0x48/0x134 > > [ 53.752768] kthread+0x2d4/0x3d8 > > [ 53.756098] ret_from_fork+0x10/0x20 > > > > This is another one at a different function but with the same signature. > > > > [ 43.363512] ================================================================== > > [ 43.370966] BUG: KASAN: use-after-free in > > mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth] > > [ 43.379813] Read of size 8 at addr ffffff8096c28d18 by task kworker/u17:0/192 > > [ 43.387158] > > [ 43.388705] CPU: 6 PID: 192 Comm: kworker/u17:0 Tainted: G W > > 5.15.59-lockdep #1 59f35e3dfc07f6688b084869895c7a39892c891a > > localhost ~ # [ 43.410184] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > > > [ 43.418887] Call trace: > > [ 43.422407] dump_backtrace+0x0/0x424 > > [ 43.426191] show_stack+0x20/0x2c > > [ 43.429608] dump_stack_lvl+0x84/0xb4 > > [ 43.433395] print_address_description+0x30/0x2fc > > [ 43.438243] kasan_report+0x15c/0x19c > > [ 43.442070] __asan_report_load8_noabort+0x44/0x50 > > hciconfig hci0 up > > [ 43.447009] mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > [ 43.458568] hci_cmd_sync_work+0x1bc/0x2bc [bluetooth > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > [ 43.467656] process_one_work+0x59c/0xa88 > > [ 43.472530] worker_thread+0x81c/0xd18 > > [ 43.476410] kthread+0x2d4/0x3d8 > > localhost ~ # [ 43.479753] ret_from_fork+0x10/0x20 > > [ 43.486588] > > [ 43.488156] Allocated by task 1118: > > [ 43.491751] kasan_save_stack+0x38/0x68 > > [ 43.495709] __kasan_kmalloc+0xb4/0xd0 > > [ 43.499577] kmem_cache_alloc_trace+0x29c/0x374 > > [ 43.504238] mgmt_pending_new+0x74/0x200 [bluetooth] > > sleep 2[ 43.509509] mgmt_pending_add+0x28/0xec [bluetooth] > > > > [ 43.515244] remove_adv_monitor+0xf8/0x174 [bluetooth] > > [ 43.521533] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > [ 43.526527] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > [ 43.531873] sock_sendmsg+0xb4/0xd8 > > [ 43.535472] sock_write_iter+0x1c0/0x2d0 > > [ 43.539519] do_iter_readv_writev+0x350/0x4e0 > > [ 43.544012] do_iter_write+0xf0/0x2e4 > > [ 43.547788] vfs_writev+0xd0/0x13c > > [ 43.551295] do_writev+0xe8/0x1fc > > [ 43.554710] __arm64_sys_writev+0x84/0x98 > > [ 43.558838] invoke_syscall+0x78/0x20c > > [ 43.562709] el0_svc_common+0x12c/0x2f0 > > [ 43.566654] do_el0_svc+0x94/0x13c > > [ 43.570155] el0_svc+0x5c/0x108 > > [ 43.573391] el0t_64_sync_handler+0x78/0x108 > > [ 43.577785] el0t_64_sync+0x1a4/0x1a8 > > [ 43.581564] > > [ 43.583115] Freed by task 3217: > > [ 43.586356] kasan_save_stack+0x38/0x68 > > [ 43.590314] kasan_set_track+0x28/0x3c > > [ 43.594180] kasan_set_free_info+0x28/0x4c > > [ 43.598396] ____kasan_slab_free+0x138/0x17c > > [ 43.602794] __kasan_slab_free+0x18/0x28 > > [ 43.606838] slab_free_freelist_hook+0x188/0x260 > > [ 43.611591] kfree+0x138/0x29c > > [ 43.614741] mgmt_pending_free+0xac/0xdc [bluetooth] > > [ 43.620003] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > [ 43.625434] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > [ 43.630686] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > [ 43.636198] __mgmt_power_off+0x114/0x26c [bluetooth] > > [ 43.641532] hci_dev_close_sync+0x2ec/0x7ec [bluetooth] > > [ 43.647049] hci_dev_do_close+0x3c/0x7c [bluetooth] > > [ 43.652209] hci_dev_close+0xac/0x164 [bluetooth] > > [ 43.657190] hci_sock_ioctl+0x298/0x444 [bluetooth] > > [ 43.662353] sock_do_ioctl+0xd0/0x1e8 > > [ 43.666134] sock_ioctl+0x4fc/0x72c > > [ 43.669736] __arm64_sys_ioctl+0x118/0x154 > > [ 43.673961] invoke_syscall+0x78/0x20c > > [ 43.677820] el0_svc_common+0x12c/0x2f0 > > [ 43.681770] do_el0_svc+0x94/0x13c > > [ 43.685278] el0_svc+0x5c/0x108 > > [ 43.688514] el0t_64_sync_handler+0x78/0x108 > > [ 43.692913] el0t_64_sync+0x1a4/0x1a8 > > > > Thanks, > > Zhengping > > Ok, how about we do something like the following: > > https://gist.github.com/Vudentz/365d664275e4d2e2af157e47f0502f50 > > The actual real culprit seem to be __mgmt_power_off does cleanup > mgmt_pending but that is still accessible via cmd_sync_work_list, this > is probably how hci_request was designed but in case of cmd_sync we > normally have the data as part of cmd_sync_work_list. > > > On Mon, Feb 27, 2023 at 3:41 PM Luiz Augusto von Dentz > > <luiz.dentz@gmail.com> wrote: > > > > > > Hi Zhengping, > > > > > > On Sun, Feb 26, 2023 at 11:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > Hi Luiz, > > > > > > > > I have a question. Given that each command in the cmd_sync queue > > > > should clean up the memory in a callback function. I was wondering if > > > > the call to cmd_complete_rsp in __mgmt_power_off function is still > > > > necessary? Will this always risk a race condition that cmd has been > > > > released when the complete callback or _sync function is run? > > > > > > Not sure I follow you here, do you have a stack trace when the user > > > after free occurs? > > > > > > > Thanks, > > > > Zhengping > > > > > > > > On Fri, Feb 24, 2023 at 2:37 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > Hi Luiz, > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > instead? > > > > > > > > > > That is a good question and we used hci_cmd_sync_clear in the first > > > > > version, but it will clear the queue and also close the timer. As a > > > > > result, when the adapter is turned on again, the timer will not > > > > > schedule any new jobs. So the option is to use hci_cmd_sync_clear and > > > > > re-initiate the queue or to write a new function which only clears the > > > > > queue. > > > > > > > > > > > We also may want to move the clearing logic to > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > hci_request_cancel_all. > > > > > > > > > > I actually have a question here. I saw > > > > > "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought > > > > > it should force clearing the cmd_sync queue. But it seems cannot > > > > > prevent the use-after-free situation. > > > > > > > > > > Any suggestions to improve the solution? > > > > > > > > > > Thanks, > > > > > Zhengping > > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz > > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > > > Hi Zhengping, > > > > > > > > > > > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > > > > > > racing conditions which cause use-after-free. > > > > > > > > > > > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > > > > > > work is queued in the cmd_sync_work queue at the same time, it will > > > > > > > cause the risk of use-after-free, as the cmd pointer is not checked > > > > > > > before use. > > > > > > > > > > > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > > > > > > --- > > > > > > > > > > > > > > Changes in v2: > > > > > > > - Add function to clear the queue without stop the timer > > > > > > > > > > > > > > Changes in v1: > > > > > > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > > > > > > > > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > > > > > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > > > > > > index 117eedb6f709..b70365dfff0c 100644 > > > > > > > --- a/net/bluetooth/hci_sync.c > > > > > > > +++ b/net/bluetooth/hci_sync.c > > > > > > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > > > > > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > > > > > > } > > > > > > > > > > > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > +{ > > > > > > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > + > > > > > > > + mutex_lock(&hdev->cmd_sync_work_lock); > > > > > > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > > > > > > + if (entry->destroy) { > > > > > > > + hci_req_sync_lock(hdev); > > > > > > > + entry->destroy(hdev, entry->data, -ECANCELED); > > > > > > > + hci_req_sync_unlock(hdev); > > > > > > > + } > > > > > > > + list_del(&entry->list); > > > > > > > + kfree(entry); > > > > > > > + } > > > > > > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > > > > > > +} > > > > > > > + > > > > > > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > { > > > > > > > struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > > > > > > > > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > > > > > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > > > > > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > > > > > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > > > > > > + hci_pend_cmd_sync_clear(hdev); > > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > instead? We also may want to move the clearing logic to > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > hci_request_cancel_all. > > > > > > > > > > > > > __mgmt_power_off(hdev); > > > > > > > + } > > > > > > > > > > > > > > hci_inquiry_cache_flush(hdev); > > > > > > > hci_pend_le_actions_clear(hdev); > > > > > > > -- > > > > > > > 2.39.2.722.g9855ee24e9-goog > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Luiz Augusto von Dentz > > > > > > > > > > > > -- > > > Luiz Augusto von Dentz > > > > -- > Luiz Augusto von Dentz
Hi Zhengping, On Tue, Feb 28, 2023 at 4:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > Hi Luiz, > > This looks good to me. I still have a question. Does this prevent a > job scheduled between "hci_cmd_sync_work_list_clear(hdev);" and > "__mgmt_power_off(hdev);"? Otherwise, the chance for a race condition > is still there. Maybe using cancel_work_sync and re-init the workqueue > timer is the right thing to do? I tried the cancel_work_sync but it doesn't work since to cmd_sync_work itself can call hci_dev_close_sync so it deadlocks, Ive also tried stopping new scheduling of new work based on HCI_UP flag but that causes some tests not to run, perhaps we need a dedicated flag to ensure cmd_sync cannot be schedule after a certain point but I could found the exact point it is, anyway I fine leaving this to when we actually have a more clear understanding or a reproducer. > Thanks, > Zhengping > > On Tue, Feb 28, 2023 at 1:11 PM Luiz Augusto von Dentz > <luiz.dentz@gmail.com> wrote: > > > > Hi Zhengping, > > > > On Mon, Feb 27, 2023 at 3:58 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > Hi Luiz, > > > > > > Sure. Hope this helps. > > > Here is one log from the user. > > > > > > [ 53.368740] ================================================================== > > > [ 53.376167] BUG: KASAN: use-after-free in set_ssp_sync+0x44/0x154 [bluetooth] > > > [ 53.384303] Read of size 8 at addr ffffff80b7ee0318 by task kworker/u17:0/190 > > > [ 53.396342] CPU: 7 PID: 190 Comm: kworker/u17:0 Tainted: G W > > > 5.15.59-lockdep #1 29eed131ef0afd42bc369a6a0ca1c69d3653699b > > > [ 53.408868] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD > > > platform (rev5+) (DT) > > > [ 53.417095] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > > [ 53.422780] Call trace: > > > [ 53.425310] dump_backtrace+0x0/0x424 > > > [ 53.429108] show_stack+0x20/0x2c > > > > > > [ 53.432534] dump_stack_lvl+0x84/0xb4 > > > [ 53.436514] print_address_description+0x30/0x2fc > > > [ 53.441369] kasan_report+0x15c/0x19c > > > [ 53.445975] __asan_report_load8_noabort+0x44/0x50 > > > [ 53.450910] set_ssp_sync+0x44/0x154 [bluetooth > > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > > [ 53.460136] hci_cmd_sync_work+0x1c8/0x2c8 [bluetooth > > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > > [ 53.472214] process_one_work+0x59c/0xa88 > > > [ 53.476990] worker_thread+0x81c/0xd18 > > > [ 53.480854] kthread+0x2d4/0x3d8 > > > [ 53.484272] ret_from_fork+0x10/0x20 > > > > > > [ 53.489733] Allocated by task 1162: > > > [ 53.493336] kasan_save_stack+0x38/0x68 > > > [ 53.498115] __kasan_kmalloc+0xb4/0xd0 > > > [ 53.501993] kmem_cache_alloc_trace+0x29c/0x374 > > > [ 53.506661] mgmt_pending_new+0x74/0x200 [bluetooth] > > > [ 53.511905] mgmt_pending_add+0x28/0xec [bluetooth] > > > [ 53.517059] set_ssp+0x2d8/0x5b0 [bluetooth] > > > [ 53.521575] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > > [ 53.526538] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > > [ 53.531850] sock_sendmsg+0xb4/0xd8 > > > [ 53.535454] sock_write_iter+0x1c0/0x2d0 > > > [ 53.539494] do_iter_readv_writev+0x350/0x4e0 > > > [ 53.543980] do_iter_write+0xf0/0x2e4 > > > [ 53.547747] vfs_writev+0xd0/0x13c > > > [ 53.551254] do_writev+0xe8/0x1fc > > > [ 53.554672] __arm64_sys_writev+0x84/0x98 > > > [ 53.558805] invoke_syscall+0x78/0x20c > > > [ 53.562665] el0_svc_common+0x12c/0x2f0 > > > [ 53.566618] do_el0_svc+0x94/0x13c > > > [ 53.570125] el0_svc+0x5c/0x108 > > > [ 53.573374] el0t_64_sync_handler+0x78/0x108 > > > [ 53.577773] el0t_64_sync+0x1a4/0x1a8 > > > > > > [ 53.583089] Freed by task 3207: > > > [ 53.586325] kasan_save_stack+0x38/0x68 > > > [ 53.590282] kasan_set_track+0x28/0x3c > > > [ 53.594153] kasan_set_free_info+0x28/0x4c > > > [ 53.598369] ____kasan_slab_free+0x138/0x17c > > > [ 53.602767] __kasan_slab_free+0x18/0x28 > > > [ 53.606803] slab_free_freelist_hook+0x188/0x260 > > > [ 53.611559] kfree+0x138/0x29c > > > [ 53.614708] mgmt_pending_free+0xac/0xdc [bluetooth] > > > [ 53.619948] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > > [ 53.625357] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > > [ 53.630586] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > > [ 53.636076] __mgmt_power_off+0x114/0x26c [bluetooth] > > > [ 53.641390] hci_dev_close_sync+0x314/0x814 [bluetooth] > > > [ 53.646882] hci_dev_do_close+0x3c/0x7c [bluetooth] > > > [ 53.652017] hci_dev_close+0xa4/0x15c [bluetooth] > > > [ 53.656980] hci_sock_ioctl+0x298/0x444 [bluetooth] > > > [ 53.662117] sock_do_ioctl+0xd0/0x1e8 > > > [ 53.665900] sock_ioctl+0x4fc/0x72c > > > [ 53.669500] __arm64_sys_ioctl+0x118/0x154 > > > [ 53.673726] invoke_syscall+0x78/0x20c > > > [ 53.677587] el0_svc_common+0x12c/0x2f0 > > > [ 53.681533] do_el0_svc+0x94/0x13c > > > [ 53.685043] el0_svc+0x5c/0x108 > > > [ 53.688278] el0t_64_sync_handler+0x78/0x108 > > > [ 53.692677] el0t_64_sync+0x1a4/0x1a8 > > > > > > [ 53.697988] Last potentially related work creation: > > > [ 53.703009] kasan_save_stack+0x38/0x68 > > > [ 53.706962] kasan_record_aux_stack+0x104/0x130 > > > [ 53.711622] __call_rcu+0x14c/0x860 > > > [ 53.715212] call_rcu+0x18/0x24 > > > [ 53.718448] sk_filter_uncharge+0xc0/0x120 > > > [ 53.722667] __sk_destruct+0xb4/0x4a8 > > > [ 53.726435] sk_destruct+0x78/0xa0 > > > [ 53.729941] __sk_free+0x190/0x270 > > > [ 53.733453] sk_free+0x54/0x8c > > > [ 53.736603] deferred_put_nlk_sk+0x1d4/0x20c > > > [ 53.741000] rcu_do_batch+0x3e8/0xd08 > > > [ 53.744772] nocb_cb_wait+0xc8/0xa3c > > > [ 53.748453] rcu_nocb_cb_kthread+0x48/0x134 > > > [ 53.752768] kthread+0x2d4/0x3d8 > > > [ 53.756098] ret_from_fork+0x10/0x20 > > > > > > This is another one at a different function but with the same signature. > > > > > > [ 43.363512] ================================================================== > > > [ 43.370966] BUG: KASAN: use-after-free in > > > mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth] > > > [ 43.379813] Read of size 8 at addr ffffff8096c28d18 by task kworker/u17:0/192 > > > [ 43.387158] > > > [ 43.388705] CPU: 6 PID: 192 Comm: kworker/u17:0 Tainted: G W > > > 5.15.59-lockdep #1 59f35e3dfc07f6688b084869895c7a39892c891a > > > localhost ~ # [ 43.410184] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > > > > > [ 43.418887] Call trace: > > > [ 43.422407] dump_backtrace+0x0/0x424 > > > [ 43.426191] show_stack+0x20/0x2c > > > [ 43.429608] dump_stack_lvl+0x84/0xb4 > > > [ 43.433395] print_address_description+0x30/0x2fc > > > [ 43.438243] kasan_report+0x15c/0x19c > > > [ 43.442070] __asan_report_load8_noabort+0x44/0x50 > > > hciconfig hci0 up > > > [ 43.447009] mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth > > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > > [ 43.458568] hci_cmd_sync_work+0x1bc/0x2bc [bluetooth > > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > > [ 43.467656] process_one_work+0x59c/0xa88 > > > [ 43.472530] worker_thread+0x81c/0xd18 > > > [ 43.476410] kthread+0x2d4/0x3d8 > > > localhost ~ # [ 43.479753] ret_from_fork+0x10/0x20 > > > [ 43.486588] > > > [ 43.488156] Allocated by task 1118: > > > [ 43.491751] kasan_save_stack+0x38/0x68 > > > [ 43.495709] __kasan_kmalloc+0xb4/0xd0 > > > [ 43.499577] kmem_cache_alloc_trace+0x29c/0x374 > > > [ 43.504238] mgmt_pending_new+0x74/0x200 [bluetooth] > > > sleep 2[ 43.509509] mgmt_pending_add+0x28/0xec [bluetooth] > > > > > > [ 43.515244] remove_adv_monitor+0xf8/0x174 [bluetooth] > > > [ 43.521533] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > > [ 43.526527] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > > [ 43.531873] sock_sendmsg+0xb4/0xd8 > > > [ 43.535472] sock_write_iter+0x1c0/0x2d0 > > > [ 43.539519] do_iter_readv_writev+0x350/0x4e0 > > > [ 43.544012] do_iter_write+0xf0/0x2e4 > > > [ 43.547788] vfs_writev+0xd0/0x13c > > > [ 43.551295] do_writev+0xe8/0x1fc > > > [ 43.554710] __arm64_sys_writev+0x84/0x98 > > > [ 43.558838] invoke_syscall+0x78/0x20c > > > [ 43.562709] el0_svc_common+0x12c/0x2f0 > > > [ 43.566654] do_el0_svc+0x94/0x13c > > > [ 43.570155] el0_svc+0x5c/0x108 > > > [ 43.573391] el0t_64_sync_handler+0x78/0x108 > > > [ 43.577785] el0t_64_sync+0x1a4/0x1a8 > > > [ 43.581564] > > > [ 43.583115] Freed by task 3217: > > > [ 43.586356] kasan_save_stack+0x38/0x68 > > > [ 43.590314] kasan_set_track+0x28/0x3c > > > [ 43.594180] kasan_set_free_info+0x28/0x4c > > > [ 43.598396] ____kasan_slab_free+0x138/0x17c > > > [ 43.602794] __kasan_slab_free+0x18/0x28 > > > [ 43.606838] slab_free_freelist_hook+0x188/0x260 > > > [ 43.611591] kfree+0x138/0x29c > > > [ 43.614741] mgmt_pending_free+0xac/0xdc [bluetooth] > > > [ 43.620003] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > > [ 43.625434] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > > [ 43.630686] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > > [ 43.636198] __mgmt_power_off+0x114/0x26c [bluetooth] > > > [ 43.641532] hci_dev_close_sync+0x2ec/0x7ec [bluetooth] > > > [ 43.647049] hci_dev_do_close+0x3c/0x7c [bluetooth] > > > [ 43.652209] hci_dev_close+0xac/0x164 [bluetooth] > > > [ 43.657190] hci_sock_ioctl+0x298/0x444 [bluetooth] > > > [ 43.662353] sock_do_ioctl+0xd0/0x1e8 > > > [ 43.666134] sock_ioctl+0x4fc/0x72c > > > [ 43.669736] __arm64_sys_ioctl+0x118/0x154 > > > [ 43.673961] invoke_syscall+0x78/0x20c > > > [ 43.677820] el0_svc_common+0x12c/0x2f0 > > > [ 43.681770] do_el0_svc+0x94/0x13c > > > [ 43.685278] el0_svc+0x5c/0x108 > > > [ 43.688514] el0t_64_sync_handler+0x78/0x108 > > > [ 43.692913] el0t_64_sync+0x1a4/0x1a8 > > > > > > Thanks, > > > Zhengping > > > > Ok, how about we do something like the following: > > > > https://gist.github.com/Vudentz/365d664275e4d2e2af157e47f0502f50 > > > > The actual real culprit seem to be __mgmt_power_off does cleanup > > mgmt_pending but that is still accessible via cmd_sync_work_list, this > > is probably how hci_request was designed but in case of cmd_sync we > > normally have the data as part of cmd_sync_work_list. > > > > > On Mon, Feb 27, 2023 at 3:41 PM Luiz Augusto von Dentz > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > Hi Zhengping, > > > > > > > > On Sun, Feb 26, 2023 at 11:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > Hi Luiz, > > > > > > > > > > I have a question. Given that each command in the cmd_sync queue > > > > > should clean up the memory in a callback function. I was wondering if > > > > > the call to cmd_complete_rsp in __mgmt_power_off function is still > > > > > necessary? Will this always risk a race condition that cmd has been > > > > > released when the complete callback or _sync function is run? > > > > > > > > Not sure I follow you here, do you have a stack trace when the user > > > > after free occurs? > > > > > > > > > Thanks, > > > > > Zhengping > > > > > > > > > > On Fri, Feb 24, 2023 at 2:37 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > Hi Luiz, > > > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > > instead? > > > > > > > > > > > > That is a good question and we used hci_cmd_sync_clear in the first > > > > > > version, but it will clear the queue and also close the timer. As a > > > > > > result, when the adapter is turned on again, the timer will not > > > > > > schedule any new jobs. So the option is to use hci_cmd_sync_clear and > > > > > > re-initiate the queue or to write a new function which only clears the > > > > > > queue. > > > > > > > > > > > > > We also may want to move the clearing logic to > > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > > hci_request_cancel_all. > > > > > > > > > > > > I actually have a question here. I saw > > > > > > "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought > > > > > > it should force clearing the cmd_sync queue. But it seems cannot > > > > > > prevent the use-after-free situation. > > > > > > > > > > > > Any suggestions to improve the solution? > > > > > > > > > > > > Thanks, > > > > > > Zhengping > > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz > > > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > > > > > Hi Zhengping, > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > > > > > > > racing conditions which cause use-after-free. > > > > > > > > > > > > > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > > > > > > > work is queued in the cmd_sync_work queue at the same time, it will > > > > > > > > cause the risk of use-after-free, as the cmd pointer is not checked > > > > > > > > before use. > > > > > > > > > > > > > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > > > > > > > --- > > > > > > > > > > > > > > > > Changes in v2: > > > > > > > > - Add function to clear the queue without stop the timer > > > > > > > > > > > > > > > > Changes in v1: > > > > > > > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > > > > > > > > > > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > > > > > > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > > > > > > > index 117eedb6f709..b70365dfff0c 100644 > > > > > > > > --- a/net/bluetooth/hci_sync.c > > > > > > > > +++ b/net/bluetooth/hci_sync.c > > > > > > > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > > > > > > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > > > > > > > } > > > > > > > > > > > > > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > > +{ > > > > > > > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > > + > > > > > > > > + mutex_lock(&hdev->cmd_sync_work_lock); > > > > > > > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > > > > > > > + if (entry->destroy) { > > > > > > > > + hci_req_sync_lock(hdev); > > > > > > > > + entry->destroy(hdev, entry->data, -ECANCELED); > > > > > > > > + hci_req_sync_unlock(hdev); > > > > > > > > + } > > > > > > > > + list_del(&entry->list); > > > > > > > > + kfree(entry); > > > > > > > > + } > > > > > > > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > > > > > > > +} > > > > > > > > + > > > > > > > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > > { > > > > > > > > struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > > > > > > > > > > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > > > > > > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > > > > > > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > > > > > > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > > > > > > > + hci_pend_cmd_sync_clear(hdev); > > > > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > > instead? We also may want to move the clearing logic to > > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > > hci_request_cancel_all. > > > > > > > > > > > > > > > __mgmt_power_off(hdev); > > > > > > > > + } > > > > > > > > > > > > > > > > hci_inquiry_cache_flush(hdev); > > > > > > > > hci_pend_le_actions_clear(hdev); > > > > > > > > -- > > > > > > > > 2.39.2.722.g9855ee24e9-goog > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Luiz Augusto von Dentz > > > > > > > > > > > > > > > > -- > > > > Luiz Augusto von Dentz > > > > > > > > -- > > Luiz Augusto von Dentz
Hi Luiz, Thanks for testing these options! > perhaps we need a dedicated flag to ensure cmd_sync cannot be schedule after a certain point This actually sounds promising to me. I would think about this. This does not happen in regular use, but one of our customers has a script to run a stress test by turning on/off the adapter and rebooting for a few cycles. Then the crash can be reproduced. If you have any new ideas, I can schedule a test. Just to confirm if you will submit the current patch or you would hold it for a solid solution? The current patch to clear the cmd_sync list indeed reduces the crash frequency. Best, Zhengping On Tue, Feb 28, 2023 at 5:53 PM Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote: > > Hi Zhengping, > > On Tue, Feb 28, 2023 at 4:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > Hi Luiz, > > > > This looks good to me. I still have a question. Does this prevent a > > job scheduled between "hci_cmd_sync_work_list_clear(hdev);" and > > "__mgmt_power_off(hdev);"? Otherwise, the chance for a race condition > > is still there. Maybe using cancel_work_sync and re-init the workqueue > > timer is the right thing to do? > > I tried the cancel_work_sync but it doesn't work since to > cmd_sync_work itself can call hci_dev_close_sync so it deadlocks, Ive > also tried stopping new scheduling of new work based on HCI_UP flag > but that causes some tests not to run, perhaps we need a dedicated > flag to ensure cmd_sync cannot be schedule after a certain point but I > could found the exact point it is, anyway I fine leaving this to when > we actually have a more clear understanding or a reproducer. > > > Thanks, > > Zhengping > > > > On Tue, Feb 28, 2023 at 1:11 PM Luiz Augusto von Dentz > > <luiz.dentz@gmail.com> wrote: > > > > > > Hi Zhengping, > > > > > > On Mon, Feb 27, 2023 at 3:58 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > Hi Luiz, > > > > > > > > Sure. Hope this helps. > > > > Here is one log from the user. > > > > > > > > [ 53.368740] ================================================================== > > > > [ 53.376167] BUG: KASAN: use-after-free in set_ssp_sync+0x44/0x154 [bluetooth] > > > > [ 53.384303] Read of size 8 at addr ffffff80b7ee0318 by task kworker/u17:0/190 > > > > [ 53.396342] CPU: 7 PID: 190 Comm: kworker/u17:0 Tainted: G W > > > > 5.15.59-lockdep #1 29eed131ef0afd42bc369a6a0ca1c69d3653699b > > > > [ 53.408868] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD > > > > platform (rev5+) (DT) > > > > [ 53.417095] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > > > [ 53.422780] Call trace: > > > > [ 53.425310] dump_backtrace+0x0/0x424 > > > > [ 53.429108] show_stack+0x20/0x2c > > > > > > > > [ 53.432534] dump_stack_lvl+0x84/0xb4 > > > > [ 53.436514] print_address_description+0x30/0x2fc > > > > [ 53.441369] kasan_report+0x15c/0x19c > > > > [ 53.445975] __asan_report_load8_noabort+0x44/0x50 > > > > [ 53.450910] set_ssp_sync+0x44/0x154 [bluetooth > > > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > > > [ 53.460136] hci_cmd_sync_work+0x1c8/0x2c8 [bluetooth > > > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > > > [ 53.472214] process_one_work+0x59c/0xa88 > > > > [ 53.476990] worker_thread+0x81c/0xd18 > > > > [ 53.480854] kthread+0x2d4/0x3d8 > > > > [ 53.484272] ret_from_fork+0x10/0x20 > > > > > > > > [ 53.489733] Allocated by task 1162: > > > > [ 53.493336] kasan_save_stack+0x38/0x68 > > > > [ 53.498115] __kasan_kmalloc+0xb4/0xd0 > > > > [ 53.501993] kmem_cache_alloc_trace+0x29c/0x374 > > > > [ 53.506661] mgmt_pending_new+0x74/0x200 [bluetooth] > > > > [ 53.511905] mgmt_pending_add+0x28/0xec [bluetooth] > > > > [ 53.517059] set_ssp+0x2d8/0x5b0 [bluetooth] > > > > [ 53.521575] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > > > [ 53.526538] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > > > [ 53.531850] sock_sendmsg+0xb4/0xd8 > > > > [ 53.535454] sock_write_iter+0x1c0/0x2d0 > > > > [ 53.539494] do_iter_readv_writev+0x350/0x4e0 > > > > [ 53.543980] do_iter_write+0xf0/0x2e4 > > > > [ 53.547747] vfs_writev+0xd0/0x13c > > > > [ 53.551254] do_writev+0xe8/0x1fc > > > > [ 53.554672] __arm64_sys_writev+0x84/0x98 > > > > [ 53.558805] invoke_syscall+0x78/0x20c > > > > [ 53.562665] el0_svc_common+0x12c/0x2f0 > > > > [ 53.566618] do_el0_svc+0x94/0x13c > > > > [ 53.570125] el0_svc+0x5c/0x108 > > > > [ 53.573374] el0t_64_sync_handler+0x78/0x108 > > > > [ 53.577773] el0t_64_sync+0x1a4/0x1a8 > > > > > > > > [ 53.583089] Freed by task 3207: > > > > [ 53.586325] kasan_save_stack+0x38/0x68 > > > > [ 53.590282] kasan_set_track+0x28/0x3c > > > > [ 53.594153] kasan_set_free_info+0x28/0x4c > > > > [ 53.598369] ____kasan_slab_free+0x138/0x17c > > > > [ 53.602767] __kasan_slab_free+0x18/0x28 > > > > [ 53.606803] slab_free_freelist_hook+0x188/0x260 > > > > [ 53.611559] kfree+0x138/0x29c > > > > [ 53.614708] mgmt_pending_free+0xac/0xdc [bluetooth] > > > > [ 53.619948] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > > > [ 53.625357] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > > > [ 53.630586] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > > > [ 53.636076] __mgmt_power_off+0x114/0x26c [bluetooth] > > > > [ 53.641390] hci_dev_close_sync+0x314/0x814 [bluetooth] > > > > [ 53.646882] hci_dev_do_close+0x3c/0x7c [bluetooth] > > > > [ 53.652017] hci_dev_close+0xa4/0x15c [bluetooth] > > > > [ 53.656980] hci_sock_ioctl+0x298/0x444 [bluetooth] > > > > [ 53.662117] sock_do_ioctl+0xd0/0x1e8 > > > > [ 53.665900] sock_ioctl+0x4fc/0x72c > > > > [ 53.669500] __arm64_sys_ioctl+0x118/0x154 > > > > [ 53.673726] invoke_syscall+0x78/0x20c > > > > [ 53.677587] el0_svc_common+0x12c/0x2f0 > > > > [ 53.681533] do_el0_svc+0x94/0x13c > > > > [ 53.685043] el0_svc+0x5c/0x108 > > > > [ 53.688278] el0t_64_sync_handler+0x78/0x108 > > > > [ 53.692677] el0t_64_sync+0x1a4/0x1a8 > > > > > > > > [ 53.697988] Last potentially related work creation: > > > > [ 53.703009] kasan_save_stack+0x38/0x68 > > > > [ 53.706962] kasan_record_aux_stack+0x104/0x130 > > > > [ 53.711622] __call_rcu+0x14c/0x860 > > > > [ 53.715212] call_rcu+0x18/0x24 > > > > [ 53.718448] sk_filter_uncharge+0xc0/0x120 > > > > [ 53.722667] __sk_destruct+0xb4/0x4a8 > > > > [ 53.726435] sk_destruct+0x78/0xa0 > > > > [ 53.729941] __sk_free+0x190/0x270 > > > > [ 53.733453] sk_free+0x54/0x8c > > > > [ 53.736603] deferred_put_nlk_sk+0x1d4/0x20c > > > > [ 53.741000] rcu_do_batch+0x3e8/0xd08 > > > > [ 53.744772] nocb_cb_wait+0xc8/0xa3c > > > > [ 53.748453] rcu_nocb_cb_kthread+0x48/0x134 > > > > [ 53.752768] kthread+0x2d4/0x3d8 > > > > [ 53.756098] ret_from_fork+0x10/0x20 > > > > > > > > This is another one at a different function but with the same signature. > > > > > > > > [ 43.363512] ================================================================== > > > > [ 43.370966] BUG: KASAN: use-after-free in > > > > mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth] > > > > [ 43.379813] Read of size 8 at addr ffffff8096c28d18 by task kworker/u17:0/192 > > > > [ 43.387158] > > > > [ 43.388705] CPU: 6 PID: 192 Comm: kworker/u17:0 Tainted: G W > > > > 5.15.59-lockdep #1 59f35e3dfc07f6688b084869895c7a39892c891a > > > > localhost ~ # [ 43.410184] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > > > > > > > [ 43.418887] Call trace: > > > > [ 43.422407] dump_backtrace+0x0/0x424 > > > > [ 43.426191] show_stack+0x20/0x2c > > > > [ 43.429608] dump_stack_lvl+0x84/0xb4 > > > > [ 43.433395] print_address_description+0x30/0x2fc > > > > [ 43.438243] kasan_report+0x15c/0x19c > > > > [ 43.442070] __asan_report_load8_noabort+0x44/0x50 > > > > hciconfig hci0 up > > > > [ 43.447009] mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth > > > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > > > [ 43.458568] hci_cmd_sync_work+0x1bc/0x2bc [bluetooth > > > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > > > [ 43.467656] process_one_work+0x59c/0xa88 > > > > [ 43.472530] worker_thread+0x81c/0xd18 > > > > [ 43.476410] kthread+0x2d4/0x3d8 > > > > localhost ~ # [ 43.479753] ret_from_fork+0x10/0x20 > > > > [ 43.486588] > > > > [ 43.488156] Allocated by task 1118: > > > > [ 43.491751] kasan_save_stack+0x38/0x68 > > > > [ 43.495709] __kasan_kmalloc+0xb4/0xd0 > > > > [ 43.499577] kmem_cache_alloc_trace+0x29c/0x374 > > > > [ 43.504238] mgmt_pending_new+0x74/0x200 [bluetooth] > > > > sleep 2[ 43.509509] mgmt_pending_add+0x28/0xec [bluetooth] > > > > > > > > [ 43.515244] remove_adv_monitor+0xf8/0x174 [bluetooth] > > > > [ 43.521533] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > > > [ 43.526527] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > > > [ 43.531873] sock_sendmsg+0xb4/0xd8 > > > > [ 43.535472] sock_write_iter+0x1c0/0x2d0 > > > > [ 43.539519] do_iter_readv_writev+0x350/0x4e0 > > > > [ 43.544012] do_iter_write+0xf0/0x2e4 > > > > [ 43.547788] vfs_writev+0xd0/0x13c > > > > [ 43.551295] do_writev+0xe8/0x1fc > > > > [ 43.554710] __arm64_sys_writev+0x84/0x98 > > > > [ 43.558838] invoke_syscall+0x78/0x20c > > > > [ 43.562709] el0_svc_common+0x12c/0x2f0 > > > > [ 43.566654] do_el0_svc+0x94/0x13c > > > > [ 43.570155] el0_svc+0x5c/0x108 > > > > [ 43.573391] el0t_64_sync_handler+0x78/0x108 > > > > [ 43.577785] el0t_64_sync+0x1a4/0x1a8 > > > > [ 43.581564] > > > > [ 43.583115] Freed by task 3217: > > > > [ 43.586356] kasan_save_stack+0x38/0x68 > > > > [ 43.590314] kasan_set_track+0x28/0x3c > > > > [ 43.594180] kasan_set_free_info+0x28/0x4c > > > > [ 43.598396] ____kasan_slab_free+0x138/0x17c > > > > [ 43.602794] __kasan_slab_free+0x18/0x28 > > > > [ 43.606838] slab_free_freelist_hook+0x188/0x260 > > > > [ 43.611591] kfree+0x138/0x29c > > > > [ 43.614741] mgmt_pending_free+0xac/0xdc [bluetooth] > > > > [ 43.620003] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > > > [ 43.625434] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > > > [ 43.630686] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > > > [ 43.636198] __mgmt_power_off+0x114/0x26c [bluetooth] > > > > [ 43.641532] hci_dev_close_sync+0x2ec/0x7ec [bluetooth] > > > > [ 43.647049] hci_dev_do_close+0x3c/0x7c [bluetooth] > > > > [ 43.652209] hci_dev_close+0xac/0x164 [bluetooth] > > > > [ 43.657190] hci_sock_ioctl+0x298/0x444 [bluetooth] > > > > [ 43.662353] sock_do_ioctl+0xd0/0x1e8 > > > > [ 43.666134] sock_ioctl+0x4fc/0x72c > > > > [ 43.669736] __arm64_sys_ioctl+0x118/0x154 > > > > [ 43.673961] invoke_syscall+0x78/0x20c > > > > [ 43.677820] el0_svc_common+0x12c/0x2f0 > > > > [ 43.681770] do_el0_svc+0x94/0x13c > > > > [ 43.685278] el0_svc+0x5c/0x108 > > > > [ 43.688514] el0t_64_sync_handler+0x78/0x108 > > > > [ 43.692913] el0t_64_sync+0x1a4/0x1a8 > > > > > > > > Thanks, > > > > Zhengping > > > > > > Ok, how about we do something like the following: > > > > > > https://gist.github.com/Vudentz/365d664275e4d2e2af157e47f0502f50 > > > > > > The actual real culprit seem to be __mgmt_power_off does cleanup > > > mgmt_pending but that is still accessible via cmd_sync_work_list, this > > > is probably how hci_request was designed but in case of cmd_sync we > > > normally have the data as part of cmd_sync_work_list. > > > > > > > On Mon, Feb 27, 2023 at 3:41 PM Luiz Augusto von Dentz > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > Hi Zhengping, > > > > > > > > > > On Sun, Feb 26, 2023 at 11:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > Hi Luiz, > > > > > > > > > > > > I have a question. Given that each command in the cmd_sync queue > > > > > > should clean up the memory in a callback function. I was wondering if > > > > > > the call to cmd_complete_rsp in __mgmt_power_off function is still > > > > > > necessary? Will this always risk a race condition that cmd has been > > > > > > released when the complete callback or _sync function is run? > > > > > > > > > > Not sure I follow you here, do you have a stack trace when the user > > > > > after free occurs? > > > > > > > > > > > Thanks, > > > > > > Zhengping > > > > > > > > > > > > On Fri, Feb 24, 2023 at 2:37 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > Hi Luiz, > > > > > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > > > instead? > > > > > > > > > > > > > > That is a good question and we used hci_cmd_sync_clear in the first > > > > > > > version, but it will clear the queue and also close the timer. As a > > > > > > > result, when the adapter is turned on again, the timer will not > > > > > > > schedule any new jobs. So the option is to use hci_cmd_sync_clear and > > > > > > > re-initiate the queue or to write a new function which only clears the > > > > > > > queue. > > > > > > > > > > > > > > > We also may want to move the clearing logic to > > > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > > > hci_request_cancel_all. > > > > > > > > > > > > > > I actually have a question here. I saw > > > > > > > "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought > > > > > > > it should force clearing the cmd_sync queue. But it seems cannot > > > > > > > prevent the use-after-free situation. > > > > > > > > > > > > > > Any suggestions to improve the solution? > > > > > > > > > > > > > > Thanks, > > > > > > > Zhengping > > > > > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz > > > > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > > > > > > > Hi Zhengping, > > > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > > > > > > > > racing conditions which cause use-after-free. > > > > > > > > > > > > > > > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > > > > > > > > work is queued in the cmd_sync_work queue at the same time, it will > > > > > > > > > cause the risk of use-after-free, as the cmd pointer is not checked > > > > > > > > > before use. > > > > > > > > > > > > > > > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > > > > > > > > --- > > > > > > > > > > > > > > > > > > Changes in v2: > > > > > > > > > - Add function to clear the queue without stop the timer > > > > > > > > > > > > > > > > > > Changes in v1: > > > > > > > > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > > > > > > > > > > > > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > > > > > > > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > > > > > > > > index 117eedb6f709..b70365dfff0c 100644 > > > > > > > > > --- a/net/bluetooth/hci_sync.c > > > > > > > > > +++ b/net/bluetooth/hci_sync.c > > > > > > > > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > > > > > > > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > > > > > > > > } > > > > > > > > > > > > > > > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > > > +{ > > > > > > > > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > > > + > > > > > > > > > + mutex_lock(&hdev->cmd_sync_work_lock); > > > > > > > > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > > > > > > > > + if (entry->destroy) { > > > > > > > > > + hci_req_sync_lock(hdev); > > > > > > > > > + entry->destroy(hdev, entry->data, -ECANCELED); > > > > > > > > > + hci_req_sync_unlock(hdev); > > > > > > > > > + } > > > > > > > > > + list_del(&entry->list); > > > > > > > > > + kfree(entry); > > > > > > > > > + } > > > > > > > > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > > > > > > > > +} > > > > > > > > > + > > > > > > > > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > > > { > > > > > > > > > struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > > > > > > > > > > > > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > > > > > > > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > > > > > > > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > > > > > > > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > > > > > > > > + hci_pend_cmd_sync_clear(hdev); > > > > > > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > > > instead? We also may want to move the clearing logic to > > > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > > > hci_request_cancel_all. > > > > > > > > > > > > > > > > > __mgmt_power_off(hdev); > > > > > > > > > + } > > > > > > > > > > > > > > > > > > hci_inquiry_cache_flush(hdev); > > > > > > > > > hci_pend_le_actions_clear(hdev); > > > > > > > > > -- > > > > > > > > > 2.39.2.722.g9855ee24e9-goog > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Luiz Augusto von Dentz > > > > > > > > > > > > > > > > > > > > -- > > > > > Luiz Augusto von Dentz > > > > > > > > > > > > -- > > > Luiz Augusto von Dentz > > > > -- > Luiz Augusto von Dentz
Hi Luiz, I have revisited this issue and asked our partner to do some more tests. I think the root cause is the use of hdev->mgmt_pending. There are some mgmt_pending_cmd, which would be released at callbacks. For example: > err = hci_cmd_sync_queue(hdev, mgmt_remove_adv_monitor_sync, cmd, mgmt_remove_adv_monitor_complete); mgmt_remove_adv_monitor_complete will release the cmd memory by calling mgmt_pending_remove(cmd). In this case, the call to mgmt_pending_foreach(0, hdev, cmd_complete_rsp, &status) at __mgmt_power_off will double free the memory at a race condition. A quick solution is to detect and skip some opcode at cmd_complete_rsp if a command is consistently released by callback. > static void cmd_complete_rsp(struct mgmt_pending_cmd *cmd, void *data) > { > if (cmd->opcode == MGMT_OP_REMOVE_ADV_MONITOR || > cmd->opcode == MGMT_OP_SET_SSP) > return; To fully remove the race condition, maybe using two lists is necessary. What do you think about this proposal? Thanks, Zhengping Thanks, Zhengping On Tue, Feb 28, 2023 at 6:11 PM Zhengping Jiang <jiangzp@google.com> wrote: > > Hi Luiz, > > Thanks for testing these options! > > > perhaps we need a dedicated flag to ensure cmd_sync cannot be schedule after a certain point > This actually sounds promising to me. I would think about this. > > This does not happen in regular use, but one of our customers has a > script to run a stress test by turning on/off the adapter and > rebooting for a few cycles. Then the crash can be reproduced. If you > have any new ideas, I can schedule a test. > > Just to confirm if you will submit the current patch or you would hold > it for a solid solution? The current patch to clear the cmd_sync list > indeed reduces the crash frequency. > > Best, > Zhengping > > On Tue, Feb 28, 2023 at 5:53 PM Luiz Augusto von Dentz > <luiz.dentz@gmail.com> wrote: > > > > Hi Zhengping, > > > > On Tue, Feb 28, 2023 at 4:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > Hi Luiz, > > > > > > This looks good to me. I still have a question. Does this prevent a > > > job scheduled between "hci_cmd_sync_work_list_clear(hdev);" and > > > "__mgmt_power_off(hdev);"? Otherwise, the chance for a race condition > > > is still there. Maybe using cancel_work_sync and re-init the workqueue > > > timer is the right thing to do? > > > > I tried the cancel_work_sync but it doesn't work since to > > cmd_sync_work itself can call hci_dev_close_sync so it deadlocks, Ive > > also tried stopping new scheduling of new work based on HCI_UP flag > > but that causes some tests not to run, perhaps we need a dedicated > > flag to ensure cmd_sync cannot be schedule after a certain point but I > > could found the exact point it is, anyway I fine leaving this to when > > we actually have a more clear understanding or a reproducer. > > > > > Thanks, > > > Zhengping > > > > > > On Tue, Feb 28, 2023 at 1:11 PM Luiz Augusto von Dentz > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > Hi Zhengping, > > > > > > > > On Mon, Feb 27, 2023 at 3:58 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > Hi Luiz, > > > > > > > > > > Sure. Hope this helps. > > > > > Here is one log from the user. > > > > > > > > > > [ 53.368740] ================================================================== > > > > > [ 53.376167] BUG: KASAN: use-after-free in set_ssp_sync+0x44/0x154 [bluetooth] > > > > > [ 53.384303] Read of size 8 at addr ffffff80b7ee0318 by task kworker/u17:0/190 > > > > > [ 53.396342] CPU: 7 PID: 190 Comm: kworker/u17:0 Tainted: G W > > > > > 5.15.59-lockdep #1 29eed131ef0afd42bc369a6a0ca1c69d3653699b > > > > > [ 53.408868] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD > > > > > platform (rev5+) (DT) > > > > > [ 53.417095] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > > > > [ 53.422780] Call trace: > > > > > [ 53.425310] dump_backtrace+0x0/0x424 > > > > > [ 53.429108] show_stack+0x20/0x2c > > > > > > > > > > [ 53.432534] dump_stack_lvl+0x84/0xb4 > > > > > [ 53.436514] print_address_description+0x30/0x2fc > > > > > [ 53.441369] kasan_report+0x15c/0x19c > > > > > [ 53.445975] __asan_report_load8_noabort+0x44/0x50 > > > > > [ 53.450910] set_ssp_sync+0x44/0x154 [bluetooth > > > > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > > > > [ 53.460136] hci_cmd_sync_work+0x1c8/0x2c8 [bluetooth > > > > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > > > > [ 53.472214] process_one_work+0x59c/0xa88 > > > > > [ 53.476990] worker_thread+0x81c/0xd18 > > > > > [ 53.480854] kthread+0x2d4/0x3d8 > > > > > [ 53.484272] ret_from_fork+0x10/0x20 > > > > > > > > > > [ 53.489733] Allocated by task 1162: > > > > > [ 53.493336] kasan_save_stack+0x38/0x68 > > > > > [ 53.498115] __kasan_kmalloc+0xb4/0xd0 > > > > > [ 53.501993] kmem_cache_alloc_trace+0x29c/0x374 > > > > > [ 53.506661] mgmt_pending_new+0x74/0x200 [bluetooth] > > > > > [ 53.511905] mgmt_pending_add+0x28/0xec [bluetooth] > > > > > [ 53.517059] set_ssp+0x2d8/0x5b0 [bluetooth] > > > > > [ 53.521575] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > > > > [ 53.526538] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > > > > [ 53.531850] sock_sendmsg+0xb4/0xd8 > > > > > [ 53.535454] sock_write_iter+0x1c0/0x2d0 > > > > > [ 53.539494] do_iter_readv_writev+0x350/0x4e0 > > > > > [ 53.543980] do_iter_write+0xf0/0x2e4 > > > > > [ 53.547747] vfs_writev+0xd0/0x13c > > > > > [ 53.551254] do_writev+0xe8/0x1fc > > > > > [ 53.554672] __arm64_sys_writev+0x84/0x98 > > > > > [ 53.558805] invoke_syscall+0x78/0x20c > > > > > [ 53.562665] el0_svc_common+0x12c/0x2f0 > > > > > [ 53.566618] do_el0_svc+0x94/0x13c > > > > > [ 53.570125] el0_svc+0x5c/0x108 > > > > > [ 53.573374] el0t_64_sync_handler+0x78/0x108 > > > > > [ 53.577773] el0t_64_sync+0x1a4/0x1a8 > > > > > > > > > > [ 53.583089] Freed by task 3207: > > > > > [ 53.586325] kasan_save_stack+0x38/0x68 > > > > > [ 53.590282] kasan_set_track+0x28/0x3c > > > > > [ 53.594153] kasan_set_free_info+0x28/0x4c > > > > > [ 53.598369] ____kasan_slab_free+0x138/0x17c > > > > > [ 53.602767] __kasan_slab_free+0x18/0x28 > > > > > [ 53.606803] slab_free_freelist_hook+0x188/0x260 > > > > > [ 53.611559] kfree+0x138/0x29c > > > > > [ 53.614708] mgmt_pending_free+0xac/0xdc [bluetooth] > > > > > [ 53.619948] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > > > > [ 53.625357] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > > > > [ 53.630586] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > > > > [ 53.636076] __mgmt_power_off+0x114/0x26c [bluetooth] > > > > > [ 53.641390] hci_dev_close_sync+0x314/0x814 [bluetooth] > > > > > [ 53.646882] hci_dev_do_close+0x3c/0x7c [bluetooth] > > > > > [ 53.652017] hci_dev_close+0xa4/0x15c [bluetooth] > > > > > [ 53.656980] hci_sock_ioctl+0x298/0x444 [bluetooth] > > > > > [ 53.662117] sock_do_ioctl+0xd0/0x1e8 > > > > > [ 53.665900] sock_ioctl+0x4fc/0x72c > > > > > [ 53.669500] __arm64_sys_ioctl+0x118/0x154 > > > > > [ 53.673726] invoke_syscall+0x78/0x20c > > > > > [ 53.677587] el0_svc_common+0x12c/0x2f0 > > > > > [ 53.681533] do_el0_svc+0x94/0x13c > > > > > [ 53.685043] el0_svc+0x5c/0x108 > > > > > [ 53.688278] el0t_64_sync_handler+0x78/0x108 > > > > > [ 53.692677] el0t_64_sync+0x1a4/0x1a8 > > > > > > > > > > [ 53.697988] Last potentially related work creation: > > > > > [ 53.703009] kasan_save_stack+0x38/0x68 > > > > > [ 53.706962] kasan_record_aux_stack+0x104/0x130 > > > > > [ 53.711622] __call_rcu+0x14c/0x860 > > > > > [ 53.715212] call_rcu+0x18/0x24 > > > > > [ 53.718448] sk_filter_uncharge+0xc0/0x120 > > > > > [ 53.722667] __sk_destruct+0xb4/0x4a8 > > > > > [ 53.726435] sk_destruct+0x78/0xa0 > > > > > [ 53.729941] __sk_free+0x190/0x270 > > > > > [ 53.733453] sk_free+0x54/0x8c > > > > > [ 53.736603] deferred_put_nlk_sk+0x1d4/0x20c > > > > > [ 53.741000] rcu_do_batch+0x3e8/0xd08 > > > > > [ 53.744772] nocb_cb_wait+0xc8/0xa3c > > > > > [ 53.748453] rcu_nocb_cb_kthread+0x48/0x134 > > > > > [ 53.752768] kthread+0x2d4/0x3d8 > > > > > [ 53.756098] ret_from_fork+0x10/0x20 > > > > > > > > > > This is another one at a different function but with the same signature. > > > > > > > > > > [ 43.363512] ================================================================== > > > > > [ 43.370966] BUG: KASAN: use-after-free in > > > > > mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth] > > > > > [ 43.379813] Read of size 8 at addr ffffff8096c28d18 by task kworker/u17:0/192 > > > > > [ 43.387158] > > > > > [ 43.388705] CPU: 6 PID: 192 Comm: kworker/u17:0 Tainted: G W > > > > > 5.15.59-lockdep #1 59f35e3dfc07f6688b084869895c7a39892c891a > > > > > localhost ~ # [ 43.410184] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > > > > > > > > > [ 43.418887] Call trace: > > > > > [ 43.422407] dump_backtrace+0x0/0x424 > > > > > [ 43.426191] show_stack+0x20/0x2c > > > > > [ 43.429608] dump_stack_lvl+0x84/0xb4 > > > > > [ 43.433395] print_address_description+0x30/0x2fc > > > > > [ 43.438243] kasan_report+0x15c/0x19c > > > > > [ 43.442070] __asan_report_load8_noabort+0x44/0x50 > > > > > hciconfig hci0 up > > > > > [ 43.447009] mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth > > > > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > > > > [ 43.458568] hci_cmd_sync_work+0x1bc/0x2bc [bluetooth > > > > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > > > > [ 43.467656] process_one_work+0x59c/0xa88 > > > > > [ 43.472530] worker_thread+0x81c/0xd18 > > > > > [ 43.476410] kthread+0x2d4/0x3d8 > > > > > localhost ~ # [ 43.479753] ret_from_fork+0x10/0x20 > > > > > [ 43.486588] > > > > > [ 43.488156] Allocated by task 1118: > > > > > [ 43.491751] kasan_save_stack+0x38/0x68 > > > > > [ 43.495709] __kasan_kmalloc+0xb4/0xd0 > > > > > [ 43.499577] kmem_cache_alloc_trace+0x29c/0x374 > > > > > [ 43.504238] mgmt_pending_new+0x74/0x200 [bluetooth] > > > > > sleep 2[ 43.509509] mgmt_pending_add+0x28/0xec [bluetooth] > > > > > > > > > > [ 43.515244] remove_adv_monitor+0xf8/0x174 [bluetooth] > > > > > [ 43.521533] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > > > > [ 43.526527] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > > > > [ 43.531873] sock_sendmsg+0xb4/0xd8 > > > > > [ 43.535472] sock_write_iter+0x1c0/0x2d0 > > > > > [ 43.539519] do_iter_readv_writev+0x350/0x4e0 > > > > > [ 43.544012] do_iter_write+0xf0/0x2e4 > > > > > [ 43.547788] vfs_writev+0xd0/0x13c > > > > > [ 43.551295] do_writev+0xe8/0x1fc > > > > > [ 43.554710] __arm64_sys_writev+0x84/0x98 > > > > > [ 43.558838] invoke_syscall+0x78/0x20c > > > > > [ 43.562709] el0_svc_common+0x12c/0x2f0 > > > > > [ 43.566654] do_el0_svc+0x94/0x13c > > > > > [ 43.570155] el0_svc+0x5c/0x108 > > > > > [ 43.573391] el0t_64_sync_handler+0x78/0x108 > > > > > [ 43.577785] el0t_64_sync+0x1a4/0x1a8 > > > > > [ 43.581564] > > > > > [ 43.583115] Freed by task 3217: > > > > > [ 43.586356] kasan_save_stack+0x38/0x68 > > > > > [ 43.590314] kasan_set_track+0x28/0x3c > > > > > [ 43.594180] kasan_set_free_info+0x28/0x4c > > > > > [ 43.598396] ____kasan_slab_free+0x138/0x17c > > > > > [ 43.602794] __kasan_slab_free+0x18/0x28 > > > > > [ 43.606838] slab_free_freelist_hook+0x188/0x260 > > > > > [ 43.611591] kfree+0x138/0x29c > > > > > [ 43.614741] mgmt_pending_free+0xac/0xdc [bluetooth] > > > > > [ 43.620003] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > > > > [ 43.625434] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > > > > [ 43.630686] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > > > > [ 43.636198] __mgmt_power_off+0x114/0x26c [bluetooth] > > > > > [ 43.641532] hci_dev_close_sync+0x2ec/0x7ec [bluetooth] > > > > > [ 43.647049] hci_dev_do_close+0x3c/0x7c [bluetooth] > > > > > [ 43.652209] hci_dev_close+0xac/0x164 [bluetooth] > > > > > [ 43.657190] hci_sock_ioctl+0x298/0x444 [bluetooth] > > > > > [ 43.662353] sock_do_ioctl+0xd0/0x1e8 > > > > > [ 43.666134] sock_ioctl+0x4fc/0x72c > > > > > [ 43.669736] __arm64_sys_ioctl+0x118/0x154 > > > > > [ 43.673961] invoke_syscall+0x78/0x20c > > > > > [ 43.677820] el0_svc_common+0x12c/0x2f0 > > > > > [ 43.681770] do_el0_svc+0x94/0x13c > > > > > [ 43.685278] el0_svc+0x5c/0x108 > > > > > [ 43.688514] el0t_64_sync_handler+0x78/0x108 > > > > > [ 43.692913] el0t_64_sync+0x1a4/0x1a8 > > > > > > > > > > Thanks, > > > > > Zhengping > > > > > > > > Ok, how about we do something like the following: > > > > > > > > https://gist.github.com/Vudentz/365d664275e4d2e2af157e47f0502f50 > > > > > > > > The actual real culprit seem to be __mgmt_power_off does cleanup > > > > mgmt_pending but that is still accessible via cmd_sync_work_list, this > > > > is probably how hci_request was designed but in case of cmd_sync we > > > > normally have the data as part of cmd_sync_work_list. > > > > > > > > > On Mon, Feb 27, 2023 at 3:41 PM Luiz Augusto von Dentz > > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > > > Hi Zhengping, > > > > > > > > > > > > On Sun, Feb 26, 2023 at 11:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > Hi Luiz, > > > > > > > > > > > > > > I have a question. Given that each command in the cmd_sync queue > > > > > > > should clean up the memory in a callback function. I was wondering if > > > > > > > the call to cmd_complete_rsp in __mgmt_power_off function is still > > > > > > > necessary? Will this always risk a race condition that cmd has been > > > > > > > released when the complete callback or _sync function is run? > > > > > > > > > > > > Not sure I follow you here, do you have a stack trace when the user > > > > > > after free occurs? > > > > > > > > > > > > > Thanks, > > > > > > > Zhengping > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 2:37 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > > > Hi Luiz, > > > > > > > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > > > > instead? > > > > > > > > > > > > > > > > That is a good question and we used hci_cmd_sync_clear in the first > > > > > > > > version, but it will clear the queue and also close the timer. As a > > > > > > > > result, when the adapter is turned on again, the timer will not > > > > > > > > schedule any new jobs. So the option is to use hci_cmd_sync_clear and > > > > > > > > re-initiate the queue or to write a new function which only clears the > > > > > > > > queue. > > > > > > > > > > > > > > > > > We also may want to move the clearing logic to > > > > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > > > > hci_request_cancel_all. > > > > > > > > > > > > > > > > I actually have a question here. I saw > > > > > > > > "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought > > > > > > > > it should force clearing the cmd_sync queue. But it seems cannot > > > > > > > > prevent the use-after-free situation. > > > > > > > > > > > > > > > > Any suggestions to improve the solution? > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Zhengping > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz > > > > > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > > > > > > > > > Hi Zhengping, > > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > > > > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > > > > > > > > > racing conditions which cause use-after-free. > > > > > > > > > > > > > > > > > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > > > > > > > > > work is queued in the cmd_sync_work queue at the same time, it will > > > > > > > > > > cause the risk of use-after-free, as the cmd pointer is not checked > > > > > > > > > > before use. > > > > > > > > > > > > > > > > > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > > > > > > > > > --- > > > > > > > > > > > > > > > > > > > > Changes in v2: > > > > > > > > > > - Add function to clear the queue without stop the timer > > > > > > > > > > > > > > > > > > > > Changes in v1: > > > > > > > > > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > > > > > > > > > > > > > > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > > > > > > > > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > > > > > > > > > index 117eedb6f709..b70365dfff0c 100644 > > > > > > > > > > --- a/net/bluetooth/hci_sync.c > > > > > > > > > > +++ b/net/bluetooth/hci_sync.c > > > > > > > > > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > > > > > > > > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > > > > +{ > > > > > > > > > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > > > > + > > > > > > > > > > + mutex_lock(&hdev->cmd_sync_work_lock); > > > > > > > > > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > > > > > > > > > + if (entry->destroy) { > > > > > > > > > > + hci_req_sync_lock(hdev); > > > > > > > > > > + entry->destroy(hdev, entry->data, -ECANCELED); > > > > > > > > > > + hci_req_sync_unlock(hdev); > > > > > > > > > > + } > > > > > > > > > > + list_del(&entry->list); > > > > > > > > > > + kfree(entry); > > > > > > > > > > + } > > > > > > > > > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > > > > > > > > > +} > > > > > > > > > > + > > > > > > > > > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > > > > { > > > > > > > > > > struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > > > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > > > > > > > > > > > > > > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > > > > > > > > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > > > > > > > > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > > > > > > > > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > > > > > > > > > + hci_pend_cmd_sync_clear(hdev); > > > > > > > > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > > > > instead? We also may want to move the clearing logic to > > > > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > > > > hci_request_cancel_all. > > > > > > > > > > > > > > > > > > > __mgmt_power_off(hdev); > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > > > hci_inquiry_cache_flush(hdev); > > > > > > > > > > hci_pend_le_actions_clear(hdev); > > > > > > > > > > -- > > > > > > > > > > 2.39.2.722.g9855ee24e9-goog > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > Luiz Augusto von Dentz > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Luiz Augusto von Dentz > > > > > > > > > > > > > > > > -- > > > > Luiz Augusto von Dentz > > > > > > > > -- > > Luiz Augusto von Dentz
Hi Zhengping, On Tue, May 9, 2023 at 4:13 PM Zhengping Jiang <jiangzp@google.com> wrote: > > Hi Luiz, > > I have revisited this issue and asked our partner to do some more > tests. I think the root cause is the use of hdev->mgmt_pending. There > are some mgmt_pending_cmd, which would be released at callbacks. For > example: > > > err = hci_cmd_sync_queue(hdev, mgmt_remove_adv_monitor_sync, cmd, mgmt_remove_adv_monitor_complete); > > mgmt_remove_adv_monitor_complete will release the cmd memory by > calling mgmt_pending_remove(cmd). > > In this case, the call to mgmt_pending_foreach(0, hdev, > cmd_complete_rsp, &status) at __mgmt_power_off will double free the > memory at a race condition. > > A quick solution is to detect and skip some opcode at cmd_complete_rsp > if a command is consistently released by callback. There should be some means to remove from the list when cleanup is done, or perhaps this is because of the order we are doing the cleanup? Either way it would be great to have a test add to mgmt-tester to try to reproduce so we make sure we don't reintroduce the same behavior in the future. > > static void cmd_complete_rsp(struct mgmt_pending_cmd *cmd, void *data) > > { > > if (cmd->opcode == MGMT_OP_REMOVE_ADV_MONITOR || > > cmd->opcode == MGMT_OP_SET_SSP) > > return; > > To fully remove the race condition, maybe using two lists is > necessary. What do you think about this proposal? > > Thanks, > Zhengping > > Thanks, > Zhengping > > On Tue, Feb 28, 2023 at 6:11 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > Hi Luiz, > > > > Thanks for testing these options! > > > > > perhaps we need a dedicated flag to ensure cmd_sync cannot be schedule after a certain point > > This actually sounds promising to me. I would think about this. > > > > This does not happen in regular use, but one of our customers has a > > script to run a stress test by turning on/off the adapter and > > rebooting for a few cycles. Then the crash can be reproduced. If you > > have any new ideas, I can schedule a test. > > > > Just to confirm if you will submit the current patch or you would hold > > it for a solid solution? The current patch to clear the cmd_sync list > > indeed reduces the crash frequency. > > > > Best, > > Zhengping > > > > On Tue, Feb 28, 2023 at 5:53 PM Luiz Augusto von Dentz > > <luiz.dentz@gmail.com> wrote: > > > > > > Hi Zhengping, > > > > > > On Tue, Feb 28, 2023 at 4:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > Hi Luiz, > > > > > > > > This looks good to me. I still have a question. Does this prevent a > > > > job scheduled between "hci_cmd_sync_work_list_clear(hdev);" and > > > > "__mgmt_power_off(hdev);"? Otherwise, the chance for a race condition > > > > is still there. Maybe using cancel_work_sync and re-init the workqueue > > > > timer is the right thing to do? > > > > > > I tried the cancel_work_sync but it doesn't work since to > > > cmd_sync_work itself can call hci_dev_close_sync so it deadlocks, Ive > > > also tried stopping new scheduling of new work based on HCI_UP flag > > > but that causes some tests not to run, perhaps we need a dedicated > > > flag to ensure cmd_sync cannot be schedule after a certain point but I > > > could found the exact point it is, anyway I fine leaving this to when > > > we actually have a more clear understanding or a reproducer. > > > > > > > Thanks, > > > > Zhengping > > > > > > > > On Tue, Feb 28, 2023 at 1:11 PM Luiz Augusto von Dentz > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > Hi Zhengping, > > > > > > > > > > On Mon, Feb 27, 2023 at 3:58 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > Hi Luiz, > > > > > > > > > > > > Sure. Hope this helps. > > > > > > Here is one log from the user. > > > > > > > > > > > > [ 53.368740] ================================================================== > > > > > > [ 53.376167] BUG: KASAN: use-after-free in set_ssp_sync+0x44/0x154 [bluetooth] > > > > > > [ 53.384303] Read of size 8 at addr ffffff80b7ee0318 by task kworker/u17:0/190 > > > > > > [ 53.396342] CPU: 7 PID: 190 Comm: kworker/u17:0 Tainted: G W > > > > > > 5.15.59-lockdep #1 29eed131ef0afd42bc369a6a0ca1c69d3653699b > > > > > > [ 53.408868] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD > > > > > > platform (rev5+) (DT) > > > > > > [ 53.417095] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > > > > > [ 53.422780] Call trace: > > > > > > [ 53.425310] dump_backtrace+0x0/0x424 > > > > > > [ 53.429108] show_stack+0x20/0x2c > > > > > > > > > > > > [ 53.432534] dump_stack_lvl+0x84/0xb4 > > > > > > [ 53.436514] print_address_description+0x30/0x2fc > > > > > > [ 53.441369] kasan_report+0x15c/0x19c > > > > > > [ 53.445975] __asan_report_load8_noabort+0x44/0x50 > > > > > > [ 53.450910] set_ssp_sync+0x44/0x154 [bluetooth > > > > > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > > > > > [ 53.460136] hci_cmd_sync_work+0x1c8/0x2c8 [bluetooth > > > > > > 34f6fa2bbf49f3d7faf6ea04e8755ae16590a6b3] > > > > > > [ 53.472214] process_one_work+0x59c/0xa88 > > > > > > [ 53.476990] worker_thread+0x81c/0xd18 > > > > > > [ 53.480854] kthread+0x2d4/0x3d8 > > > > > > [ 53.484272] ret_from_fork+0x10/0x20 > > > > > > > > > > > > [ 53.489733] Allocated by task 1162: > > > > > > [ 53.493336] kasan_save_stack+0x38/0x68 > > > > > > [ 53.498115] __kasan_kmalloc+0xb4/0xd0 > > > > > > [ 53.501993] kmem_cache_alloc_trace+0x29c/0x374 > > > > > > [ 53.506661] mgmt_pending_new+0x74/0x200 [bluetooth] > > > > > > [ 53.511905] mgmt_pending_add+0x28/0xec [bluetooth] > > > > > > [ 53.517059] set_ssp+0x2d8/0x5b0 [bluetooth] > > > > > > [ 53.521575] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > > > > > [ 53.526538] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > > > > > [ 53.531850] sock_sendmsg+0xb4/0xd8 > > > > > > [ 53.535454] sock_write_iter+0x1c0/0x2d0 > > > > > > [ 53.539494] do_iter_readv_writev+0x350/0x4e0 > > > > > > [ 53.543980] do_iter_write+0xf0/0x2e4 > > > > > > [ 53.547747] vfs_writev+0xd0/0x13c > > > > > > [ 53.551254] do_writev+0xe8/0x1fc > > > > > > [ 53.554672] __arm64_sys_writev+0x84/0x98 > > > > > > [ 53.558805] invoke_syscall+0x78/0x20c > > > > > > [ 53.562665] el0_svc_common+0x12c/0x2f0 > > > > > > [ 53.566618] do_el0_svc+0x94/0x13c > > > > > > [ 53.570125] el0_svc+0x5c/0x108 > > > > > > [ 53.573374] el0t_64_sync_handler+0x78/0x108 > > > > > > [ 53.577773] el0t_64_sync+0x1a4/0x1a8 > > > > > > > > > > > > [ 53.583089] Freed by task 3207: > > > > > > [ 53.586325] kasan_save_stack+0x38/0x68 > > > > > > [ 53.590282] kasan_set_track+0x28/0x3c > > > > > > [ 53.594153] kasan_set_free_info+0x28/0x4c > > > > > > [ 53.598369] ____kasan_slab_free+0x138/0x17c > > > > > > [ 53.602767] __kasan_slab_free+0x18/0x28 > > > > > > [ 53.606803] slab_free_freelist_hook+0x188/0x260 > > > > > > [ 53.611559] kfree+0x138/0x29c > > > > > > [ 53.614708] mgmt_pending_free+0xac/0xdc [bluetooth] > > > > > > [ 53.619948] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > > > > > [ 53.625357] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > > > > > [ 53.630586] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > > > > > [ 53.636076] __mgmt_power_off+0x114/0x26c [bluetooth] > > > > > > [ 53.641390] hci_dev_close_sync+0x314/0x814 [bluetooth] > > > > > > [ 53.646882] hci_dev_do_close+0x3c/0x7c [bluetooth] > > > > > > [ 53.652017] hci_dev_close+0xa4/0x15c [bluetooth] > > > > > > [ 53.656980] hci_sock_ioctl+0x298/0x444 [bluetooth] > > > > > > [ 53.662117] sock_do_ioctl+0xd0/0x1e8 > > > > > > [ 53.665900] sock_ioctl+0x4fc/0x72c > > > > > > [ 53.669500] __arm64_sys_ioctl+0x118/0x154 > > > > > > [ 53.673726] invoke_syscall+0x78/0x20c > > > > > > [ 53.677587] el0_svc_common+0x12c/0x2f0 > > > > > > [ 53.681533] do_el0_svc+0x94/0x13c > > > > > > [ 53.685043] el0_svc+0x5c/0x108 > > > > > > [ 53.688278] el0t_64_sync_handler+0x78/0x108 > > > > > > [ 53.692677] el0t_64_sync+0x1a4/0x1a8 > > > > > > > > > > > > [ 53.697988] Last potentially related work creation: > > > > > > [ 53.703009] kasan_save_stack+0x38/0x68 > > > > > > [ 53.706962] kasan_record_aux_stack+0x104/0x130 > > > > > > [ 53.711622] __call_rcu+0x14c/0x860 > > > > > > [ 53.715212] call_rcu+0x18/0x24 > > > > > > [ 53.718448] sk_filter_uncharge+0xc0/0x120 > > > > > > [ 53.722667] __sk_destruct+0xb4/0x4a8 > > > > > > [ 53.726435] sk_destruct+0x78/0xa0 > > > > > > [ 53.729941] __sk_free+0x190/0x270 > > > > > > [ 53.733453] sk_free+0x54/0x8c > > > > > > [ 53.736603] deferred_put_nlk_sk+0x1d4/0x20c > > > > > > [ 53.741000] rcu_do_batch+0x3e8/0xd08 > > > > > > [ 53.744772] nocb_cb_wait+0xc8/0xa3c > > > > > > [ 53.748453] rcu_nocb_cb_kthread+0x48/0x134 > > > > > > [ 53.752768] kthread+0x2d4/0x3d8 > > > > > > [ 53.756098] ret_from_fork+0x10/0x20 > > > > > > > > > > > > This is another one at a different function but with the same signature. > > > > > > > > > > > > [ 43.363512] ================================================================== > > > > > > [ 43.370966] BUG: KASAN: use-after-free in > > > > > > mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth] > > > > > > [ 43.379813] Read of size 8 at addr ffffff8096c28d18 by task kworker/u17:0/192 > > > > > > [ 43.387158] > > > > > > [ 43.388705] CPU: 6 PID: 192 Comm: kworker/u17:0 Tainted: G W > > > > > > 5.15.59-lockdep #1 59f35e3dfc07f6688b084869895c7a39892c891a > > > > > > localhost ~ # [ 43.410184] Workqueue: hci0 hci_cmd_sync_work [bluetooth] > > > > > > > > > > > > [ 43.418887] Call trace: > > > > > > [ 43.422407] dump_backtrace+0x0/0x424 > > > > > > [ 43.426191] show_stack+0x20/0x2c > > > > > > [ 43.429608] dump_stack_lvl+0x84/0xb4 > > > > > > [ 43.433395] print_address_description+0x30/0x2fc > > > > > > [ 43.438243] kasan_report+0x15c/0x19c > > > > > > [ 43.442070] __asan_report_load8_noabort+0x44/0x50 > > > > > > hciconfig hci0 up > > > > > > [ 43.447009] mgmt_remove_adv_monitor_sync+0x40/0xcc [bluetooth > > > > > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > > > > > [ 43.458568] hci_cmd_sync_work+0x1bc/0x2bc [bluetooth > > > > > > 8dae3a82177133cfa9626e7322b3b0c8f665102d] > > > > > > [ 43.467656] process_one_work+0x59c/0xa88 > > > > > > [ 43.472530] worker_thread+0x81c/0xd18 > > > > > > [ 43.476410] kthread+0x2d4/0x3d8 > > > > > > localhost ~ # [ 43.479753] ret_from_fork+0x10/0x20 > > > > > > [ 43.486588] > > > > > > [ 43.488156] Allocated by task 1118: > > > > > > [ 43.491751] kasan_save_stack+0x38/0x68 > > > > > > [ 43.495709] __kasan_kmalloc+0xb4/0xd0 > > > > > > [ 43.499577] kmem_cache_alloc_trace+0x29c/0x374 > > > > > > [ 43.504238] mgmt_pending_new+0x74/0x200 [bluetooth] > > > > > > sleep 2[ 43.509509] mgmt_pending_add+0x28/0xec [bluetooth] > > > > > > > > > > > > [ 43.515244] remove_adv_monitor+0xf8/0x174 [bluetooth] > > > > > > [ 43.521533] hci_mgmt_cmd+0x5c4/0x8b0 [bluetooth] > > > > > > [ 43.526527] hci_sock_sendmsg+0x28c/0x95c [bluetooth] > > > > > > [ 43.531873] sock_sendmsg+0xb4/0xd8 > > > > > > [ 43.535472] sock_write_iter+0x1c0/0x2d0 > > > > > > [ 43.539519] do_iter_readv_writev+0x350/0x4e0 > > > > > > [ 43.544012] do_iter_write+0xf0/0x2e4 > > > > > > [ 43.547788] vfs_writev+0xd0/0x13c > > > > > > [ 43.551295] do_writev+0xe8/0x1fc > > > > > > [ 43.554710] __arm64_sys_writev+0x84/0x98 > > > > > > [ 43.558838] invoke_syscall+0x78/0x20c > > > > > > [ 43.562709] el0_svc_common+0x12c/0x2f0 > > > > > > [ 43.566654] do_el0_svc+0x94/0x13c > > > > > > [ 43.570155] el0_svc+0x5c/0x108 > > > > > > [ 43.573391] el0t_64_sync_handler+0x78/0x108 > > > > > > [ 43.577785] el0t_64_sync+0x1a4/0x1a8 > > > > > > [ 43.581564] > > > > > > [ 43.583115] Freed by task 3217: > > > > > > [ 43.586356] kasan_save_stack+0x38/0x68 > > > > > > [ 43.590314] kasan_set_track+0x28/0x3c > > > > > > [ 43.594180] kasan_set_free_info+0x28/0x4c > > > > > > [ 43.598396] ____kasan_slab_free+0x138/0x17c > > > > > > [ 43.602794] __kasan_slab_free+0x18/0x28 > > > > > > [ 43.606838] slab_free_freelist_hook+0x188/0x260 > > > > > > [ 43.611591] kfree+0x138/0x29c > > > > > > [ 43.614741] mgmt_pending_free+0xac/0xdc [bluetooth] > > > > > > [ 43.620003] mgmt_pending_remove+0xd8/0xf0 [bluetooth] > > > > > > [ 43.625434] cmd_complete_rsp+0xc8/0x178 [bluetooth] > > > > > > [ 43.630686] mgmt_pending_foreach+0xa8/0xf8 [bluetooth] > > > > > > [ 43.636198] __mgmt_power_off+0x114/0x26c [bluetooth] > > > > > > [ 43.641532] hci_dev_close_sync+0x2ec/0x7ec [bluetooth] > > > > > > [ 43.647049] hci_dev_do_close+0x3c/0x7c [bluetooth] > > > > > > [ 43.652209] hci_dev_close+0xac/0x164 [bluetooth] > > > > > > [ 43.657190] hci_sock_ioctl+0x298/0x444 [bluetooth] > > > > > > [ 43.662353] sock_do_ioctl+0xd0/0x1e8 > > > > > > [ 43.666134] sock_ioctl+0x4fc/0x72c > > > > > > [ 43.669736] __arm64_sys_ioctl+0x118/0x154 > > > > > > [ 43.673961] invoke_syscall+0x78/0x20c > > > > > > [ 43.677820] el0_svc_common+0x12c/0x2f0 > > > > > > [ 43.681770] do_el0_svc+0x94/0x13c > > > > > > [ 43.685278] el0_svc+0x5c/0x108 > > > > > > [ 43.688514] el0t_64_sync_handler+0x78/0x108 > > > > > > [ 43.692913] el0t_64_sync+0x1a4/0x1a8 > > > > > > > > > > > > Thanks, > > > > > > Zhengping > > > > > > > > > > Ok, how about we do something like the following: > > > > > > > > > > https://gist.github.com/Vudentz/365d664275e4d2e2af157e47f0502f50 > > > > > > > > > > The actual real culprit seem to be __mgmt_power_off does cleanup > > > > > mgmt_pending but that is still accessible via cmd_sync_work_list, this > > > > > is probably how hci_request was designed but in case of cmd_sync we > > > > > normally have the data as part of cmd_sync_work_list. > > > > > > > > > > > On Mon, Feb 27, 2023 at 3:41 PM Luiz Augusto von Dentz > > > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > > > > > Hi Zhengping, > > > > > > > > > > > > > > On Sun, Feb 26, 2023 at 11:18 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > > > Hi Luiz, > > > > > > > > > > > > > > > > I have a question. Given that each command in the cmd_sync queue > > > > > > > > should clean up the memory in a callback function. I was wondering if > > > > > > > > the call to cmd_complete_rsp in __mgmt_power_off function is still > > > > > > > > necessary? Will this always risk a race condition that cmd has been > > > > > > > > released when the complete callback or _sync function is run? > > > > > > > > > > > > > > Not sure I follow you here, do you have a stack trace when the user > > > > > > > after free occurs? > > > > > > > > > > > > > > > Thanks, > > > > > > > > Zhengping > > > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 2:37 PM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > > > > > Hi Luiz, > > > > > > > > > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > > > > > instead? > > > > > > > > > > > > > > > > > > That is a good question and we used hci_cmd_sync_clear in the first > > > > > > > > > version, but it will clear the queue and also close the timer. As a > > > > > > > > > result, when the adapter is turned on again, the timer will not > > > > > > > > > schedule any new jobs. So the option is to use hci_cmd_sync_clear and > > > > > > > > > re-initiate the queue or to write a new function which only clears the > > > > > > > > > queue. > > > > > > > > > > > > > > > > > > > We also may want to move the clearing logic to > > > > > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > > > > > hci_request_cancel_all. > > > > > > > > > > > > > > > > > > I actually have a question here. I saw > > > > > > > > > "drain_workqueue(hdev->workqueue)" in hci_dev_close_sync and thought > > > > > > > > > it should force clearing the cmd_sync queue. But it seems cannot > > > > > > > > > prevent the use-after-free situation. > > > > > > > > > > > > > > > > > > Any suggestions to improve the solution? > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > Zhengping > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 1:02 PM Luiz Augusto von Dentz > > > > > > > > > <luiz.dentz@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > Hi Zhengping, > > > > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2023 at 11:53 AM Zhengping Jiang <jiangzp@google.com> wrote: > > > > > > > > > > > > > > > > > > > > > > Clear cmd_sync_work queue before clearing the mgmt cmd list to avoid > > > > > > > > > > > racing conditions which cause use-after-free. > > > > > > > > > > > > > > > > > > > > > > When powering off the adapter, the mgmt cmd list will be cleared. If a > > > > > > > > > > > work is queued in the cmd_sync_work queue at the same time, it will > > > > > > > > > > > cause the risk of use-after-free, as the cmd pointer is not checked > > > > > > > > > > > before use. > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Zhengping Jiang <jiangzp@google.com> > > > > > > > > > > > --- > > > > > > > > > > > > > > > > > > > > > > Changes in v2: > > > > > > > > > > > - Add function to clear the queue without stop the timer > > > > > > > > > > > > > > > > > > > > > > Changes in v1: > > > > > > > > > > > - Clear cmd_sync_work queue before clearing the mgmt cmd list > > > > > > > > > > > > > > > > > > > > > > net/bluetooth/hci_sync.c | 21 ++++++++++++++++++++- > > > > > > > > > > > 1 file changed, 20 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > > > > > > > > > > > index 117eedb6f709..b70365dfff0c 100644 > > > > > > > > > > > --- a/net/bluetooth/hci_sync.c > > > > > > > > > > > +++ b/net/bluetooth/hci_sync.c > > > > > > > > > > > @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) > > > > > > > > > > > INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > > > > > +{ > > > > > > > > > > > + struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > > > > > + > > > > > > > > > > > + mutex_lock(&hdev->cmd_sync_work_lock); > > > > > > > > > > > + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { > > > > > > > > > > > + if (entry->destroy) { > > > > > > > > > > > + hci_req_sync_lock(hdev); > > > > > > > > > > > + entry->destroy(hdev, entry->data, -ECANCELED); > > > > > > > > > > > + hci_req_sync_unlock(hdev); > > > > > > > > > > > + } > > > > > > > > > > > + list_del(&entry->list); > > > > > > > > > > > + kfree(entry); > > > > > > > > > > > + } > > > > > > > > > > > + mutex_unlock(&hdev->cmd_sync_work_lock); > > > > > > > > > > > +} > > > > > > > > > > > + > > > > > > > > > > > void hci_cmd_sync_clear(struct hci_dev *hdev) > > > > > > > > > > > { > > > > > > > > > > > struct hci_cmd_sync_work_entry *entry, *tmp; > > > > > > > > > > > @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) > > > > > > > > > > > > > > > > > > > > > > if (!auto_off && hdev->dev_type == HCI_PRIMARY && > > > > > > > > > > > !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && > > > > > > > > > > > - hci_dev_test_flag(hdev, HCI_MGMT)) > > > > > > > > > > > + hci_dev_test_flag(hdev, HCI_MGMT)) { > > > > > > > > > > > + hci_pend_cmd_sync_clear(hdev); > > > > > > > > > > > > > > > > > > > > Any particular reason why you are not using hci_cmd_sync_clear > > > > > > > > > > instead? We also may want to move the clearing logic to > > > > > > > > > > hci_dev_close_sync since it should be equivalent to > > > > > > > > > > hci_request_cancel_all. > > > > > > > > > > > > > > > > > > > > > __mgmt_power_off(hdev); > > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > > > > > hci_inquiry_cache_flush(hdev); > > > > > > > > > > > hci_pend_le_actions_clear(hdev); > > > > > > > > > > > -- > > > > > > > > > > > 2.39.2.722.g9855ee24e9-goog > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Luiz Augusto von Dentz > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Luiz Augusto von Dentz > > > > > > > > > > > > > > > > > > > > -- > > > > > Luiz Augusto von Dentz > > > > > > > > > > > > -- > > > Luiz Augusto von Dentz
diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index 117eedb6f709..b70365dfff0c 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -636,6 +636,23 @@ void hci_cmd_sync_init(struct hci_dev *hdev) INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); } +static void hci_pend_cmd_sync_clear(struct hci_dev *hdev) +{ + struct hci_cmd_sync_work_entry *entry, *tmp; + + mutex_lock(&hdev->cmd_sync_work_lock); + list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { + if (entry->destroy) { + hci_req_sync_lock(hdev); + entry->destroy(hdev, entry->data, -ECANCELED); + hci_req_sync_unlock(hdev); + } + list_del(&entry->list); + kfree(entry); + } + mutex_unlock(&hdev->cmd_sync_work_lock); +} + void hci_cmd_sync_clear(struct hci_dev *hdev) { struct hci_cmd_sync_work_entry *entry, *tmp; @@ -4842,8 +4859,10 @@ int hci_dev_close_sync(struct hci_dev *hdev) if (!auto_off && hdev->dev_type == HCI_PRIMARY && !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && - hci_dev_test_flag(hdev, HCI_MGMT)) + hci_dev_test_flag(hdev, HCI_MGMT)) { + hci_pend_cmd_sync_clear(hdev); __mgmt_power_off(hdev); + } hci_inquiry_cache_flush(hdev); hci_pend_le_actions_clear(hdev);