Message ID | 20230504123418.4438-1-dinghui@sangfor.com.cn |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp304643vqo; Thu, 4 May 2023 06:11:23 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7JYgs1UOxEbkWOxIhzAskQzh6lexlMrOoUwRe6dqzNsLp3+JLHOgB+zs1EZX5pxluN4vSw X-Received: by 2002:a17:903:503:b0:1a1:add5:c355 with SMTP id jn3-20020a170903050300b001a1add5c355mr3327692plb.5.1683205883617; Thu, 04 May 2023 06:11:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683205883; cv=none; d=google.com; s=arc-20160816; b=ex+EbEdZ8tSXvtSjJyp57yejIyVFasUE/BI86Q6Q+Ty45gd+lESHeEuql4HhDzawnQ jY8etye2GVI8u7AlVu7SgVZq+VRrYHZ+55LfVRTW3pQiD5GU8KZo4gp0CA8w7FL38L6+ EcZ8EmUISV8G4x68Lbkt/Qwi+Cc60kNKTN4qsqKgdKc4IXi4pbq+ub7kM7qg9wnQOEIt zzPU9P40w6aWQYgjoCNf1CMo+F0PrR+rI96+25VAggn4lNQ4iJ668UPeNJuNNbvoS3JM Jayc8EIMCiB+5oRuWQNTf+11vb2OBvNqumA//XnFaD0YD3mOgvI2lwW0nMK8p5aMsCA0 OW7A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:message-id:date:subject:cc:to:from; bh=gBJRJsUET0/+HfhTYHMvVxAlpli/OXV95jOJmFTgUPw=; b=ca/OKaaH0KHtfnOcNuK6YhpoY/T4VNo4peRgP2jiT4uRbdVSj4L1fHwjI4ZHRUBN9e Cd7RRszxby6wXzPbZqSyzuFRI6D6GNphDy3PwWwnY+8HMauSqTYxE//emVhcryKRr2yx /MTh/PzqaJ0ATWdVl5PYIqAylvchurSxUOV9XiQDj3w4Wfh8CZtA7X6eWj8bnO9DeH04 MJA9hvWNeFh3PRgmL1SHVs1dTbsnhhK0UafrLdt/+JG5thY9+UOsPkfSTuTwovhPQsmn kQbLJPu5c29FNuev9PMHo6DqkafBP3TTyx4OFHCdvTschk08lAsCH4TDl/QodjLNf2CI Jtlw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q17-20020a17090311d100b001a6f94a0854si36117065plh.324.2023.05.04.06.11.10; Thu, 04 May 2023 06:11:23 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230030AbjEDMnp (ORCPT <rfc822;b08248@gmail.com> + 99 others); Thu, 4 May 2023 08:43:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60700 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229959AbjEDMno (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 4 May 2023 08:43:44 -0400 Received: from mail-m127104.qiye.163.com (mail-m127104.qiye.163.com [115.236.127.104]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 71EB66185 for <linux-kernel@vger.kernel.org>; Thu, 4 May 2023 05:43:42 -0700 (PDT) Received: from localhost.localdomain (unknown [IPV6:240e:3b7:327f:5c30:f10d:e185:a86d:3be7]) by mail-m127104.qiye.163.com (Hmail) with ESMTPA id CD3D5A4050F; Thu, 4 May 2023 20:35:03 +0800 (CST) From: Ding Hui <dinghui@sangfor.com.cn> To: bhelgaas@google.com Cc: sathyanarayanan.kuppuswamy@linux.intel.com, vidyas@nvidia.com, david.e.box@linux.intel.com, kai.heng.feng@canonical.com, michael.a.bottini@linux.intel.com, rajatja@google.com, qinzongquan@sangfor.com.cn, dinghui@sangfor.com.cn, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] PCI/ASPM: fix UAF by disable ASPM for link when child function is removed Date: Thu, 4 May 2023 20:34:18 +0800 Message-Id: <20230504123418.4438-1-dinghui@sangfor.com.cn> X-Mailer: git-send-email 2.17.1 X-HM-Spam-Status: e1kfGhgUHx5ZQUpXWQgPGg8OCBgUHx5ZQUlOS1dZFg8aDwILHllBWSg2Ly tZV1koWUFITzdXWS1ZQUlXWQ8JGhUIEh9ZQVkZSkoaVh4dH05MTx0dTkIeGlUTARMWGhIXJBQOD1 lXWRgSC1lBWUlPSx5BSBlMQUhJTB1BThhIS0EdSksfQR5KQ05BGkNNH0FIGR5MWVdZFhoPEhUdFF lBWU9LSFVKSktISkNVSktLVUtZBg++ X-HM-Tid: 0a87e6c2455cb282kuuucd3d5a4050f X-HM-MType: 1 X-HM-Sender-Digest: e1kMHhlZQR0aFwgeV1kSHx4VD1lBWUc6Kxg6SBw6PT0cAT8yCFYYIVY1 NzEwCQhVSlVKTUNISUtITEtPTUpDVTMWGhIXVR8SFRwTDhI7CBoVHB0UCVUYFBZVGBVFWVdZEgtZ QVlJT0seQUgZTEFISUwdQU4YSEtBHUpLH0EeSkNOQRpDTR9BSBkeTFlXWQgBWUFIT0pNNwY+ X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1764969292397137413?= X-GMAIL-MSGID: =?utf-8?q?1764969292397137413?= |
Series |
PCI/ASPM: fix UAF by disable ASPM for link when child function is removed
|
|
Commit Message
Ding Hui
May 4, 2023, 12:34 p.m. UTC
If the Function 0 of a Multi-Function device is software removed,
a freed downstream pointer will be left in struct pcie_link_state,
and then when pcie_config_aspm_link() be invoked from any path,
we will trigger use-after-free.
Based on the PCIe spec about ASPM Control (PCIe r6.0, sec 7.5.3.7),
for Multi-Function Devices (including ARI Devices), it is recommended
that software program the same value in all Functions. For ARI
Devices, ASPM Control is determined solely by the setting in Function 0.
So we can just disable ASPM of the whole component if any child
function is removed, the downstream pointer will be avoided from
use-after-free, that will also avoid other potential corner cases.
Fixes: b5a0a9b59c81 ("PCI/ASPM: Read and set up L1 substate capabilities")
Debugged-by: Zongquan Qin <qinzongquan@sangfor.com.cn>
Suggestion-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
---
drivers/pci/pcie/aspm.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
Comments
Hi, On 5/4/23 5:34 AM, Ding Hui wrote: Maybe you can use the following title? "PCI/ASPM: Fix UAF by disabling ASPM for link when child function is removed > If the Function 0 of a Multi-Function device is software removed, > a freed downstream pointer will be left in struct pcie_link_state, > and then when pcie_config_aspm_link() be invoked from any path, > we will trigger use-after-free. > > Based on the PCIe spec about ASPM Control (PCIe r6.0, sec 7.5.3.7), As per PCIe spec r6.0, sec 7.5.3.7, it is recommended > for Multi-Function Devices (including ARI Devices), it is recommended > that software program the same value in all Functions. For ARI > Devices, ASPM Control is determined solely by the setting in Function 0. > > So we can just disable ASPM of the whole component if any child > function is removed, the downstream pointer will be avoided from > use-after-free, that will also avoid other potential corner cases. > > Fixes: b5a0a9b59c81 ("PCI/ASPM: Read and set up L1 substate capabilities") > Debugged-by: Zongquan Qin <qinzongquan@sangfor.com.cn> Any bugzilla link with error log and reproduction steps? > Suggestion-by: Bjorn Helgaas <bhelgaas@google.com> Suggested-by? > Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> > --- > drivers/pci/pcie/aspm.c | 15 +++++++-------- > 1 file changed, 7 insertions(+), 8 deletions(-) > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > index 66d7514ca111..1bf8306141aa 100644 > --- a/drivers/pci/pcie/aspm.c > +++ b/drivers/pci/pcie/aspm.c > @@ -1010,18 +1010,17 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) > > down_read(&pci_bus_sem); > mutex_lock(&aspm_lock); > - /* > - * All PCIe functions are in one slot, remove one function will remove > - * the whole slot, so just wait until we are the last function left. > - */ > - if (!list_empty(&parent->subordinate->devices)) > - goto out; > > link = parent->link_state; > root = link->root; > parent_link = link->parent; > > - /* All functions are removed, so just disable ASPM for the link */ > + /* > + * Any function is removed (including software removing), just > + * disable ASPM for the link, in case we can not configure the same > + * setting for all functions. How about following? /* * For any function removed, disable ASPM for the link. See PCIe r6.0, * sec 7.7.3.7 for details. */ > + * See PCIe r6.0, sec 7.5.3.7. > + */ > pcie_config_aspm_link(link, 0); > list_del(&link->sibling); > /* Clock PM is for endpoint device */ > @@ -1032,7 +1031,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) > pcie_update_aspm_capable(root); > pcie_config_aspm_path(parent_link); > } > -out: > + > mutex_unlock(&aspm_lock); > up_read(&pci_bus_sem); > }
On 2023/5/5 10:51, Sathyanarayanan Kuppuswamy wrote: > Hi, > > On 5/4/23 5:34 AM, Ding Hui wrote: > > Maybe you can use the following title? > > "PCI/ASPM: Fix UAF by disabling ASPM for link when child function is removed > Thanks, I'll fix it in next patch. >> If the Function 0 of a Multi-Function device is software removed, >> a freed downstream pointer will be left in struct pcie_link_state, >> and then when pcie_config_aspm_link() be invoked from any path, >> we will trigger use-after-free. >> >> Based on the PCIe spec about ASPM Control (PCIe r6.0, sec 7.5.3.7), > > As per PCIe spec r6.0, sec 7.5.3.7, it is recommended > Thanks, I'll fix it in next patch. >> for Multi-Function Devices (including ARI Devices), it is recommended >> that software program the same value in all Functions. For ARI >> Devices, ASPM Control is determined solely by the setting in Function 0. >> >> So we can just disable ASPM of the whole component if any child >> function is removed, the downstream pointer will be avoided from >> use-after-free, that will also avoid other potential corner cases. >> >> Fixes: b5a0a9b59c81 ("PCI/ASPM: Read and set up L1 substate capabilities") >> Debugged-by: Zongquan Qin <qinzongquan@sangfor.com.cn> > > Any bugzilla link with error log and reproduction steps? Yes, I should link previous https://lore.kernel.org/lkml/20230429132604.31853-1-dinghui@sangfor.com.cn/ Since Bjorn think the ALL details is not necessary, so I'll add the reproducer and compact result in next patch. > >> Suggestion-by: Bjorn Helgaas <bhelgaas@google.com> > > Suggested-by? Sorry, I'll fix it in next patch. > >> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> >> --- >> drivers/pci/pcie/aspm.c | 15 +++++++-------- >> 1 file changed, 7 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c >> index 66d7514ca111..1bf8306141aa 100644 >> --- a/drivers/pci/pcie/aspm.c >> +++ b/drivers/pci/pcie/aspm.c >> @@ -1010,18 +1010,17 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) >> >> down_read(&pci_bus_sem); >> mutex_lock(&aspm_lock); >> - /* >> - * All PCIe functions are in one slot, remove one function will remove >> - * the whole slot, so just wait until we are the last function left. >> - */ >> - if (!list_empty(&parent->subordinate->devices)) >> - goto out; >> >> link = parent->link_state; >> root = link->root; >> parent_link = link->parent; >> >> - /* All functions are removed, so just disable ASPM for the link */ >> + /* >> + * Any function is removed (including software removing), just >> + * disable ASPM for the link, in case we can not configure the same >> + * setting for all functions. > > How about following? > > /* > * For any function removed, disable ASPM for the link. See PCIe r6.0, > * sec 7.7.3.7 for details. > */ > Thanks, it's better. >> + * See PCIe r6.0, sec 7.5.3.7. >> + */ >> pcie_config_aspm_link(link, 0); >> list_del(&link->sibling); >> /* Clock PM is for endpoint device */ >> @@ -1032,7 +1031,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) >> pcie_update_aspm_capable(root); >> pcie_config_aspm_path(parent_link); >> } >> -out: >> + >> mutex_unlock(&aspm_lock); >> up_read(&pci_bus_sem); >> } >
On Thu, May 04, 2023 at 08:34:18PM +0800, Ding Hui wrote: > If the Function 0 of a Multi-Function device is software removed, > a freed downstream pointer will be left in struct pcie_link_state, > and then when pcie_config_aspm_link() be invoked from any path, > we will trigger use-after-free. > > Based on the PCIe spec about ASPM Control (PCIe r6.0, sec 7.5.3.7), > for Multi-Function Devices (including ARI Devices), it is recommended > that software program the same value in all Functions. For ARI > Devices, ASPM Control is determined solely by the setting in Function 0. > > So we can just disable ASPM of the whole component if any child > function is removed, the downstream pointer will be avoided from > use-after-free, that will also avoid other potential corner cases. > > Fixes: b5a0a9b59c81 ("PCI/ASPM: Read and set up L1 substate capabilities") > Debugged-by: Zongquan Qin <qinzongquan@sangfor.com.cn> > Suggestion-by: Bjorn Helgaas <bhelgaas@google.com> > Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> > --- > drivers/pci/pcie/aspm.c | 15 +++++++-------- > 1 file changed, 7 insertions(+), 8 deletions(-) > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > index 66d7514ca111..1bf8306141aa 100644 > --- a/drivers/pci/pcie/aspm.c > +++ b/drivers/pci/pcie/aspm.c > @@ -1010,18 +1010,17 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) Not directly related to your patch, but this looks racy to me: void pcie_aspm_exit_link_state(struct pci_dev *pdev) { struct pci_dev *parent = pdev->bus->self; if (!parent || !parent->link_state) return; down_read(&pci_bus_sem); mutex_lock(&aspm_lock); link = parent->link_state; root = link->root; ... free_link_state(link); link->pdev->link_state = NULL; kfree(link); Since we check "parent->link_state" before acquiring the locks, I suspect that removing two functions at the same time could end up with a NULL pointer dereference: pcie_aspm_exit_link_state(fn 0) pcie_aspm_exit_link_state(fn 1) parent = X parent = X parent->link_state != NULL parent->link_state != NULL acquire locks free_link_state(link) link->pdev->link_state = NULL # aka parent->link_state kfree(link) release locks acquire locks link = parent->link_state # now NULL root = link->root # NULL ptr What do you think? I guess if this *is* a race, it should be fixed by a separate patch. > down_read(&pci_bus_sem); > mutex_lock(&aspm_lock); > - /* > - * All PCIe functions are in one slot, remove one function will remove > - * the whole slot, so just wait until we are the last function left. > - */ > - if (!list_empty(&parent->subordinate->devices)) > - goto out; > > link = parent->link_state; > root = link->root; > parent_link = link->parent; > > - /* All functions are removed, so just disable ASPM for the link */ > + /* > + * Any function is removed (including software removing), just > + * disable ASPM for the link, in case we can not configure the same > + * setting for all functions. > + * See PCIe r6.0, sec 7.5.3.7. > + */ > pcie_config_aspm_link(link, 0); > list_del(&link->sibling); > /* Clock PM is for endpoint device */ > @@ -1032,7 +1031,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) > pcie_update_aspm_capable(root); > pcie_config_aspm_path(parent_link); > } > -out: > + > mutex_unlock(&aspm_lock); > up_read(&pci_bus_sem); > } > -- > 2.17.1 >
On 2023/5/6 4:58, Bjorn Helgaas wrote: > On Thu, May 04, 2023 at 08:34:18PM +0800, Ding Hui wrote: >> If the Function 0 of a Multi-Function device is software removed, >> a freed downstream pointer will be left in struct pcie_link_state, >> and then when pcie_config_aspm_link() be invoked from any path, >> we will trigger use-after-free. >> >> Based on the PCIe spec about ASPM Control (PCIe r6.0, sec 7.5.3.7), >> for Multi-Function Devices (including ARI Devices), it is recommended >> that software program the same value in all Functions. For ARI >> Devices, ASPM Control is determined solely by the setting in Function 0. >> >> So we can just disable ASPM of the whole component if any child >> function is removed, the downstream pointer will be avoided from >> use-after-free, that will also avoid other potential corner cases. >> >> Fixes: b5a0a9b59c81 ("PCI/ASPM: Read and set up L1 substate capabilities") >> Debugged-by: Zongquan Qin <qinzongquan@sangfor.com.cn> >> Suggestion-by: Bjorn Helgaas <bhelgaas@google.com> >> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> >> --- >> drivers/pci/pcie/aspm.c | 15 +++++++-------- >> 1 file changed, 7 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c >> index 66d7514ca111..1bf8306141aa 100644 >> --- a/drivers/pci/pcie/aspm.c >> +++ b/drivers/pci/pcie/aspm.c >> @@ -1010,18 +1010,17 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) > > Not directly related to your patch, but this looks racy to me: > > void pcie_aspm_exit_link_state(struct pci_dev *pdev) > { > struct pci_dev *parent = pdev->bus->self; > > if (!parent || !parent->link_state) > return; > > down_read(&pci_bus_sem); > mutex_lock(&aspm_lock); > > link = parent->link_state; > root = link->root; > ... > free_link_state(link); > link->pdev->link_state = NULL; > kfree(link); > > Since we check "parent->link_state" before acquiring the locks, I > suspect that removing two functions at the same time could end up with > a NULL pointer dereference: > > pcie_aspm_exit_link_state(fn 0) pcie_aspm_exit_link_state(fn 1) > parent = X parent = X > parent->link_state != NULL parent->link_state != NULL > > acquire locks > free_link_state(link) > link->pdev->link_state = NULL # aka parent->link_state > kfree(link) > release locks > > acquire locks > link = parent->link_state # now NULL > root = link->root # NULL ptr > > What do you think? I guess if this *is* a race, it should be fixed by > a separate patch. > Maybe there's no need to worry about the race, when a pci device is removing, the pci_rescan_remove_lock should be acquired. Anyway, double check would be safer. >> down_read(&pci_bus_sem); >> mutex_lock(&aspm_lock); >> - /* >> - * All PCIe functions are in one slot, remove one function will remove >> - * the whole slot, so just wait until we are the last function left. >> - */ >> - if (!list_empty(&parent->subordinate->devices)) >> - goto out; >> >> link = parent->link_state; >> root = link->root; >> parent_link = link->parent; >> >> - /* All functions are removed, so just disable ASPM for the link */ >> + /* >> + * Any function is removed (including software removing), just >> + * disable ASPM for the link, in case we can not configure the same >> + * setting for all functions. >> + * See PCIe r6.0, sec 7.5.3.7. >> + */ >> pcie_config_aspm_link(link, 0); >> list_del(&link->sibling); >> /* Clock PM is for endpoint device */ >> @@ -1032,7 +1031,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) >> pcie_update_aspm_capable(root); >> pcie_config_aspm_path(parent_link); >> } >> -out: >> + >> mutex_unlock(&aspm_lock); >> up_read(&pci_bus_sem); >> } >> -- >> 2.17.1 >> >
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 66d7514ca111..1bf8306141aa 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -1010,18 +1010,17 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) down_read(&pci_bus_sem); mutex_lock(&aspm_lock); - /* - * All PCIe functions are in one slot, remove one function will remove - * the whole slot, so just wait until we are the last function left. - */ - if (!list_empty(&parent->subordinate->devices)) - goto out; link = parent->link_state; root = link->root; parent_link = link->parent; - /* All functions are removed, so just disable ASPM for the link */ + /* + * Any function is removed (including software removing), just + * disable ASPM for the link, in case we can not configure the same + * setting for all functions. + * See PCIe r6.0, sec 7.5.3.7. + */ pcie_config_aspm_link(link, 0); list_del(&link->sibling); /* Clock PM is for endpoint device */ @@ -1032,7 +1031,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) pcie_update_aspm_capable(root); pcie_config_aspm_path(parent_link); } -out: + mutex_unlock(&aspm_lock); up_read(&pci_bus_sem); }