Message ID: <20230128031740.166743-1-sunnanyong@huawei.com>
State: New
Headers:
From: Nanyong Sun <sunnanyong@huawei.com>
To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, mst@redhat.com, jasowang@redhat.com
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, wangrong68@huawei.com, sunnanyong@huawei.com
Subject: [PATCH] vhost/vdpa: Add MSI translation tables to iommu for software-managed MSI
Date: Sat, 28 Jan 2023 11:17:40 +0800
Message-ID: <20230128031740.166743-1-sunnanyong@huawei.com>
Series: vhost/vdpa: Add MSI translation tables to iommu for software-managed MSI
Commit Message
Nanyong Sun
Jan. 28, 2023, 3:17 a.m. UTC
From: Rong Wang <wangrong68@huawei.com>

Once an IOMMU domain is enabled for a device, the MSI translation
tables must be present for software-managed MSI. Otherwise, on a
platform with software-managed MSI and no irq bypass capability, the
device's MSI memory writes from PCIe are not translated correctly and
no irqs are delivered.

The solution is to obtain the MSI physical base address from the
IOMMU reserved regions and set it as the IOMMU MSI cookie, so that
the translation tables are created when the irq is requested.

Signed-off-by: Rong Wang <wangrong68@huawei.com>
Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
---
 drivers/iommu/iommu.c |  1 +
 drivers/vhost/vdpa.c  | 53 ++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 51 insertions(+), 3 deletions(-)
Comments
On Sat, Jan 28, 2023 at 10:25 AM Nanyong Sun <sunnanyong@huawei.com> wrote:
>
> From: Rong Wang <wangrong68@huawei.com>
>
> Once enable iommu domain for one device, the MSI
> translation tables have to be there for software-managed MSI.
> Otherwise, platform with software-managed MSI without an
> irq bypass function, can not get a correct memory write event
> from pcie, will not get irqs.
> The solution is to obtain the MSI phy base address from
> iommu reserved region, and set it to iommu MSI cookie,
> then translation tables will be created while request irq.
>
> Signed-off-by: Rong Wang <wangrong68@huawei.com>
> Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
> ---
>  drivers/iommu/iommu.c |  1 +
>  drivers/vhost/vdpa.c  | 53 ++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 51 insertions(+), 3 deletions(-)
>
[...]
> +static bool vhost_vdpa_check_sw_msi(struct list_head *dev_resv_regions, phys_addr_t *base)
> +{
> +	struct iommu_resv_region *region;
> +	bool ret = false;
> +
> +	list_for_each_entry(region, dev_resv_regions, list) {
> +		/*
> +		 * The presence of any 'real' MSI regions should take
> +		 * precedence over the software-managed one if the
> +		 * IOMMU driver happens to advertise both types.
> +		 */
> +		if (region->type == IOMMU_RESV_MSI) {
> +			ret = false;
> +			break;
> +		}
> +
> +		if (region->type == IOMMU_RESV_SW_MSI) {
> +			*base = region->start;
> +			ret = true;
> +		}
> +	}
> +
> +	return ret;
> +}

Can we unify this with what VFIO had?

[...]
> -	return 0;
> +	ret = vhost_vdpa_get_msi_cookie(v->domain, dma_dev);

Do we need to check the overlap mapping and record it in the interval
tree (as what VFIO did)?

Thanks
On 2023/1/29 14:02, Jason Wang wrote:
> On Sat, Jan 28, 2023 at 10:25 AM Nanyong Sun <sunnanyong@huawei.com> wrote:
>> From: Rong Wang <wangrong68@huawei.com>
>>
>> Once enable iommu domain for one device, the MSI
>> translation tables have to be there for software-managed MSI.
[...]
>> +static bool vhost_vdpa_check_sw_msi(struct list_head *dev_resv_regions, phys_addr_t *base)
>> +{
[...]
>> +	return ret;
>> +}
> Can we unify this with what VFIO had?

Yes, these two functions are just the same.
Do you think moving this function to iommu.c and exporting it from
iommu is a good choice?

[...]
>> +	ret = vhost_vdpa_get_msi_cookie(v->domain, dma_dev);
> Do we need to check the overlap mapping and record it in the interval
> tree (as what VFIO did)?
>
> Thanks

Yes, we need to take care of this part, I will handle it soon.
Thanks a lot.
On Tue, Jan 31, 2023 at 9:32 AM Nanyong Sun <sunnanyong@huawei.com> wrote:
>
> On 2023/1/29 14:02, Jason Wang wrote:
> > On Sat, Jan 28, 2023 at 10:25 AM Nanyong Sun <sunnanyong@huawei.com> wrote:
> >> From: Rong Wang <wangrong68@huawei.com>
[...]
> > Can we unify this with what VFIO had?
> Yes, these two functions are just the same.
> Do you think move this function to iommu.c, and export from iommu is a
> good choice?

Probably, we can try and see.

[...]
> > Do we need to check the overlap mapping and record it in the interval
> > tree (as what VFIO did)?
> >
> > Thanks
> Yes, we need to care about this part, I will handle this recently.
> Thanks a lot.

I think for parents that require vendor-specific mapping logic we
probably also need this. But this could be added on top (probably via a
new config ops).

Thanks
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index de91dd88705b..f6c65d5d8e2b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2623,6 +2623,7 @@ void iommu_get_resv_regions(struct device *dev, struct list_head *list)
 	if (ops->get_resv_regions)
 		ops->get_resv_regions(dev, list);
 }
+EXPORT_SYMBOL_GPL(iommu_get_resv_regions);
 
 /**
  * iommu_put_resv_regions - release resered regions
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index ec32f785dfde..31d3e9ed4cfa 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -1103,6 +1103,48 @@ static ssize_t vhost_vdpa_chr_write_iter(struct kiocb *iocb,
 	return vhost_chr_write_iter(dev, from);
 }
 
+static bool vhost_vdpa_check_sw_msi(struct list_head *dev_resv_regions, phys_addr_t *base)
+{
+	struct iommu_resv_region *region;
+	bool ret = false;
+
+	list_for_each_entry(region, dev_resv_regions, list) {
+		/*
+		 * The presence of any 'real' MSI regions should take
+		 * precedence over the software-managed one if the
+		 * IOMMU driver happens to advertise both types.
+		 */
+		if (region->type == IOMMU_RESV_MSI) {
+			ret = false;
+			break;
+		}
+
+		if (region->type == IOMMU_RESV_SW_MSI) {
+			*base = region->start;
+			ret = true;
+		}
+	}
+
+	return ret;
+}
+
+static int vhost_vdpa_get_msi_cookie(struct iommu_domain *domain, struct device *dma_dev)
+{
+	struct list_head dev_resv_regions;
+	phys_addr_t resv_msi_base = 0;
+	int ret = 0;
+
+	INIT_LIST_HEAD(&dev_resv_regions);
+	iommu_get_resv_regions(dma_dev, &dev_resv_regions);
+
+	if (vhost_vdpa_check_sw_msi(&dev_resv_regions, &resv_msi_base))
+		ret = iommu_get_msi_cookie(domain, resv_msi_base);
+
+	iommu_put_resv_regions(dma_dev, &dev_resv_regions);
+
+	return ret;
+}
+
 static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
 {
 	struct vdpa_device *vdpa = v->vdpa;
@@ -1128,11 +1170,16 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
 
 	ret = iommu_attach_device(v->domain, dma_dev);
 	if (ret)
-		goto err_attach;
+		goto err_alloc_domain;
 
-	return 0;
+	ret = vhost_vdpa_get_msi_cookie(v->domain, dma_dev);
+	if (ret)
+		goto err_attach_device;
 
-err_attach:
+	return 0;
+err_attach_device:
+	iommu_detach_device(v->domain, dma_dev);
+err_alloc_domain:
 	iommu_domain_free(v->domain);
 	return ret;
 }