Message ID | 20230606120854.4170244-15-mshavit@google.com |
---|---|
State | New |
Headers |
Date: Tue, 6 Jun 2023 20:07:50 +0800
From: Michael Shavit <mshavit@google.com>
To: Will Deacon <will@kernel.org>, Robin Murphy <robin.murphy@arm.com>, Joerg Roedel <joro@8bytes.org>
Cc: Michael Shavit <mshavit@google.com>, jean-philippe@linaro.org, nicolinc@nvidia.com, jgg@nvidia.com, baolu.lu@linux.intel.com, linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH v2 14/18] iommu/arm-smmu-v3: Support domains with shared CDs
Message-ID: <20230606120854.4170244-15-mshavit@google.com>
In-Reply-To: <20230606120854.4170244-1-mshavit@google.com> |
Series | Add PASID support to SMMUv3 unmanaged domains |
Commit Message
Michael Shavit
June 6, 2023, 12:07 p.m. UTC
SVA may attach a CD to masters that have different upstream SMMU
devices. However, the arm_smmu_domain structure can only be attached to
a single upstream SMMU device. To work around this limitation, we
propose an ARM_SMMU_DOMAIN_S1_SHARED_CD domain stage for domains that
attach a CD shared across multiple arm_smmu_domains (each attached to a
different upstream SMMU device).
Signed-off-by: Michael Shavit <mshavit@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 ++++++++++++++++-----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
2 files changed, 22 insertions(+), 6 deletions(-)
Comments
On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote:
> SVA may attach a CD to masters that have different upstream SMMU
> devices. The arm_smmu_domain structure can only be attached to a single
> upstream SMMU device however.

Isn't that pretty much because we don't support replicating invalidations to each of the different SMMU instances?

I don't really get why there would be anything like a "shared CD". It looks kind of like the arm_smmu_ctx_desc has become a bit weird, since its main purpose seems to be to refcount the ASID. But our general design is that the iommu_domain should be 1:1 with the ASID, so it is really strange that we'd refcount the ASID.

I would expect different SMMU devices to be handled by allowing single domains to be attached to a list of masters, where the masters can all be from different instances. When an invalidation comes in we have to replicate it through all the instances for the IOTLB and all the masters for the ATC.

What we definitely shouldn't do is try to have different SVA iommu_domains pointing at the same ASID. That is again making SVA special, which we are trying to get away from :)

You might try to stop your series here and get the first batch of patches done. This latter bit seems to be a separate topic?

Jason
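[Editor's note] For concreteness, a minimal sketch of the model described above: one domain attached to masters from several instances, with invalidations replicated to every instance's IOTLB and every master's ATC. The attached_domains list, its lock, and the arm_smmu_attached_domain fields are assumptions for illustration; arm_smmu_tlb_inv_asid() and arm_smmu_atc_inv_master() are existing driver helpers, but treat the exact signatures as approximate.

/*
 * Illustrative only: replicate an ASID invalidation across every SMMU
 * instance and every master attached to one domain.  A real version
 * would track instances explicitly rather than relying on list order.
 */
static void arm_smmu_domain_inv_all(struct arm_smmu_domain *smmu_domain)
{
	struct arm_smmu_attached_domain *attached;	/* hypothetical layout */
	struct arm_smmu_device *last_smmu = NULL;
	unsigned long flags;

	spin_lock_irqsave(&smmu_domain->attached_domains_lock, flags);
	list_for_each_entry(attached, &smmu_domain->attached_domains, domain_head) {
		/* IOTLB: one TLBI per upstream SMMU instance */
		if (attached->master->smmu != last_smmu) {
			arm_smmu_tlb_inv_asid(attached->master->smmu,
					      smmu_domain->cd.asid);
			last_smmu = attached->master->smmu;
		}
		/* ATC: one invalidation per attached master */
		arm_smmu_atc_inv_master(attached->master);
	}
	spin_unlock_irqrestore(&smmu_domain->attached_domains_lock, flags);
}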
On Tue, Jun 6, 2023 at 10:39 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > > On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote: > > SVA may attach a CD to masters that have different upstream SMMU > > devices. The arm_smmu_domain structure can only be attached to a single > > upstream SMMU device however. > > Isn't that pretty much because we don't support replicating > invalidations to each of the different SMMU instances? .... > I would expect different SMMU devices to be handled by allowing single > domains to be attached to a list of masters where the masters can all > be from different instances. Oh you're right! When I first looked at this, arm_smmu_domain still held an arm_smmu_s1_cfg which depends on the SMMU device. But attaching a single (stage 1) arm_smmu_domain to multiple SMMUs looks much more tractable after the first half of this patch series. We can add a handle to the smmu device in the new arm_smmu_attached_domain struct for this purpose. > What we definately shouldn't do is try to have different SVA > iommu_domain's pointing at the same ASID. That is again making SVA > special, which we are trying to get away from :) Fwiw, this change is preserving the status-quo in that regard; arm-smmu-v3-sva.c is already doing this. But yes, I agree that resolving the limitation is a better long term solution... and something I can try to look at further. > You might try to stop your series here and get the first batch of > patches done. This latter bit seems to be a seperate topic? The main motivation for this part of the series is to reduce inconsistencies with the smmu_domain->attached_domains list and arm-smmu-v3 functions that rely on that list. Specifically to reach the last patch in the series: "iommu/arm-smmu-v3-sva: Remove atc_inv_domain_ssid". Splitting this part into a follow-up patch series would definitely be easier and helpful if you're all ok with it :) .
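[Editor's note] A sketch of the per-attachment record mentioned above, with the proposed SMMU handle added. The field set is assumed for illustration and is not taken verbatim from the series.

struct arm_smmu_attached_domain {
	struct arm_smmu_master	*master;	/* attaching master */
	struct arm_smmu_device	*smmu;		/* proposed: upstream instance handle */
	ioasid_t		ssid;		/* PASID/SSID used for the attach */
	struct list_head	domain_head;	/* entry on arm_smmu_domain::attached_domains */
};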
On Wed, Jun 07, 2023 at 12:06:07AM +0530, Michael Shavit wrote:
> > What we definately shouldn't do is try to have different SVA
> > iommu_domain's pointing at the same ASID. That is again making SVA
> > special, which we are trying to get away from :)
>
> Fwiw, this change is preserving the status-quo in that regard;
> arm-smmu-v3-sva.c is already doing this. But yes, I agree that
> resolving the limitation is a better long term solution... and
> something I can try to look at further.

I suppose we also don't really have an entirely clear picture of what allocating multiple SVA domains should even do in the iommu driver.

The driver would like to share the ASID, but things are much cleaner for everything if the driver model has ASID 1:1 with the iommu_domain.

It suggests we are missing some core code in iommu_sva_bind_device() to try to re-use existing SVA iommu_domains. This would certainly be better than trying to teach every driver how to share and refcount its ASID concept...

Today we have this super hacky iommu_get_domain_for_dev_pasid() thing that allows SVA domain reuse for a single device.

Possibly what we should do is convert the u32 pasid in the mm_struct to a struct iommu_mm_data * and put a lot more stuff in there, e.g. a linked list of all SVA domains.

> Splitting this part into a follow-up patch series would definitely be
> easier and helpful if you're all ok with it :) .

I think splitting it into a series to re-organize the way ste/cd stuff works is a nice contained topic.

Adjusting the way the ASID works with SVA is another good topic.

And so on.

Jason
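[Editor's note] Roughly what that suggestion could look like in core code; this is a sketch, not an existing structure, and the field names are assumptions.

/* Per-mm IOMMU state that would replace the bare u32 pasid in mm_struct. */
struct iommu_mm_data {
	u32			pasid;		/* global PASID allocated for this mm */
	struct list_head	sva_domains;	/* SVA iommu_domains created for this mm,
						 * reused by iommu_sva_bind_device() when
						 * a compatible one already exists */
};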
On 6/7/23 7:59 PM, Jason Gunthorpe wrote: > On Wed, Jun 07, 2023 at 12:06:07AM +0530, Michael Shavit wrote: >>> What we definately shouldn't do is try to have different SVA >>> iommu_domain's pointing at the same ASID. That is again making SVA >>> special, which we are trying to get away from 😄 >> Fwiw, this change is preserving the status-quo in that regard; >> arm-smmu-v3-sva.c is already doing this. But yes, I agree that >> resolving the limitation is a better long term solution... and >> something I can try to look at further. > I suppose we also don't really have a entirely clear picture what > allocating multiple SVA domains should even do in the iommu driver. > > The driver would like to share the ASID, but things are much cleaner > for everything if the driver model has ASID 1:1 with the iommu_domain. This means that each ASID should be mapped to a single IOMMU domain. This is conceptually right as iommu_domain represents a hardware page table. For SVA, it's an mm_struct. So in my mind, each sva_domain should have a 1:1 relationship with an mm_struct. Each sva_domain could have a 1:N relationship with ASID (or PCI PASID), but in current implementation, it's a 1:1 relationship due to the current global pasid policy for SVA. > > It suggests we are missing some core code in iommu_sva_bind_device() > to try to re-use existing SVA iommu_domains. This would certainly be > better than trying to teach every driver how to share and refcount > its ASID concept... > > Today we have this super hacky iommu_get_domain_for_dev_pasid() > thing that allows SVA domain reuse for a single device. > > Possibly what we should do is conver the u32 pasid in the mm_struct to > a struct iommu_mm_data * and put alot more stuff in there. eg a linked > list of all SVA domains. I don't quite follow "a linked list of all SVA domains". If my above understanding is correct, then there should be a single sva_domain for each mm_struct. The iommu_sva_domain_alloc() takes the responsibility to allocate/free and refcount the sva domain. The sva bind code could simply: domain = iommu_get_sva_domain(dev, mm); iommu_attach_device_pasid(domain, dev, pasid); and the sva unbind code: iommu_detach_device_pasid(domain, dev, pasid); iommu_put_sva_domain(domain); Perhaps, we can further drop struct iommu_sva and make the sva bind/unbind interfaces to return the sva domain instead? Best regards, baolu
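[Editor's note] The bind/unbind flow proposed above, written out as a hedged sketch: iommu_get_sva_domain() and iommu_put_sva_domain() are hypothetical refcounting helpers that do not exist in the tree at this point, while iommu_attach_device_pasid()/iommu_detach_device_pasid() are the existing core PASID attach interfaces.

static int example_sva_bind(struct device *dev, struct mm_struct *mm,
			    ioasid_t pasid)
{
	struct iommu_domain *domain;
	int ret;

	domain = iommu_get_sva_domain(dev, mm);	/* allocate or take a reference */
	if (IS_ERR(domain))
		return PTR_ERR(domain);

	ret = iommu_attach_device_pasid(domain, dev, pasid);
	if (ret)
		iommu_put_sva_domain(domain);	/* drop the reference on failure */
	return ret;
}

static void example_sva_unbind(struct iommu_domain *domain,
			       struct device *dev, ioasid_t pasid)
{
	iommu_detach_device_pasid(domain, dev, pasid);
	iommu_put_sva_domain(domain);	/* freed on the last reference */
}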
On Thu, Jun 08, 2023 at 10:39:23AM +0800, Baolu Lu wrote: > On 6/7/23 7:59 PM, Jason Gunthorpe wrote: > > On Wed, Jun 07, 2023 at 12:06:07AM +0530, Michael Shavit wrote: > > > > What we definately shouldn't do is try to have different SVA > > > > iommu_domain's pointing at the same ASID. That is again making SVA > > > > special, which we are trying to get away from 😄 > > > Fwiw, this change is preserving the status-quo in that regard; > > > arm-smmu-v3-sva.c is already doing this. But yes, I agree that > > > resolving the limitation is a better long term solution... and > > > something I can try to look at further. > > I suppose we also don't really have a entirely clear picture what > > allocating multiple SVA domains should even do in the iommu driver. > > > > The driver would like to share the ASID, but things are much cleaner > > for everything if the driver model has ASID 1:1 with the iommu_domain. > > This means that each ASID should be mapped to a single IOMMU domain. > This is conceptually right as iommu_domain represents a hardware page > table. For SVA, it's an mm_struct. > > So in my mind, each sva_domain should have a 1:1 relationship with an > mm_struct. If we want to support multiple iommu drivers then we should support multiple iommu_domains per mm_struct so that each driver can have its own. In this world if each instance wants its own iommu_domain it is not a big deal. Drivers that can share iommu_domains across instances should probably also share sva iommu_domains across instances. Most real systems have only one iommu driver and we'd like the good iommu drivers to be able to share domains across instances, so we'd expect only 1 iommu_domain per mm struct. Jason
On 6/8/23 9:39 PM, Jason Gunthorpe wrote: > On Thu, Jun 08, 2023 at 10:39:23AM +0800, Baolu Lu wrote: >> On 6/7/23 7:59 PM, Jason Gunthorpe wrote: >>> On Wed, Jun 07, 2023 at 12:06:07AM +0530, Michael Shavit wrote: >>>>> What we definately shouldn't do is try to have different SVA >>>>> iommu_domain's pointing at the same ASID. That is again making SVA >>>>> special, which we are trying to get away from 😄 >>>> Fwiw, this change is preserving the status-quo in that regard; >>>> arm-smmu-v3-sva.c is already doing this. But yes, I agree that >>>> resolving the limitation is a better long term solution... and >>>> something I can try to look at further. >>> I suppose we also don't really have a entirely clear picture what >>> allocating multiple SVA domains should even do in the iommu driver. >>> >>> The driver would like to share the ASID, but things are much cleaner >>> for everything if the driver model has ASID 1:1 with the iommu_domain. >> This means that each ASID should be mapped to a single IOMMU domain. >> This is conceptually right as iommu_domain represents a hardware page >> table. For SVA, it's an mm_struct. >> >> So in my mind, each sva_domain should have a 1:1 relationship with an >> mm_struct. > If we want to support multiple iommu drivers then we should support > multiple iommu_domains per mm_struct so that each driver can have its > own. In this world if each instance wants its own iommu_domain it is > not a big deal. > > Drivers that can share iommu_domains across instances should probably > also share sva iommu_domains across instances. > > Most real systems have only one iommu driver and we'd like the good > iommu drivers to be able to share domains across instances, so we'd > expect only 1 iommu_domain per mm struct. Yes. You are right. I overlooked the multiple-drivers case. So we stay on the same page now. Best regards, baolu
On Wed, Jun 7, 2023 at 1:09 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > > On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote: > > SVA may attach a CD to masters that have different upstream SMMU > > devices. The arm_smmu_domain structure can only be attached to a single > > upstream SMMU device however. > > Isn't that pretty much because we don't support replicating > invalidations to each of the different SMMU instances? Looked into this some more, and supporting attach to multiple devices is still very hard: 1. When an arm_smmu_domain is first attached to a master, it initializes an io_pgtable_cfg object whose properties depend on the master's upstream SMMU device. This io_pgtable_cfg is then used to allocate an io_pgtable object, and the arm_smmu_ctx_desc's TTBR field points to that io_pgtable's TTBR (and ditto for the vttbr on stage 2 domains). So then arm_smmu_domain needs to be split into two, arm_smmu_domain and arm_smmu_device_domain with the latter containing a per-SMMU device io_pgtable, arm_smmu_ctx_desc and arm_smmu_s2_cfg. Each iommu_domain_ops operation now needs to loop over each arm_smmu_device_domain. 2. Some of the iommu_domain fields also depend on the per-SMMU io_pgtable_cfg; specifically pgsize_bitmap and geometry.aperture_end. These need to be restricted as the domain is attached to more devices. 3. Attaching a domain to a new SMMU device must be prohibited after any call to map_pages or if iommu_domain.pgsize_bitmap and iommu-domain.geometry.aperture_end have been consumed by any system. The first is something the arm-smmu-v3.c driver could feasibly enforce on its own, the second requires changes to the iommu framework. The arm-smmu-v3-sva.c implementation avoids all these problems because it doesn't need to allocate an io_pgtable; the shared arm_smmu_ctx_desc's TTBR is set to the MM's TTBR. One possible solution would be to make the changes proposed in 1., but only allowing SVA domains to attach to multiple devices: 1. Domain functions only accessed by arm-smmu-v3.c (such as arm_smmu_iova_to_phys) can assume there's a single arm_smmu_device_domain per arm_smmu_domain. 2. Domain functions also accessed by arm-smmu-v3-sva.c (such as invalidations commands) must iterate over all arm_smmu_device_domains. > On Wed, Jun 7, 2023 at 8:00 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > I think splitting it into a series to re-organize the way ste/cd stuff > works is a nice contained topic. Should I explicitly resend this patch series as a v3 cutting off patches 14 and onwards?
On Wed, Jun 14, 2023 at 5:17 PM Michael Shavit <mshavit@google.com> wrote: > The arm-smmu-v3-sva.c implementation avoids all these problems because it > doesn't need to allocate an io_pgtable; the shared arm_smmu_ctx_desc's > TTBR is set to the MM's TTBR. One possible solution would be to make > the changes proposed in 1., but only allowing SVA domains to attach to > multiple devices: > 1. Domain functions only accessed by arm-smmu-v3.c (such as > arm_smmu_iova_to_phys) can assume there's a single > arm_smmu_device_domain per arm_smmu_domain. > 2. Domain functions also accessed by arm-smmu-v3-sva.c (such as > invalidations commands) must iterate over all arm_smmu_device_domains. Or rather, arm_smmu_domain would hold a list of attached smmu devices, but only SVA domains would be allowed to have multiple entries in that list since the other arm_smmu_domain fields are tied to a single SMMU device for other domain types.
On 2023-06-14 10:17, Michael Shavit wrote: > On Wed, Jun 7, 2023 at 1:09 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >> >> On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote: >>> SVA may attach a CD to masters that have different upstream SMMU >>> devices. The arm_smmu_domain structure can only be attached to a single >>> upstream SMMU device however. >> >> Isn't that pretty much because we don't support replicating >> invalidations to each of the different SMMU instances? > > Looked into this some more, and supporting attach to multiple devices > is still very hard: > 1. When an arm_smmu_domain is first attached to a master, it > initializes an io_pgtable_cfg object whose properties depend on the > master's upstream SMMU device. This io_pgtable_cfg is then used to > allocate an io_pgtable object, and the arm_smmu_ctx_desc's TTBR field > points to that io_pgtable's TTBR (and ditto for the vttbr on stage 2 > domains). So then arm_smmu_domain needs to be split into two, > arm_smmu_domain and arm_smmu_device_domain with the latter containing > a per-SMMU device io_pgtable, arm_smmu_ctx_desc and arm_smmu_s2_cfg. > Each iommu_domain_ops operation now needs to loop over each > arm_smmu_device_domain. > 2. Some of the iommu_domain fields also depend on the per-SMMU > io_pgtable_cfg; specifically pgsize_bitmap and geometry.aperture_end. > These need to be restricted as the domain is attached to more devices. > 3. Attaching a domain to a new SMMU device must be prohibited after > any call to map_pages or if iommu_domain.pgsize_bitmap and > iommu-domain.geometry.aperture_end have been consumed by any system. > The first is something the arm-smmu-v3.c driver could feasibly enforce > on its own, the second requires changes to the iommu framework. In practice it would be entirely reasonable to only support cross-instance attach between instances with matching capabilities such that they *can* share the pagetable directly. > The arm-smmu-v3-sva.c implementation avoids all these problems because it > doesn't need to allocate an io_pgtable; SVA has all the same underlying problems - pagetable sharing is still pagetable sharing regardless of which code happens to allocate the physical pages - they're just avoided up-front by disallowing SVA at all if the relevant capabilities don't line up between SMMU and CPU. Thanks, Robin.
On Wed, Jun 14, 2023 at 05:17:03PM +0800, Michael Shavit wrote:
> On Wed, Jun 7, 2023 at 1:09 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote:
> > > SVA may attach a CD to masters that have different upstream SMMU
> > > devices. The arm_smmu_domain structure can only be attached to a single
> > > upstream SMMU device however.
> >
> > Isn't that pretty much because we don't support replicating
> > invalidations to each of the different SMMU instances?
>
> Looked into this some more, and supporting attach to multiple devices
> is still very hard:
> 1. When an arm_smmu_domain is first attached to a master, it
> initializes an io_pgtable_cfg object whose properties depend on the
> master's upstream SMMU device.

So, this is actually kind of wrong; Robin is working on fixing it.

The mental model you should have is that when an iommu_domain is created it represents a single, definitive IO page table format. The format is selected based on creation arguments and the 'master' it will be used with.

An iommu_domain has a single IO page table in a single format, and that format doesn't change while the iommu_domain exists. Even single-instance cases have S1 and S2 IO page table formats co-existing.

When we talk about multi-instance support, it means the iommu_domain - in whatever fixed IO page table format it uses - can be attached to any SMMU instance that supports it as a compatible page table format.

ARM doesn't quite reach this model, but once it passes the finalize step it does. The goal is to move finalize to allocate. So for your purposes you can ignore the difference.

> domains). So then arm_smmu_domain needs to be split into two,
> arm_smmu_domain and arm_smmu_device_domain with the latter containing
> a per-SMMU device io_pgtable, arm_smmu_ctx_desc and arm_smmu_s2_cfg.
> Each iommu_domain_ops operation now needs to loop over each
> arm_smmu_device_domain.

No, if the instance is not compatible with the domain then the attachment fails.

> 2. Some of the iommu_domain fields also depend on the per-SMMU
> io_pgtable_cfg; specifically pgsize_bitmap and geometry.aperture_end.
> These need to be restricted as the domain is attached to more
> devices.

These need to be an exact match then.

> 3. Attaching a domain to a new SMMU device must be prohibited after
> any call to map_pages or if iommu_domain.pgsize_bitmap and
> iommu-domain.geometry.aperture_end have been consumed by any system.

Attach needs to fail; we don't reconfigure the domain.

> The arm-smmu-v3-sva.c implementation avoids all these problems
> because it doesn't need to allocate an io_pgtable; the shared
> arm_smmu_ctx_desc's

This isn't true: it has an IO page table (owned by the MM), and that page table has all the same properties you were concerned about above. They all still need to be checked for compatibility, i.e. from a SMMU HW instance perspective there is no difference between an S1 or MM-owned IO page table. From a code perspective the only difference should be where the TTBR comes from.

The SVA case still should have a cfg describing what kind of IO pagetable the mm has created, and that cfg should still technically be checked for compatibility.

Have in mind that SVA is *exactly* the same as an S1 iommu_domain with PRI enabled, except that the TTBR comes from the MM instead of being internally allocated. There should be no other differences between the two operating modes; that SVA became its own special thing needs to be undone, not doubled down on.
> > I think splitting it into a series to re-organize the way ste/cd stuff
> > works is a nice contained topic.
>
> Should I explicitly resend this patch series as a v3 cutting off
> patches 14 and onwards?

I think it is good to make progress; it looked to me like the first part stood alone fairly well and was an improvement on its own.

Jason
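[Editor's note] An illustrative attach-time compatibility check along the lines described above: the domain's page-table properties are fixed at creation, and a second instance either matches them or the attach fails. The io_pgtable_cfg cached in the domain is an assumption; the arm_smmu_device fields referenced (ias, oas, pgsize_bitmap, features) do exist in the driver, but this is a simplification of what a real check would need.

static int arm_smmu_domain_compatible(struct arm_smmu_device *smmu,
				      struct arm_smmu_domain *smmu_domain)
{
	/* hypothetical: cfg captured when the domain was finalised */
	struct io_pgtable_cfg *cfg = &smmu_domain->pgtbl_cfg;

	if (arm_smmu_is_s1_domain(smmu_domain) &&
	    !(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
		return -EINVAL;
	if (cfg->ias > smmu->ias || cfg->oas > smmu->oas)
		return -EINVAL;
	/* every page size the domain may already have mapped must be supported */
	if (cfg->pgsize_bitmap & ~smmu->pgsize_bitmap)
		return -EINVAL;

	return 0;
}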
On Wed, Jun 14, 2023 at 5:57 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> In practice it would be entirely reasonable to only support
> cross-instance attach between instances with matching capabilities such
> that they *can* share the pagetable directly.

On Wed, Jun 14, 2023 at 8:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> When we talk about multi instance support, it means the iommu_domain -
> in whatever fixed IO page table format it uses - can be attached to
> any SMMU instance that supports it as a compatible page table format.
>
> ARM doesn't quite reach this model, but once it passes the finalize
> step it does. The goal is to move finalize to allocate. So for your
> purposes you can ignore the difference.

Got you. Failing the attach when the page table format is incompatible with the SMMU device is a lot easier to handle. I didn't notice that SVA was already checking this elsewhere. I can give the multi-instance support a try (with the rest of these patches on top) and send it out as a follow-up series to this one.

> I think it is good to make progress, it looked to me like the first
> part stood alone fairly well and was an improvement on its own.

Sorry for the noobie question; it's not 100% obvious to me what the next step is here. Is there anything I should do to progress that first part forward?
On Wed, Jun 14, 2023 at 09:30:34PM +0800, Michael Shavit wrote:
> > I think it is good to make progress, it looked to me like the first
> > part stood alone fairly well and was an improvement on its own.
>
> Sorry for the noobie question; it's not 100% obvious to me what the
> next step is here. Is there anything I should do to progress that
> first part forward?

Reword your cover letter to cover the new subset and repost the series as v3 with just the patches you want to go with.

Jason
Hi, > -----Original Message----- > From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Wednesday, June 7, 2023 8:00 PM > To: Michael Shavit <mshavit@google.com> > Cc: Will Deacon <will@kernel.org>; Robin Murphy <robin.murphy@arm.com>; > Joerg Roedel <joro@8bytes.org>; jean-philippe@linaro.org; > nicolinc@nvidia.com; baolu.lu@linux.intel.com; linux-arm- > kernel@lists.infradead.org; iommu@lists.linux.dev; linux- > kernel@vger.kernel.org > Subject: Re: [PATCH v2 14/18] iommu/arm-smmu-v3: Support domains with > shared CDs > > On Wed, Jun 07, 2023 at 12:06:07AM +0530, Michael Shavit wrote: > > > What we definately shouldn't do is try to have different SVA > > > iommu_domain's pointing at the same ASID. That is again making SVA > > > special, which we are trying to get away from :) > > > > Fwiw, this change is preserving the status-quo in that regard; > > arm-smmu-v3-sva.c is already doing this. But yes, I agree that > > resolving the limitation is a better long term solution... and > > something I can try to look at further. > > I suppose we also don't really have a entirely clear picture what allocating > multiple SVA domains should even do in the iommu driver. > > The driver would like to share the ASID, but things are much cleaner for > everything if the driver model has ASID 1:1 with the iommu_domain. > > It suggests we are missing some core code in iommu_sva_bind_device() to try > to re-use existing SVA iommu_domains. This would certainly be better than > trying to teach every driver how to share and refcount its ASID concept... > > Today we have this super hacky iommu_get_domain_for_dev_pasid() thing > that allows SVA domain reuse for a single device. > > Possibly what we should do is conver the u32 pasid in the mm_struct to a > struct iommu_mm_data * and put alot more stuff in there. eg a linked list of > all SVA domains. If we are going to have 1:1 between SVA domain and pasid, why we need a linked list of all SVA domains? Would a SVA domain pointer be enough? I've got a patch-set which takes this suggestion to add an iommu_mm_data struct field to mm_struct. I'll send it out for review soon. The motivation of that patch-set is to let the invalidate_range() callback use the SVA domain referenced by mm->iommu_mm_data->sva_domain to do per-iommu IOTLB invalidation. Regards, -Tina > > > Splitting this part into a follow-up patch series would definitely be > > easier and helpful if you're all ok with it :) . > > I think splitting it into a series to re-organize the way ste/cd stuff works is a > nice contained topic. > > Adjusting the way the ASID works with SVA is another good topic > > And so on > > Jason
On Wed, Jul 05, 2023 at 09:56:50AM +0000, Zhang, Tina wrote:
> > Possibly what we should do is conver the u32 pasid in the mm_struct to a
> > struct iommu_mm_data * and put alot more stuff in there. eg a linked list of
> > all SVA domains.
>
> If we are going to have 1:1 between SVA domain and pasid, why we
> need a linked list of all SVA domains? Would a SVA domain pointer be
> enough?

Kevin asked this; we can't assume that a single SVA domain is going to work in a multi-iommu-driver system, which we are trying to enable at the core level.

> I've got a patch-set which takes this suggestion to add an
> iommu_mm_data struct field to mm_struct. I'll send it out for review
> soon. The motivation of that patch-set is to let the
> invalidate_range() callback use the SVA domain referenced by
> mm->iommu_mm_data->sva_domain to do per-iommu IOTLB invalidation.

Huh?

You are supposed to put the struct mmu_notifier inside the sva_domain struct and use container_of().

This is another reason why I'd prefer we de-duplicate SVA domains at the core code level, as duplicative notifiers are expensive.

Please don't add stuff to the mm just for this reason.

Jason
> -----Original Message----- > From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Tuesday, July 11, 2023 12:55 AM > To: Zhang, Tina <tina.zhang@intel.com> > Cc: Michael Shavit <mshavit@google.com>; Will Deacon <will@kernel.org>; > Robin Murphy <robin.murphy@arm.com>; Joerg Roedel <joro@8bytes.org>; > jean-philippe@linaro.org; nicolinc@nvidia.com; baolu.lu@linux.intel.com; > linux-arm-kernel@lists.infradead.org; iommu@lists.linux.dev; linux- > kernel@vger.kernel.org > Subject: Re: [PATCH v2 14/18] iommu/arm-smmu-v3: Support domains with > shared CDs > > On Wed, Jul 05, 2023 at 09:56:50AM +0000, Zhang, Tina wrote: > > > > Possibly what we should do is conver the u32 pasid in the mm_struct > > > to a struct iommu_mm_data * and put alot more stuff in there. eg a > > > linked list of all SVA domains. > > > If we are going to have 1:1 between SVA domain and pasid, why we need > > a linked list of all SVA domains? Would a SVA domain pointer be > > enough? > > Kevin asked this, we can't assume that a single SVA domain is going to work in > a multi-iommu-driver system, which we are trying to enable at the core level.. > > > I've got a patch-set which takes this suggestion to add an > > iommu_mm_data struct field to mm_struct. I'll send it out for review > > soon. The motivation of that patch-set is to let the > > invalidate_range() callback use the SVA domain referenced by > > mm->iommu_mm_data->sva_domain to do per-iommu IOTLB invalidation. > > Huh? > > You are supposed to put the struct mmu_notifier inside the sva_domain > struct and use container_of(). No. Just want to use iommu_mm to replace the old pasid field. https://lore.kernel.org/linux-iommu/20230707013441.365583-5-tina.zhang@intel.com/ The motivation is mainly to for performance optimization for vIOMMU, though it could also benefit native world. Currently, in invalidate_range() callback implemented by VT-d driver, due to lacking sva_domain knowledge (i.e., intel_invalidate_range() doesn't know which IOMMUs' IOTLB should be invalidated), intel_invalidate_range() just performs IOMMU IOTLB per device and that leads to superfluous IOTLB invalidations. For example, if there are two devices physically behind an IOMMU and those two devices are used by one user space application, it means in each invalidate_range() callback, two IOTLB and two device-IOTLB invalidations would be performed, instead of one IOTLB plus two device-TLB invalidations (i.e., one IOTLB invalidation is redundant). Things will become worse in a virtualized environment, especially with one vIOMMU exposed to guest, as invoking IOTLB is much more expensive than doing it in native and conceptually there could be more devices behind a vIOMMU that means more superfluous IOTLB invalidations. So, we want to add sva_domain related info to mm to remove the superfluous IOTLB invalidations. Thanks, -Tina > > This is another reason why I'd prefer we de-duplicate SVA domains at the > core code level as duplicitive notifiers are expensive.. > > Please don't add stuff to the mm just for this reason. > > Jason
On Tue, Jul 11, 2023 at 12:26:56AM +0000, Zhang, Tina wrote:
> The motivation is mainly to for performance optimization for vIOMMU,
> though it could also benefit native world.
>
> Currently, in invalidate_range() callback implemented by VT-d
> driver, due to lacking sva_domain knowledge (i.e.,
> intel_invalidate_range() doesn't know which IOMMUs' IOTLB should be
> invalidated), intel_invalidate_range() just performs IOMMU IOTLB per
> device and that leads to superfluous IOTLB invalidations.

You get the sva_domain from container_of() on the notifier.

The sva_domain has a list of all the devices and IOMMUs that it is connected to.

The driver optimizes the invalidations based on the list.

The core code de-duplicates the sva domain so there is usually only one. If someone creates two SVA domains then the driver is less optimal, oh well.

The driver should not look inside the mm_struct and certainly should never try to get a "SVA domain" from it.

Jason
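[Editor's note] A conceptual sketch of the approach described above, with every struct, field, and helper name invented for illustration (this is not the actual VT-d driver API): the notifier callback recovers the SVA domain via container_of() and walks the attachments, issuing at most one IOTLB invalidation per IOMMU instance while still doing device-TLB invalidation per device.

struct example_attachment {
	struct device		*dev;
	void			*iommu;		/* owning IOMMU instance */
	u32			pasid;
	struct list_head	node;
};

struct example_sva_domain {
	struct mmu_notifier	notifier;
	struct list_head	attachments;	/* assumed grouped by IOMMU instance */
};

static void example_invalidate_range(struct mmu_notifier *mn,
				     struct mm_struct *mm,
				     unsigned long start, unsigned long end)
{
	struct example_sva_domain *domain =
		container_of(mn, struct example_sva_domain, notifier);
	struct example_attachment *att;
	void *last_iommu = NULL;

	list_for_each_entry(att, &domain->attachments, node) {
		/* one IOTLB invalidation per IOMMU instance (hypothetical helper) */
		if (att->iommu != last_iommu) {
			example_flush_iotlb(att->iommu, att->pasid, start, end);
			last_iommu = att->iommu;
		}
		/* device-TLB (ATS) invalidation remains per device (hypothetical helper) */
		example_flush_devtlb(att->dev, att->pasid, start, end);
	}
}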
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b7f834dde85d1..69b1d09fd0284 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -965,6 +965,20 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
 	arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
 }
 
+static struct arm_smmu_ctx_desc *arm_smmu_get_cd(struct arm_smmu_domain *domain)
+{
+	if (domain->stage == ARM_SMMU_DOMAIN_S1_SHARED_CD)
+		return domain->shared_cd;
+	else
+		return &domain->cd;
+}
+
+static bool arm_smmu_is_s1_domain(struct arm_smmu_domain *domain)
+{
+	return domain->stage == ARM_SMMU_DOMAIN_S1_SHARED_CD ||
+	       domain->stage == ARM_SMMU_DOMAIN_S1;
+}
+
 /* master may be null */
 static void arm_smmu_sync_cd(struct arm_smmu_master *master,
 			     int ssid, bool leaf)
@@ -1887,8 +1901,8 @@ static void arm_smmu_tlb_inv_context(void *cookie)
 	 * insertion to guarantee those are observed before the TLBI. Do be
 	 * careful, 007.
 	 */
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
-		arm_smmu_tlb_inv_asid(smmu, smmu_domain->cd.asid);
+	if (arm_smmu_is_s1_domain(smmu_domain)) {
+		arm_smmu_tlb_inv_asid(smmu, arm_smmu_get_cd(smmu_domain)->asid);
 	} else {
 		cmd.opcode	= CMDQ_OP_TLBI_S12_VMALL;
 		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
@@ -1968,10 +1982,10 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
 		},
 	};
 
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+	if (arm_smmu_is_s1_domain(smmu_domain)) {
 		cmd.opcode	= smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
 				  CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
-		cmd.tlbi.asid	= smmu_domain->cd.asid;
+		cmd.tlbi.asid	= arm_smmu_get_cd(smmu_domain)->asid;
 	} else {
 		cmd.opcode	= CMDQ_OP_TLBI_S2_IPA;
 		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
@@ -2549,7 +2563,7 @@ static int arm_smmu_set_dev_pasid(struct iommu_domain *domain,
 		return -ENODEV;
 	}
 
-	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) {
+	if (!arm_smmu_is_s1_domain(smmu_domain)) {
 		dev_err(dev, "set_dev_pasid only supports stage 1 domains\n");
 		return -EINVAL;
 	}
@@ -2575,7 +2589,7 @@ static int arm_smmu_set_dev_pasid(struct iommu_domain *domain,
 	 */
 	mutex_lock(&arm_smmu_asid_lock);
 	ret = arm_smmu_write_ctx_desc(master->smmu, master->s1_cfg, master,
-				      pasid, &smmu_domain->cd);
+				      pasid, arm_smmu_get_cd(smmu_domain));
 	if (ret) {
 		mutex_unlock(&arm_smmu_asid_lock);
 		kfree(attached_domain);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3525d60668c23..4ac69427abf1c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -713,6 +713,7 @@ struct arm_smmu_master {
 /* SMMU private data for an IOMMU domain */
 enum arm_smmu_domain_stage {
 	ARM_SMMU_DOMAIN_S1 = 0,
+	ARM_SMMU_DOMAIN_S1_SHARED_CD,
 	ARM_SMMU_DOMAIN_S2,
 	ARM_SMMU_DOMAIN_NESTED,
 	ARM_SMMU_DOMAIN_BYPASS,
@@ -728,6 +729,7 @@ struct arm_smmu_domain {
 	enum arm_smmu_domain_stage stage;
 	union {
 		struct arm_smmu_ctx_desc	cd;
+		struct arm_smmu_ctx_desc	*shared_cd;
 		struct arm_smmu_s2_cfg		s2_cfg;
 	};
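[Editor's note] For context, a hedged sketch (not part of the patch) of how arm-smmu-v3-sva.c might hand an existing shared arm_smmu_ctx_desc to a domain under the scheme introduced above; the helper name and its call site are assumptions.

static void arm_smmu_domain_set_shared_cd(struct arm_smmu_domain *smmu_domain,
					  struct arm_smmu_ctx_desc *cd)
{
	/* Switch the domain to the new stage and point it at the shared CD. */
	smmu_domain->stage = ARM_SMMU_DOMAIN_S1_SHARED_CD;
	smmu_domain->shared_cd = cd;	/* union member added by this patch */
}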