Message ID | 20230724110406.107212-3-yi.l.liu@intel.com |
---|---|
State | New |
Headers |
From: Yi Liu <yi.l.liu@intel.com>
To: joro@8bytes.org, alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com, robin.murphy@arm.com, baolu.lu@linux.intel.com
Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com, suravee.suthikulpanit@amd.com, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, zhenzhong.duan@intel.com
Subject: [PATCH v3 02/17] iommu: Add nested domain support
Date: Mon, 24 Jul 2023 04:03:51 -0700
Message-Id: <20230724110406.107212-3-yi.l.liu@intel.com>
In-Reply-To: <20230724110406.107212-1-yi.l.liu@intel.com>
References: <20230724110406.107212-1-yi.l.liu@intel.com> |
Series | iommufd: Add nesting infrastructure |
Commit Message
Yi Liu
July 24, 2023, 11:03 a.m. UTC
From: Lu Baolu <baolu.lu@linux.intel.com>

Introduce a new domain type for a user I/O page table, which is nested on
top of another user space address space represented by an UNMANAGED
domain. The mappings of a nested domain are managed by user space
software, so map/unmap callbacks are unnecessary. However, updates to the
PTEs in the nested domain's page table must be propagated to the caches on
both the IOMMU (IOTLB) and the devices (DevTLB).

The nested domain is allocated by the domain_alloc_user op and attached to
the device through the existing iommu_attach_device/group() interfaces.

A new domain op, named cache_invalidate_user, is added for user space to
flush the hardware caches for a nested domain through iommufd. There is no
wrapper for it, as it is only supposed to be used by iommufd.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h | 11 +++++++++++
 1 file changed, 11 insertions(+)
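[Editor's sketch] To make the shape of the new op concrete, below is a
minimal sketch of how an IOMMU driver might populate it. Only
IOMMU_DOMAIN_NESTED, cache_invalidate_user, and
cache_invalidate_user_data_len come from this patch; every name prefixed
"mock_" is invented for illustration, and the invalidation record is a
placeholder for the driver-specific structure that would live in
include/uapi/linux/iommufd.h.

#include <linux/iommu.h>
#include <linux/slab.h>

struct mock_nested_domain {
	struct iommu_domain domain;	/* ->type == IOMMU_DOMAIN_NESTED */
	struct iommu_domain *s2_parent;	/* the UNMANAGED stage-2 domain */
};

/* Hypothetical user invalidation record; a real driver defines its own. */
struct mock_cache_invalidate {
	__u64 addr;
	__u64 size;
};

static int mock_cache_invalidate_user(struct iommu_domain *domain,
				      void *user_data)
{
	struct mock_cache_invalidate *inv = user_data;

	/*
	 * A real driver would flush IOTLB and device-TLB (ATS) entries
	 * covering [inv->addr, inv->addr + inv->size) here.
	 */
	pr_debug("invalidate [%llx, %llx)\n", inv->addr, inv->addr + inv->size);
	return 0;
}

static void mock_nested_domain_free(struct iommu_domain *domain)
{
	kfree(container_of(domain, struct mock_nested_domain, domain));
}

static const struct iommu_domain_ops mock_nested_ops = {
	/* No .map/.unmap: the page table itself is owned by user space. */
	.cache_invalidate_user		= mock_cache_invalidate_user,
	.cache_invalidate_user_data_len	= sizeof(struct mock_cache_invalidate),
	.free				= mock_nested_domain_free,
};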
Comments
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, July 24, 2023 7:04 PM
>
> From: Lu Baolu <baolu.lu@linux.intel.com>
>
> Introduce a new domain type for a user I/O page table, which is nested
> on top of another user space address space represented by an UNMANAGED
> domain. The mappings of a nested domain are managed by user space
> software, so map/unmap callbacks are unnecessary. However, updates to
> the PTEs in the nested domain's page table must be propagated to the
> caches on both the IOMMU (IOTLB) and the devices (DevTLB).
>
> The nested domain is allocated by the domain_alloc_user op and attached
> to the device through the existing iommu_attach_device/group()
> interfaces.
>
> A new domain op, named cache_invalidate_user, is added for user space
> to flush the hardware caches for a nested domain through iommufd. There
> is no wrapper for it, as it is only supposed to be used by iommufd.
>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
On Mon, Jul 24, 2023 at 04:03:51AM -0700, Yi Liu wrote:

> @@ -350,6 +354,10 @@ struct iommu_ops {
>   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
>   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>   *              queue
> + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> + * @cache_invalidate_user_data_len: Defined length of input user data for the
> + *                                  cache_invalidate_user op, being sizeof the
> + *                                  structure in include/uapi/linux/iommufd.h
>   * @iova_to_phys: translate iova to physical address
>   * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
>   *                           including no-snoop TLPs on PCIe or other platform
> @@ -379,6 +387,9 @@ struct iommu_domain_ops {
>  			      size_t size);
>  	void (*iotlb_sync)(struct iommu_domain *domain,
>  			   struct iommu_iotlb_gather *iotlb_gather);
> +	int (*cache_invalidate_user)(struct iommu_domain *domain,
> +				     void *user_data);

If we are doing const unions, then this void * should also be a const
union.

Jason
On Fri, Jul 28, 2023 at 01:59:28PM -0300, Jason Gunthorpe wrote:
> On Mon, Jul 24, 2023 at 04:03:51AM -0700, Yi Liu wrote:
>
> > @@ -350,6 +354,10 @@ struct iommu_ops {
> >   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> >   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> >   *              queue
> > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > + *                                  cache_invalidate_user op, being sizeof the
> > + *                                  structure in include/uapi/linux/iommufd.h
> >   * @iova_to_phys: translate iova to physical address
> >   * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> >   *                           including no-snoop TLPs on PCIe or other platform
> > @@ -379,6 +387,9 @@ struct iommu_domain_ops {
> >  			      size_t size);
> >  	void (*iotlb_sync)(struct iommu_domain *domain,
> >  			   struct iommu_iotlb_gather *iotlb_gather);
> > +	int (*cache_invalidate_user)(struct iommu_domain *domain,
> > +				     void *user_data);
>
> If we are doing const unions, then this void * should also be a const
> union.

Unlike iommu_domain_user_data, which is a union on its own, all of the
invalidation user data structures are added to union ucmd_buffer. It feels
a bit weird to cross-reference "union ucmd_buffer", and to pass the name
"ucmd_buffer" into this cache_invalidate_user.

Any suggestion?

Thanks
Nic
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Thursday, August 3, 2023 10:37 AM
>
> On Fri, Jul 28, 2023 at 01:59:28PM -0300, Jason Gunthorpe wrote:
> > On Mon, Jul 24, 2023 at 04:03:51AM -0700, Yi Liu wrote:
> >
> > > @@ -350,6 +354,10 @@ struct iommu_ops {
> > >   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> > >   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> > >   *              queue
> > > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > > + *                                  cache_invalidate_user op, being sizeof the
> > > + *                                  structure in include/uapi/linux/iommufd.h
> > >   * @iova_to_phys: translate iova to physical address
> > >   * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> > >   *                           including no-snoop TLPs on PCIe or other platform
> > > @@ -379,6 +387,9 @@ struct iommu_domain_ops {
> > >  			      size_t size);
> > >  	void (*iotlb_sync)(struct iommu_domain *domain,
> > >  			   struct iommu_iotlb_gather *iotlb_gather);
> > > +	int (*cache_invalidate_user)(struct iommu_domain *domain,
> > > +				     void *user_data);
> >
> > If we are doing const unions, then this void * should also be a const
> > union.
>
> Unlike iommu_domain_user_data, which is a union on its own, all of the
> invalidation user data structures are added to union ucmd_buffer. It
> feels a bit weird to cross-reference "union ucmd_buffer", and to pass
> the name "ucmd_buffer" into this cache_invalidate_user.
>
> Any suggestion?

I think we can have a union like iommu_user_cache_invalidate: every new
data structure should be put in this union, and this union is put in the
ucmd_buffer.

Regards,
Yi Liu
On Thu, Aug 03, 2023 at 02:53:34AM +0000, Liu, Yi L wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Thursday, August 3, 2023 10:37 AM
> >
> > On Fri, Jul 28, 2023 at 01:59:28PM -0300, Jason Gunthorpe wrote:
> > > On Mon, Jul 24, 2023 at 04:03:51AM -0700, Yi Liu wrote:
> > >
> > > > @@ -379,6 +387,9 @@ struct iommu_domain_ops {
> > > >  			      size_t size);
> > > >  	void (*iotlb_sync)(struct iommu_domain *domain,
> > > >  			   struct iommu_iotlb_gather *iotlb_gather);
> > > > +	int (*cache_invalidate_user)(struct iommu_domain *domain,
> > > > +				     void *user_data);
> > >
> > > If we are doing const unions, then this void * should also be a
> > > const union.
> >
> > Unlike iommu_domain_user_data, which is a union on its own, all of
> > the invalidation user data structures are added to union ucmd_buffer.
> > It feels a bit weird to cross-reference "union ucmd_buffer", and to
> > pass the name "ucmd_buffer" into this cache_invalidate_user.
> >
> > Any suggestion?
>
> I think we can have a union like iommu_user_cache_invalidate: every new
> data structure should be put in this union, and this union is put in
> the ucmd_buffer.

Ah, that should do the job. Thanks!
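[Editor's sketch] The arrangement Yi suggests above could look like the
following. The two member structures are invented stand-ins; the real
per-driver invalidation structures would be the ones defined in
include/uapi/linux/iommufd.h. Collecting them in one named union would
also let the op take the const, typed pointer Jason asked for instead of
void *.

/* Invented stand-ins for the driver-specific invalidation records. */
struct mock_vtd_s1_invalidate {
	__u64 addr;
	__u64 npages;
	__u32 flags;
};

struct mock_smmu_invalidate {
	__u64 cmd[2];
};

/* One union collecting every driver's invalidation structure... */
union iommu_user_cache_invalidate {
	struct mock_vtd_s1_invalidate vtd;
	struct mock_smmu_invalidate smmu;
};

/*
 * ...which iommufd's ucmd_buffer can embed, so the op prototype could
 * become typed and const instead of taking void *:
 *
 *	int (*cache_invalidate_user)(struct iommu_domain *domain,
 *			const union iommu_user_cache_invalidate *user_data);
 */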
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ecbec2627b63..b8f09330b64e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -66,6 +66,9 @@ struct iommu_domain_geometry {
 
 #define __IOMMU_DOMAIN_SVA	(1U << 4)  /* Shared process address space */
 
+#define __IOMMU_DOMAIN_NESTED	(1U << 5)  /* User-managed address space nested
+					      on a stage-2 translation */
+
 #define IOMMU_DOMAIN_ALLOC_FLAGS ~__IOMMU_DOMAIN_DMA_FQ
 /*
  * This are the possible domain-types
@@ -92,6 +95,7 @@ struct iommu_domain_geometry {
 			 __IOMMU_DOMAIN_DMA_API |	\
 			 __IOMMU_DOMAIN_DMA_FQ)
 #define IOMMU_DOMAIN_SVA	(__IOMMU_DOMAIN_SVA)
+#define IOMMU_DOMAIN_NESTED	(__IOMMU_DOMAIN_NESTED)
 
 struct iommu_domain {
 	unsigned type;
@@ -350,6 +354,10 @@ struct iommu_ops {
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *              queue
+ * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
+ * @cache_invalidate_user_data_len: Defined length of input user data for the
+ *                                  cache_invalidate_user op, being sizeof the
+ *                                  structure in include/uapi/linux/iommufd.h
  * @iova_to_phys: translate iova to physical address
  * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
  *                           including no-snoop TLPs on PCIe or other platform
@@ -379,6 +387,9 @@ struct iommu_domain_ops {
 			      size_t size);
 	void (*iotlb_sync)(struct iommu_domain *domain,
 			   struct iommu_iotlb_gather *iotlb_gather);
+	int (*cache_invalidate_user)(struct iommu_domain *domain,
+				     void *user_data);
+	const size_t cache_invalidate_user_data_len;
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
 				    dma_addr_t iova);
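[Editor's sketch] As a usage note, here is one way the iommufd core might
consume the two new fields when servicing an invalidation request: copy
exactly the number of bytes the driver declared, then dispatch to the op.
The function name and local buffer are illustrative, not the actual
iommufd code.

#include <linux/iommu.h>
#include <linux/uaccess.h>

static int mock_hwpt_invalidate(struct iommu_domain *domain,
				void __user *uptr, size_t user_len)
{
	const struct iommu_domain_ops *ops = domain->ops;
	size_t len = ops->cache_invalidate_user_data_len;
	u8 buf[128];	/* stand-in for iommufd's ucmd_buffer */

	if (!ops->cache_invalidate_user || !len || len > sizeof(buf))
		return -EOPNOTSUPP;
	if (user_len < len)
		return -EINVAL;
	/* Copy exactly the length the driver declared, then dispatch. */
	if (copy_from_user(buf, uptr, len))
		return -EFAULT;
	return ops->cache_invalidate_user(domain, buf);
}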