Message ID | 20230511143844.22693-3-yi.l.liu@intel.com |
---|---|
State | New |
Series | iommufd: Add nesting infrastructure |
Commit Message
Yi Liu
May 11, 2023, 2:38 p.m. UTC
From: Lu Baolu <baolu.lu@linux.intel.com>

Introduce a new domain type for a user I/O page table, which is nested
on top of another user space address space represented by an UNMANAGED
domain. The mappings of a nested domain are managed by user space
software, so map/unmap callbacks are unnecessary. However, updates to
the PTEs in the nested domain's page table must be propagated to the
caches on both the IOMMU (IOTLB) and the devices (DevTLB).

The nested domain is allocated by the domain_alloc_user op and attached
to a device through the existing iommu_attach_device/group() interfaces.

A new domain op, named cache_invalidate_user, is added for userspace to
flush the hardware caches for a nested domain through iommufd. No
wrapper is provided for it, as it is only supposed to be used by
iommufd.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h | 11 +++++++++++
 1 file changed, 11 insertions(+)
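For orientation, here is a minimal sketch of how a vendor IOMMU driver might wire up the new op for a nested domain. The structure `example_hwpt_invalidate` and every `example_`-prefixed name are invented for illustration; the real per-vendor invalidation layout is the one the kerneldoc says lives in include/uapi/linux/iommufd.h.

```c
#include <linux/errno.h>
#include <linux/iommu.h>
#include <linux/types.h>

/*
 * Invented invalidation request layout, standing in for the vendor-specific
 * structure expected to be defined in include/uapi/linux/iommufd.h.
 */
struct example_hwpt_invalidate {
	__u64 addr;	/* stage-1 IOVA where the flush starts */
	__u64 npages;	/* number of pages to invalidate */
};

static int example_cache_invalidate_user(struct iommu_domain *domain,
					 void *user_data)
{
	struct example_hwpt_invalidate *inv = user_data;

	if (!inv->npages)
		return -EINVAL;

	/*
	 * A real driver would queue IOTLB and device-TLB (ATS) invalidation
	 * descriptors covering the requested stage-1 range here.
	 */
	return 0;
}

static const struct iommu_domain_ops example_nested_domain_ops = {
	/* No .map/.unmap: user space owns the stage-1 page table. */
	.cache_invalidate_user		= example_cache_invalidate_user,
	.cache_invalidate_user_data_len	=
		sizeof(struct example_hwpt_invalidate),
	/* .attach_dev, .free, etc. omitted for brevity. */
};
```

The absence of map/unmap callbacks mirrors the commit message: user space owns the stage-1 page table, so the driver's only job for this op is to turn a user-issued invalidation request into IOTLB/DevTLB flushes.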
Comments
> From: Yi Liu <yi.l.liu@intel.com>
> Sent: Thursday, May 11, 2023 10:39 PM
>
> @@ -66,6 +66,9 @@ struct iommu_domain_geometry {
>
>  #define __IOMMU_DOMAIN_SVA	(1U << 4)  /* Shared process address space */
>
> +#define __IOMMU_DOMAIN_NESTED	(1U << 5)  /* User-managed IOVA nested on
> +					      a stage-2 translation */

s/IOVA/address space/

> @@ -346,6 +350,10 @@ struct iommu_ops {
>  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
>  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>  *              queue
> + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> + * @cache_invalidate_user_data_len: Defined length of input user data for the
> + *                                  cache_invalidate_user op, being sizeof the
> + *                                  structure in include/uapi/linux/iommufd.h

same as comment to last patch, can this be merged with @hw_info?
On Fri, May 19, 2023 at 08:51:21AM +0000, Tian, Kevin wrote:
> > @@ -346,6 +350,10 @@ struct iommu_ops {
> >  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> >  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> >  *              queue
> > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > + *                                  cache_invalidate_user op, being sizeof the
> > + *                                  structure in include/uapi/linux/iommufd.h
>
> same as comment to last patch, can this be merged with @hw_info?

I think it's better to keep them separate, since this is added
in struct iommu_domain_ops, given it is domain/hwpt specific,
while the hw_info is in struct iommu_ops?

Thanks
Nic
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, May 20, 2023 2:49 AM
>
> On Fri, May 19, 2023 at 08:51:21AM +0000, Tian, Kevin wrote:
> > > @@ -346,6 +350,10 @@ struct iommu_ops {
> > >  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> > >  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> > >  *              queue
> > > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > > + *                                  cache_invalidate_user op, being sizeof the
> > > + *                                  structure in include/uapi/linux/iommufd.h
> >
> > same as comment to last patch, can this be merged with @hw_info?
>
> I think it's better to keep them separate, since this is added
> in struct iommu_domain_ops, given it is domain/hwpt specific,
> while the hw_info is in struct iommu_ops?
>

Just be curious whether there are real examples in which the data
len might be different upon the hwpt type...
On Wed, May 24, 2023 at 05:03:37AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Saturday, May 20, 2023 2:49 AM
> >
> > On Fri, May 19, 2023 at 08:51:21AM +0000, Tian, Kevin wrote:
> > > > @@ -346,6 +350,10 @@ struct iommu_ops {
> > > >  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> > > >  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> > > >  *              queue
> > > > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > > > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > > > + *                                  cache_invalidate_user op, being sizeof the
> > > > + *                                  structure in include/uapi/linux/iommufd.h
> > >
> > > same as comment to last patch, can this be merged with @hw_info?
> >
> > I think it's better to keep them separate, since this is added
> > in struct iommu_domain_ops, given it is domain/hwpt specific,
> > while the hw_info is in struct iommu_ops?
>
> Just be curious whether there are real examples in which the data
> len might be different upon the hwpt type...

Likely "no" on top of my head. Yet it's a bit odd to see the data
length for cache_invalidate_user being added along with the hw_info,
since they belong to two different structs, and even two different
series?

Thanks
Nic
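For context on why the length constant sits next to the op in struct iommu_domain_ops, here is a rough sketch of how the iommufd core could use cache_invalidate_user_data_len to size and validate the user buffer before calling into the driver. The function name and calling convention are assumptions for illustration, not the code that was merged.

```c
#include <linux/errno.h>
#include <linux/iommu.h>
#include <linux/slab.h>
#include <linux/uaccess.h>

/* Illustrative-only helper; not the actual iommufd implementation. */
static int example_hwpt_cache_invalidate(struct iommu_domain *domain,
					 const void __user *uptr,
					 size_t user_len)
{
	const struct iommu_domain_ops *ops = domain->ops;
	void *data;
	int rc;

	if (!ops->cache_invalidate_user || !ops->cache_invalidate_user_data_len)
		return -EOPNOTSUPP;
	/* The domain ops declare exactly how many bytes the op expects. */
	if (user_len < ops->cache_invalidate_user_data_len)
		return -EINVAL;

	data = kzalloc(ops->cache_invalidate_user_data_len, GFP_KERNEL);
	if (!data)
		return -ENOMEM;

	if (copy_from_user(data, uptr, ops->cache_invalidate_user_data_len))
		rc = -EFAULT;
	else
		rc = ops->cache_invalidate_user(domain, data);

	kfree(data);
	return rc;
}
```

Because the length is a per-domain-ops constant rather than a global per-device value, the core never hard-codes a vendor structure size, which is the separation Nicolin argues for above.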
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7f2046fa53a3..c2d0fa3e2e18 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -66,6 +66,9 @@ struct iommu_domain_geometry {
 
 #define __IOMMU_DOMAIN_SVA	(1U << 4)  /* Shared process address space */
 
+#define __IOMMU_DOMAIN_NESTED	(1U << 5)  /* User-managed IOVA nested on
+					      a stage-2 translation */
+
 /*
  * This are the possible domain-types
  *
@@ -91,6 +94,7 @@ struct iommu_domain_geometry {
 				 __IOMMU_DOMAIN_DMA_API |	\
 				 __IOMMU_DOMAIN_DMA_FQ)
 #define IOMMU_DOMAIN_SVA	(__IOMMU_DOMAIN_SVA)
+#define IOMMU_DOMAIN_NESTED	(__IOMMU_DOMAIN_NESTED)
 
 struct iommu_domain {
 	unsigned type;
@@ -346,6 +350,10 @@ struct iommu_ops {
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *              queue
+ * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
+ * @cache_invalidate_user_data_len: Defined length of input user data for the
+ *                                  cache_invalidate_user op, being sizeof the
+ *                                  structure in include/uapi/linux/iommufd.h
  * @iova_to_phys: translate iova to physical address
  * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
  *                           including no-snoop TLPs on PCIe or other platform
@@ -375,6 +383,9 @@ struct iommu_domain_ops {
 			      size_t size);
 	void (*iotlb_sync)(struct iommu_domain *domain,
 			   struct iommu_iotlb_gather *iotlb_gather);
+	int (*cache_invalidate_user)(struct iommu_domain *domain,
+				     void *user_data);
+	const size_t cache_invalidate_user_data_len;
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
 				    dma_addr_t iova);