Message ID | a98e622f41d76b64f5a7d0c758d8bda5e8043013.1675320212.git.nicolinc@nvidia.com |
---|---|
State | New |
Headers |
From: Nicolin Chen <nicolinc@nvidia.com>
To: <jgg@nvidia.com>, <kevin.tian@intel.com>, <joro@8bytes.org>, <will@kernel.org>, <robin.murphy@arm.com>, <alex.williamson@redhat.com>, <shuah@kernel.org>
Cc: <yi.l.liu@intel.com>, <linux-kernel@vger.kernel.org>, <iommu@lists.linux.dev>, <kvm@vger.kernel.org>, <linux-kselftest@vger.kernel.org>, <baolu.lu@linux.intel.com>
Subject: [PATCH v1 2/8] iommu: Introduce a new iommu_group_replace_domain() API
Date: Wed, 1 Feb 2023 23:05:06 -0800
Message-ID: <a98e622f41d76b64f5a7d0c758d8bda5e8043013.1675320212.git.nicolinc@nvidia.com>
In-Reply-To: <cover.1675320212.git.nicolinc@nvidia.com>
References: <cover.1675320212.git.nicolinc@nvidia.com> |
Series | Add IO page table replacement support |
Commit Message
Nicolin Chen
Feb. 2, 2023, 7:05 a.m. UTC
QEMU has a need to replace the translations associated with a domain
when the guest does large-scale operations like switching between an
IDENTITY domain and, say, dma-iommu.c.

Currently, it does this by replacing all the mappings in a single
domain, but this is very inefficient and means that domains have to be
per-device rather than per-translation.

Provide a high-level API to allow replacement of one domain with
another. This is similar to a detach/attach cycle except it doesn't
force the group to go to the blocking domain in-between.

By removing this forced blocking domain the iommu driver has the
opportunity to implement an atomic replacement of the domains to the
greatest extent its hardware allows.

Atomic replacement allows the QEMU emulation of the vIOMMU to be more
complete, as real hardware has this ability.

All drivers are already required to support changing between active
UNMANAGED domains when using their attach_dev ops.

This API is expected to be used by IOMMUFD, so add it to the iommu-priv
header and mark it as IOMMUFD_INTERNAL.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommu-priv.h | 4 ++++
drivers/iommu/iommu.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 34 insertions(+)
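For illustration only, here is a minimal caller-side sketch of the intended usage. It is not part of the patch; iommu_group_replace_domain() comes from this series, while the wrapper function, old_domain, and the detach/attach comparison are hypothetical:

/*
 * Hypothetical caller sketch: switch a group's translation without
 * passing through the blocking domain.
 */
static int example_switch_translation(struct iommu_group *group,
                                      struct iommu_domain *new_domain)
{
        /*
         * The pre-existing alternative is a detach/attach cycle, which
         * forces the group through the blocking domain in-between:
         *
         *      iommu_detach_group(old_domain, group);
         *      return iommu_attach_group(new_domain, group);
         *
         * The proposed helper does the switch in one step, so the driver
         * may implement it atomically if the hardware allows:
         */
        return iommu_group_replace_domain(group, new_domain);
}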
Comments
On 2023/2/2 15:05, Nicolin Chen wrote:
> +/**
> + * iommu_group_replace_domain - replace the domain that a group is attached to
> + * @new_domain: new IOMMU domain to replace with
> + * @group: IOMMU group that will be attached to the new domain
> + *
> + * This API allows the group to switch domains without being forced to go to
> + * the blocking domain in-between.
> + *
> + * If the attached domain is a core domain (e.g. a default_domain), it will act
> + * just like the iommu_attach_group().

I am not following above two lines. Why and how could iommufd set a
core domain to an iommu_group?

> + */
> +int iommu_group_replace_domain(struct iommu_group *group,
> +                               struct iommu_domain *new_domain)
> +{
> +        int ret;
> +
> +        if (!new_domain)
> +                return -EINVAL;
> +
> +        mutex_lock(&group->mutex);
> +        ret = __iommu_group_set_domain(group, new_domain);
> +        if (ret) {
> +                if (__iommu_group_set_domain(group, group->domain))
> +                        __iommu_group_set_core_domain(group);
> +        }
> +        mutex_unlock(&group->mutex);
> +        return ret;
> +}
> +EXPORT_SYMBOL_NS_GPL(iommu_group_replace_domain, IOMMUFD_INTERNAL);

Best regards,
baolu
On Thu, Feb 02, 2023 at 06:21:20PM +0800, Baolu Lu wrote:
> External email: Use caution opening links or attachments
>
> On 2023/2/2 15:05, Nicolin Chen wrote:
> > +/**
> > + * iommu_group_replace_domain - replace the domain that a group is attached to
> > + * @new_domain: new IOMMU domain to replace with
> > + * @group: IOMMU group that will be attached to the new domain
> > + *
> > + * This API allows the group to switch domains without being forced to go to
> > + * the blocking domain in-between.
> > + *
> > + * If the attached domain is a core domain (e.g. a default_domain), it will act
> > + * just like the iommu_attach_group().
>
> I am not following above two lines. Why and how could iommufd set a
> core domain to an iommu_group?

Perhaps this isn't the best narrative. What it's supposed to say
is that this function acts as an iommu_attach_group() call if the
device is "detached", yet we have changed the semantics about the
word "detach". So, what should the correct way to write such a
note?

Thanks
Nic
On 2023/2/3 3:14, Nicolin Chen wrote:
> On Thu, Feb 02, 2023 at 06:21:20PM +0800, Baolu Lu wrote:
>> External email: Use caution opening links or attachments
>>
>> On 2023/2/2 15:05, Nicolin Chen wrote:
>>> +/**
>>> + * iommu_group_replace_domain - replace the domain that a group is attached to
>>> + * @new_domain: new IOMMU domain to replace with
>>> + * @group: IOMMU group that will be attached to the new domain
>>> + *
>>> + * This API allows the group to switch domains without being forced to go to
>>> + * the blocking domain in-between.
>>> + *
>>> + * If the attached domain is a core domain (e.g. a default_domain), it will act
>>> + * just like the iommu_attach_group().
>> I am not following above two lines. Why and how could iommufd set a
>> core domain to an iommu_group?
> Perhaps this isn't the best narrative. What it's supposed to say
> is that this function acts as an iommu_attach_group() call if the
> device is "detached", yet we have changed the semantics about the
> word "detach". So, what should the correct way to write such a
> note?

How could this interface be used as detaching a domain from a group?
Even it could be used, doesn't it act as an iommu_detach_group()?

Best regards,
baolu
On Fri, Feb 03, 2023 at 09:33:44AM +0800, Baolu Lu wrote:
> External email: Use caution opening links or attachments
>
> On 2023/2/3 3:14, Nicolin Chen wrote:
> > On Thu, Feb 02, 2023 at 06:21:20PM +0800, Baolu Lu wrote:
> > > External email: Use caution opening links or attachments
> > >
> > > On 2023/2/2 15:05, Nicolin Chen wrote:
> > > > +/**
> > > > + * iommu_group_replace_domain - replace the domain that a group is attached to
> > > > + * @new_domain: new IOMMU domain to replace with
> > > > + * @group: IOMMU group that will be attached to the new domain
> > > > + *
> > > > + * This API allows the group to switch domains without being forced to go to
> > > > + * the blocking domain in-between.
> > > > + *
> > > > + * If the attached domain is a core domain (e.g. a default_domain), it will act
> > > > + * just like the iommu_attach_group().
> > > I am not following above two lines. Why and how could iommufd set a
> > > core domain to an iommu_group?
> > Perhaps this isn't the best narrative. What it's supposed to say
> > is that this function acts as an iommu_attach_group() call if the
> > device is "detached", yet we have changed the semantics about the
> > word "detach". So, what should the correct way to write such a
> > note?
>
> How could this interface be used as detaching a domain from a group?
> Even it could be used, doesn't it act as an iommu_detach_group()?

No. I didn't say that. It doesn't act as detach(), but attach()
when a device is already "detached".

The original statement is saying, "if the attached domain is a
core domain", i.e. the device is detach()-ed, "it will act just
like the iommu_attach_group()".

Thanks
Nic
On 2023/2/3 9:41, Nicolin Chen wrote:
> On Fri, Feb 03, 2023 at 09:33:44AM +0800, Baolu Lu wrote:
>> External email: Use caution opening links or attachments
>>
>> On 2023/2/3 3:14, Nicolin Chen wrote:
>>> On Thu, Feb 02, 2023 at 06:21:20PM +0800, Baolu Lu wrote:
>>>> External email: Use caution opening links or attachments
>>>>
>>>> On 2023/2/2 15:05, Nicolin Chen wrote:
>>>>> +/**
>>>>> + * iommu_group_replace_domain - replace the domain that a group is attached to
>>>>> + * @new_domain: new IOMMU domain to replace with
>>>>> + * @group: IOMMU group that will be attached to the new domain
>>>>> + *
>>>>> + * This API allows the group to switch domains without being forced to go to
>>>>> + * the blocking domain in-between.
>>>>> + *
>>>>> + * If the attached domain is a core domain (e.g. a default_domain), it will act
>>>>> + * just like the iommu_attach_group().
>>>> I am not following above two lines. Why and how could iommufd set a
>>>> core domain to an iommu_group?
>>> Perhaps this isn't the best narrative. What it's supposed to say
>>> is that this function acts as an iommu_attach_group() call if the
>>> device is "detached", yet we have changed the semantics about the
>>> word "detach". So, what should the correct way to write such a
>>> note?
>> How could this interface be used as detaching a domain from a group?
>> Even it could be used, doesn't it act as an iommu_detach_group()?
> No. I didn't say that. It doesn't act as detach(), but attach()
> when a device is already "detached".
>
> The original statement is saying, "if the attached domain is a
> core domain", i.e. the device is detach()-ed, "it will act just
> like the iommu_attach_group()".

Oh! My bad. I misunderstood it. Sorry for the noise. :-)

Best regards,
baolu
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Thursday, February 2, 2023 3:05 PM
>
> All drivers are already required to support changing between active
> UNMANAGED domains when using their attach_dev ops.

All drivers which don't have *broken* UNMANAGED domain?

> +/**
> + * iommu_group_replace_domain - replace the domain that a group is attached to
> + * @new_domain: new IOMMU domain to replace with
> + * @group: IOMMU group that will be attached to the new domain
> + *
> + * This API allows the group to switch domains without being forced to go to
> + * the blocking domain in-between.
> + *
> + * If the attached domain is a core domain (e.g. a default_domain), it will act
> + * just like the iommu_attach_group().

I think you meant "the currently-attached domain", which implies a
'detached' state as you replied to Baolu.

> + */
> +int iommu_group_replace_domain(struct iommu_group *group,
> +                               struct iommu_domain *new_domain)

what actual value does 'replace' give us? It's just a wrapper of
__iommu_group_set_domain() then calling it set_domain is
probably clearer. You can clarify the 'replace' behavior in the
comment.

> +{
> +        int ret;
> +
> +        if (!new_domain)
> +                return -EINVAL;
> +
> +        mutex_lock(&group->mutex);
> +        ret = __iommu_group_set_domain(group, new_domain);
> +        if (ret) {
> +                if (__iommu_group_set_domain(group, group->domain))
> +                        __iommu_group_set_core_domain(group);
> +        }

Can you elaborate the error handling here? Ideally if
__iommu_group_set_domain() fails then group->domain shouldn't
be changed. Why do we need further housekeeping here?
On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Thursday, February 2, 2023 3:05 PM
> >
> > All drivers are already required to support changing between active
> > UNMANAGED domains when using their attach_dev ops.
>
> All drivers which don't have *broken* UNMANAGED domain?

No, all drivers.. It has always been used by VFIO.

> > + */
> > +int iommu_group_replace_domain(struct iommu_group *group,
> > +                               struct iommu_domain *new_domain)
>
> what actual value does 'replace' give us? It's just a wrapper of
> __iommu_group_set_domain() then calling it set_domain is
> probably clearer. You can clarify the 'replace' behavior in the
> comment.

As the APIs are setup:

 attach demands that no domain is currently set (eg the domain must be blocking)

 replace permits the domain to be currently set

'set' vs 'attach' is really unclear what the intended difference is.

We could also address this by simply removing the protection from
attach, but it is not so clear if that is safe for the few users.

> > +{
> > +        int ret;
> > +
> > +        if (!new_domain)
> > +                return -EINVAL;
> > +
> > +        mutex_lock(&group->mutex);
> > +        ret = __iommu_group_set_domain(group, new_domain);
> > +        if (ret) {
> > +                if (__iommu_group_set_domain(group, group->domain))
> > +                        __iommu_group_set_core_domain(group);
> > +        }
>
> Can you elaborate the error handling here? Ideally if
> __iommu_group_set_domain() fails then group->domain shouldn't
> be changed.

That isn't what it implements though. The internal helper leaves
things in a mess, it is for the caller to fix it, and it depends on
the caller what that means.

In this case the API cannot retain a hidden reference to the new
domain, so it must be purged, one way or another.

Jason
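[A hypothetical snippet of the semantic split described above; domain_a and domain_b stand for two UNMANAGED domains already allocated by the caller, and -EBUSY is only an example of the expected rejection:]

        /* attach: demands that no user domain is currently set on the group */
        ret = iommu_attach_group(domain_a, group);      /* fine, coming from the core/blocking domain */
        ret = iommu_attach_group(domain_b, group);      /* expected to be rejected, e.g. -EBUSY, since domain_a is set */

        /* replace: permitted while domain_a is still attached, no blocking window */
        ret = iommu_group_replace_domain(group, domain_b);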
On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
> > +/**
> > + * iommu_group_replace_domain - replace the domain that a group is attached to
> > + * @new_domain: new IOMMU domain to replace with
> > + * @group: IOMMU group that will be attached to the new domain
> > + *
> > + * This API allows the group to switch domains without being forced to go to
> > + * the blocking domain in-between.
> > + *
> > + * If the attached domain is a core domain (e.g. a default_domain), it will act
> > + * just like the iommu_attach_group().
>
> I think you meant "the currently-attached domain", which implies a
> 'detached' state as you replied to Baolu.

Hmm, I don't see an implication, since we only have either
"the attached domain" or "a new domain" in the context?

With that being said, I can add "currently" in v2:
 * If the currently attached domain is a core domain (e.g. a default_domain),
 * it will act just like the iommu_attach_group().

Thanks
Nic
On Fri, Feb 03, 2023 at 11:03:20AM -0400, Jason Gunthorpe wrote:
> > > + */
> > > +int iommu_group_replace_domain(struct iommu_group *group,
> > > +                               struct iommu_domain *new_domain)
> >
> > what actual value does 'replace' give us? It's just a wrapper of
> > __iommu_group_set_domain() then calling it set_domain is
> > probably clearer. You can clarify the 'replace' behavior in the
> > comment.
>
> As the APIs are setup:
>
>  attach demands that no domain is currently set (eg the domain must be blocking)
>
>  replace permits the domain to be currently set
>
> 'set' vs 'attach' is really unclear what the intended difference is.
>
> We could also address this by simply removing the protection from
> attach, but it is not so clear if that is safe for the few users.

I can add a couple of lines to the commit message to make things
clear.

> > > +{
> > > +        int ret;
> > > +
> > > +        if (!new_domain)
> > > +                return -EINVAL;
> > > +
> > > +        mutex_lock(&group->mutex);
> > > +        ret = __iommu_group_set_domain(group, new_domain);
> > > +        if (ret) {
> > > +                if (__iommu_group_set_domain(group, group->domain))
> > > +                        __iommu_group_set_core_domain(group);
> > > +        }
> >
> > Can you elaborate the error handling here? Ideally if
> > __iommu_group_set_domain() fails then group->domain shouldn't
> > be changed.
>
> That isn't what it implements though. The internal helper leaves
> things in a mess, it is for the caller to fix it, and it depends on
> the caller what that means.
>
> In this case the API cannot retain a hidden reference to the new
> domain, so it must be purged, one way or another.

Considering it is a bit different than the existing APIs like
iommu_attach_group(), perhaps I should add a line of comments
in the fallback routine.

Thanks
Nic
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, February 4, 2023 1:45 AM
>
> On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
>
> > > +/**
> > > + * iommu_group_replace_domain - replace the domain that a group is attached to
> > > + * @new_domain: new IOMMU domain to replace with
> > > + * @group: IOMMU group that will be attached to the new domain
> > > + *
> > > + * This API allows the group to switch domains without being forced to go to
> > > + * the blocking domain in-between.
> > > + *
> > > + * If the attached domain is a core domain (e.g. a default_domain), it will act
> > > + * just like the iommu_attach_group().
> >
> > I think you meant "the currently-attached domain", which implies a
> > 'detached' state as you replied to Baolu.
>
> Hmm, I don't see an implication, since we only have either
> "the attached domain" or "a new domain" in the context?

probably just me reading it as "the newly attached domain". 😊

>
> With that being said, I can add "currently" in v2:
>  * If the currently attached domain is a core domain (e.g. a default_domain),
>  * it will act just like the iommu_attach_group().
>

this is clearer.
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, February 3, 2023 11:03 PM
>
> On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
> > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > Sent: Thursday, February 2, 2023 3:05 PM
> > >
> > > All drivers are already required to support changing between active
> > > UNMANAGED domains when using their attach_dev ops.
> >
> > All drivers which don't have *broken* UNMANAGED domain?
>
> No, all drivers.. It has always been used by VFIO.

existing iommu_attach_group() doesn't support changing between
two UNMANAGED domains. only from default->unmanaged or
blocking->unmanaged.

Above statement is true only if this series is based on your
unmerged work removing DMA domain types.

> > > +{
> > > +        int ret;
> > > +
> > > +        if (!new_domain)
> > > +                return -EINVAL;
> > > +
> > > +        mutex_lock(&group->mutex);
> > > +        ret = __iommu_group_set_domain(group, new_domain);
> > > +        if (ret) {
> > > +                if (__iommu_group_set_domain(group, group->domain))
> > > +                        __iommu_group_set_core_domain(group);
> > > +        }
> >
> > Can you elaborate the error handling here? Ideally if
> > __iommu_group_set_domain() fails then group->domain shouldn't
> > be changed.
>
> That isn't what it implements though. The internal helper leaves
> things in a mess, it is for the caller to fix it, and it depends on
> the caller what that means.

I didn't see any warning of the mess and the caller's responsibility
in __iommu_group_set_domain(). Can it be documented clearly
so if someone wants to add a new caller on it he can clearly know
what to do?

and why doesn't iommu_attach_group() need to do anything
when an attach attempt fails? In the end it calls the same
iommu_group_do_attach_device() as __iommu_group_set_domain()
does.

btw looking at the code __iommu_group_set_domain():

         * Note that this is called in error unwind paths, attaching to a
         * domain that has already been attached cannot fail.
         */
        ret = __iommu_group_for_each_dev(group, new_domain,
                                         iommu_group_do_attach_device);

with that we don't need fall back to core domain in above error
unwinding per this comment.

> In this case the API cannot retain a hidden reference to the new
> domain, so it must be purged, one way or another.

Could you elaborate where the hidden reference is retained?
On Mon, Feb 06, 2023 at 06:57:35AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Friday, February 3, 2023 11:03 PM
> >
> > On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
> > > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > > Sent: Thursday, February 2, 2023 3:05 PM
> > > >
> > > > All drivers are already required to support changing between active
> > > > UNMANAGED domains when using their attach_dev ops.
> > >
> > > All drivers which don't have *broken* UNMANAGED domain?
> >
> > No, all drivers.. It has always been used by VFIO.
>
> existing iommu_attach_group() doesn't support changing between
> two UNMANAGED domains. only from default->unmanaged or
> blocking->unmanaged.

Yes, but before we added the blocking domains VFIO was changing
between unmanaged domains. Blocking domains are so new that no driver
could have suddenly started to depend on this.

> > > Can you elaborate the error handling here? Ideally if
> > > __iommu_group_set_domain() fails then group->domain shouldn't
> > > be changed.
> >
> > That isn't what it implements though. The internal helper leaves
> > things in a mess, it is for the caller to fix it, and it depends on
> > the caller what that means.
>
> I didn't see any warning of the mess and the caller's responsibility
> in __iommu_group_set_domain(). Can it be documented clearly
> so if someone wants to add a new caller on it he can clearly know
> what to do?

That would be nice..

> and why doesn't iommu_attach_group() need to do anything
> when an attach attempt fails? In the end it calls the same
> iommu_group_do_attach_device() as __iommu_group_set_domain()
> does.

That's a bug for sure.

> btw looking at the code __iommu_group_set_domain():
>
>          * Note that this is called in error unwind paths, attaching to a
>          * domain that has already been attached cannot fail.
>          */
>         ret = __iommu_group_for_each_dev(group, new_domain,
>                                          iommu_group_do_attach_device);
>
> with that we don't need fall back to core domain in above error
> unwinding per this comment.

That does make some sense.

I tried to make a patch to consolidate all this error handling once,
that would be the better way to approach this.

> > In this case the API cannot retain a hidden reference to the new
> > domain, so it must be purged, one way or another.
>
> Could you elaborate where the hidden reference is retained?

Inside the driver, it can keep track of the domain pointer if
attach_dev succeeds

Jason
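[A rough sketch of that "hidden reference", for context only: the foo_* names are hypothetical, while dev_iommu_priv_get() is the usual way drivers reach their per-device data.]

struct foo_master {
        struct iommu_domain *domain;    /* cached by the driver on attach */
};

/* Hypothetical driver attach_dev op */
static int foo_attach_dev(struct iommu_domain *domain, struct device *dev)
{
        struct foo_master *master = dev_iommu_priv_get(dev);

        /* ... program the hardware for the new translation ... */

        /* the driver now remembers the domain it attached to */
        master->domain = domain;
        return 0;
}

[If a multi-device replace fails part-way, some devices' drivers may already hold this cached pointer, which is why the core code cannot simply pretend the new domain was never seen.]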
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Monday, February 6, 2023 9:25 PM
>
> On Mon, Feb 06, 2023 at 06:57:35AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Friday, February 3, 2023 11:03 PM
> > >
> > > On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
> > > > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > > > Sent: Thursday, February 2, 2023 3:05 PM
> > > > >
> > > > > All drivers are already required to support changing between active
> > > > > UNMANAGED domains when using their attach_dev ops.
> > > >
> > > > All drivers which don't have *broken* UNMANAGED domain?
> > >
> > > No, all drivers.. It has always been used by VFIO.
> >
> > existing iommu_attach_group() doesn't support changing between
> > two UNMANAGED domains. only from default->unmanaged or
> > blocking->unmanaged.
>
> Yes, but before we added the blocking domains VFIO was changing
> between unmanaged domains. Blocking domains are so new that no driver
> could have suddenly started to depend on this.

In legacy VFIO unmanaged domain was 1:1 associated with vfio
container. I didn't say how a group can switch between two
containers w/o going through transition to/from the default
domain, i.e. detach from 1st container and then attach to the 2nd.

> > btw looking at the code __iommu_group_set_domain():
> >
> >          * Note that this is called in error unwind paths, attaching to a
> >          * domain that has already been attached cannot fail.
> >          */
> >         ret = __iommu_group_for_each_dev(group, new_domain,
> >                                          iommu_group_do_attach_device);
> >
> > with that we don't need fall back to core domain in above error
> > unwinding per this comment.
>
> That does make some sense.
>
> I tried to make a patch to consolidate all this error handling once,
> that would be the better way to approach this.

that would be good.

> > > In this case the API cannot retain a hidden reference to the new
> > > domain, so it must be purged, one way or another.
> >
> > Could you elaborate where the hidden reference is retained?
>
> Inside the driver, it can keep track of the domain pointer if
> attach_dev succeeds
>

Are you referring to no error unwinding in __iommu_group_for_each_dev()
so if it is failed some devices may have attach_dev succeeds then simply
recovering group->domain in __iommu_attach_group() is insufficient?
On Tue, Feb 07, 2023 at 12:32:50AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Monday, February 6, 2023 9:25 PM
> >
> > On Mon, Feb 06, 2023 at 06:57:35AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Friday, February 3, 2023 11:03 PM
> > > >
> > > > On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
> > > > > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > > > > Sent: Thursday, February 2, 2023 3:05 PM
> > > > > >
> > > > > > All drivers are already required to support changing between active
> > > > > > UNMANAGED domains when using their attach_dev ops.
> > > > >
> > > > > All drivers which don't have *broken* UNMANAGED domain?
> > > >
> > > > No, all drivers.. It has always been used by VFIO.
> > >
> > > existing iommu_attach_group() doesn't support changing between
> > > two UNMANAGED domains. only from default->unmanaged or
> > > blocking->unmanaged.
> >
> > Yes, but before we added the blocking domains VFIO was changing
> > between unmanaged domains. Blocking domains are so new that no driver
> > could have suddenly started to depend on this.
>
> In legacy VFIO unmanaged domain was 1:1 associated with vfio
> container. I didn't say how a group can switch between two
> containers w/o going through transition to/from the default
> domain, i.e. detach from 1st container and then attach to the 2nd.

Yes, in the past we went through the default domain which is basically
another unmanaged domain type. So unmanaged -> unmanaged is OK.

> > Inside the driver, it can keep track of the domain pointer if
> > attach_dev succeeds
>
> Are you referring to no error unwinding in __iommu_group_for_each_dev()
> so if it is failed some devices may have attach_dev succeeds then simply
> recovering group->domain in __iommu_attach_group() is insufficient?

Yes

Jason
On Mon, Feb 06, 2023 at 09:25:17AM -0400, Jason Gunthorpe wrote:
> > > > Can you elaborate the error handling here? Ideally if
> > > > __iommu_group_set_domain() fails then group->domain shouldn't
> > > > be changed.
> > >
> > > That isn't what it implements though. The internal helper leaves
> > > things in a mess, it is for the caller to fix it, and it depends on
> > > the caller what that means.
> >
> > I didn't see any warning of the mess and the caller's responsibility
> > in __iommu_group_set_domain(). Can it be documented clearly
> > so if someone wants to add a new caller on it he can clearly know
> > what to do?
>
> That would be nice..

I'd expect the doc to come with some other patch/series than this
replace series, so I think we should be fine without adding a line
of comments in this patch?

> > btw looking at the code __iommu_group_set_domain():
> >
> >          * Note that this is called in error unwind paths, attaching to a
> >          * domain that has already been attached cannot fail.
> >          */
> >         ret = __iommu_group_for_each_dev(group, new_domain,
> >                                          iommu_group_do_attach_device);
> >
> > with that we don't need fall back to core domain in above error
> > unwinding per this comment.
>
> That does make some sense.
>
> I tried to make a patch to consolidate all this error handling once,
> that would be the better way to approach this.

Then, I'll drop the core-domain line. Combining my reply above:

+        mutex_lock(&group->mutex);
+        ret = __iommu_group_set_domain(group, new_domain);
+        if (ret)
+                __iommu_group_set_domain(group, group->domain);
+        mutex_unlock(&group->mutex);

Will wrap things up and send v2 today.

Thanks
Nic
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, February 7, 2023 8:23 PM
>
> On Tue, Feb 07, 2023 at 12:32:50AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Monday, February 6, 2023 9:25 PM
> > >
> > > On Mon, Feb 06, 2023 at 06:57:35AM +0000, Tian, Kevin wrote:
> > > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > > Sent: Friday, February 3, 2023 11:03 PM
> > > > >
> > > > > On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
> > > > > > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > > > > > Sent: Thursday, February 2, 2023 3:05 PM
> > > > > > >
> > > > > > > All drivers are already required to support changing between active
> > > > > > > UNMANAGED domains when using their attach_dev ops.
> > > > > >
> > > > > > All drivers which don't have *broken* UNMANAGED domain?
> > > > >
> > > > > No, all drivers.. It has always been used by VFIO.
> > > >
> > > > existing iommu_attach_group() doesn't support changing between
> > > > two UNMANAGED domains. only from default->unmanaged or
> > > > blocking->unmanaged.
> > >
> > > Yes, but before we added the blocking domains VFIO was changing
> > > between unmanaged domains. Blocking domains are so new that no driver
> > > could have suddenly started to depend on this.
> >
> > In legacy VFIO unmanaged domain was 1:1 associated with vfio
> > container. I didn't say how a group can switch between two
> > containers w/o going through transition to/from the default
> > domain, i.e. detach from 1st container and then attach to the 2nd.
>
> Yes, in the past we went through the default domain which is basically
> another unmanaged domain type. So unmanaged -> unmanaged is OK.
>

it's right in concept but confusing in current context whether DMA
domain still has its own type. That's why I replied earlier the statement
makes more sense after your patch which cleans up the domain type
is merged.
On Wed, Feb 08, 2023 at 04:25:58AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, February 7, 2023 8:23 PM
> >
> > On Tue, Feb 07, 2023 at 12:32:50AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Monday, February 6, 2023 9:25 PM
> > > >
> > > > On Mon, Feb 06, 2023 at 06:57:35AM +0000, Tian, Kevin wrote:
> > > > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > > > Sent: Friday, February 3, 2023 11:03 PM
> > > > > >
> > > > > > On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote:
> > > > > > > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > > > > > > Sent: Thursday, February 2, 2023 3:05 PM
> > > > > > > >
> > > > > > > > All drivers are already required to support changing between active
> > > > > > > > UNMANAGED domains when using their attach_dev ops.
> > > > > > >
> > > > > > > All drivers which don't have *broken* UNMANAGED domain?
> > > > > >
> > > > > > No, all drivers.. It has always been used by VFIO.
> > > > >
> > > > > existing iommu_attach_group() doesn't support changing between
> > > > > two UNMANAGED domains. only from default->unmanaged or
> > > > > blocking->unmanaged.
> > > >
> > > > Yes, but before we added the blocking domains VFIO was changing
> > > > between unmanaged domains. Blocking domains are so new that no driver
> > > > could have suddenly started to depend on this.
> > >
> > > In legacy VFIO unmanaged domain was 1:1 associated with vfio
> > > container. I didn't say how a group can switch between two
> > > containers w/o going through transition to/from the default
> > > domain, i.e. detach from 1st container and then attach to the 2nd.
> >
> > Yes, in the past we went through the default domain which is basically
> > another unmanaged domain type. So unmanaged -> unmanaged is OK.
> >
>
> it's right in concept but confusing in current context whether DMA
> domain still has its own type. That's why I replied earlier the statement
> makes more sense after your patch which cleans up the domain type
> is merged.

We are just reasoning about why the existing drivers are safe with this
- and my draft patch to remove DMA simply demonstrates that DMA ==
UNMANAGED, and the above reasoning about VFIO in the past demonstrates
that this has historically be expected of drivers.

So I'm not so worried about patch order as long as things do work

Jason
diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h
index 9e1497027cff..b546795a7e49 100644
--- a/drivers/iommu/iommu-priv.h
+++ b/drivers/iommu/iommu-priv.h
@@ -15,4 +15,8 @@ static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
	 */
	return dev->iommu->iommu_dev->ops;
 }
+
+extern int iommu_group_replace_domain(struct iommu_group *group,
+				      struct iommu_domain *new_domain);
+
 #endif /* __LINUX_IOMMU_PRIV_H */
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index a18b7f1a4e6e..c35da7a5c0d4 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2151,6 +2151,36 @@ int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
 }
 EXPORT_SYMBOL_GPL(iommu_attach_group);
 
+/**
+ * iommu_group_replace_domain - replace the domain that a group is attached to
+ * @new_domain: new IOMMU domain to replace with
+ * @group: IOMMU group that will be attached to the new domain
+ *
+ * This API allows the group to switch domains without being forced to go to
+ * the blocking domain in-between.
+ *
+ * If the attached domain is a core domain (e.g. a default_domain), it will act
+ * just like the iommu_attach_group().
+ */
+int iommu_group_replace_domain(struct iommu_group *group,
+			       struct iommu_domain *new_domain)
+{
+	int ret;
+
+	if (!new_domain)
+		return -EINVAL;
+
+	mutex_lock(&group->mutex);
+	ret = __iommu_group_set_domain(group, new_domain);
+	if (ret) {
+		if (__iommu_group_set_domain(group, group->domain))
+			__iommu_group_set_core_domain(group);
+	}
+	mutex_unlock(&group->mutex);
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(iommu_group_replace_domain, IOMMUFD_INTERNAL);
+
 static int iommu_group_do_set_platform_dma(struct device *dev, void *data)
 {
	const struct iommu_ops *ops = dev_iommu_ops(dev);