[v3] iommufd: Only enforce cache coherency in iommufd_hw_pagetable_alloc
Commit Message
According to the conversion in the following link:
https://lore.kernel.org/linux-iommu/20231020135501.GG3952@nvidia.com/
The enforce_cache_coherency should be set/enforced in the hwpt allocation
routine. The iommu driver in its attach_dev() op should decide whether to
reject or not a device that doesn't match with the configuration of cache
coherency. Drop the enforce_cache_coherency piece in the attach/replace()
and move the remaining "num_devices" piece closer to the refcount that is
using it.
Accordingly drop its function prototype in the header and mark it static.
Also add some extra comments to clarify the expected behaviors.
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
Changelog
v3:
* Reworked the first line of the added commets.
v2: https://lore.kernel.org/all/20231023035733.27378-1-nicolinc@nvidia.com/
* Dropped "fixes" tags and merged two patches into one (Jason)
* Added comments to the remaining enforce_cache_coherency call (Jason)
[Please feel free to rephrase, or let me know what to change.]
* Replace "num_devices++" with list_for_each_entry (Baolu)
v1: https://lore.kernel.org/all/cover.1697848510.git.nicolinc@nvidia.com/
drivers/iommu/iommufd/device.c | 20 ++------------------
drivers/iommu/iommufd/hw_pagetable.c | 9 ++++++++-
drivers/iommu/iommufd/iommufd_private.h | 1 -
3 files changed, 10 insertions(+), 20 deletions(-)
base-commit: dc7ce51ff88569b95d8764b0cf76405511f693d5
Comments
On Mon, Oct 23, 2023 at 11:57:00AM -0700, Nicolin Chen wrote:
> According to the conversion in the following link:
> https://lore.kernel.org/linux-iommu/20231020135501.GG3952@nvidia.com/
>
> The enforce_cache_coherency should be set/enforced in the hwpt allocation
> routine. The iommu driver in its attach_dev() op should decide whether to
> reject or not a device that doesn't match with the configuration of cache
> coherency. Drop the enforce_cache_coherency piece in the attach/replace()
> and move the remaining "num_devices" piece closer to the refcount that is
> using it.
>
> Accordingly drop its function prototype in the header and mark it static.
> Also add some extra comments to clarify the expected behaviors.
>
> Suggested-by: Kevin Tian <kevin.tian@intel.com>
> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> Changelog
> v3:
> * Reworked the first line of the added commets.
> v2: https://lore.kernel.org/all/20231023035733.27378-1-nicolinc@nvidia.com/
> * Dropped "fixes" tags and merged two patches into one (Jason)
> * Added comments to the remaining enforce_cache_coherency call (Jason)
> [Please feel free to rephrase, or let me know what to change.]
> * Replace "num_devices++" with list_for_each_entry (Baolu)
> v1: https://lore.kernel.org/all/cover.1697848510.git.nicolinc@nvidia.com/
>
> drivers/iommu/iommufd/device.c | 20 ++------------------
> drivers/iommu/iommufd/hw_pagetable.c | 9 ++++++++-
> drivers/iommu/iommufd/iommufd_private.h | 1 -
> 3 files changed, 10 insertions(+), 20 deletions(-)
This looks OK to me, Kevin is it what you think it should be now?
Thanks,
Jason
On Mon, Oct 23, 2023 at 08:05:09PM -0300, Jason Gunthorpe wrote:
> On Mon, Oct 23, 2023 at 11:57:00AM -0700, Nicolin Chen wrote:
> > According to the conversion in the following link:
Ah, a typo here: conversation.
> > https://lore.kernel.org/linux-iommu/20231020135501.GG3952@nvidia.com/
> >
> > The enforce_cache_coherency should be set/enforced in the hwpt allocation
> > routine. The iommu driver in its attach_dev() op should decide whether to
> > reject or not a device that doesn't match with the configuration of cache
> > coherency. Drop the enforce_cache_coherency piece in the attach/replace()
> > and move the remaining "num_devices" piece closer to the refcount that is
> > using it.
> >
> > Accordingly drop its function prototype in the header and mark it static.
> > Also add some extra comments to clarify the expected behaviors.
> >
> > Suggested-by: Kevin Tian <kevin.tian@intel.com>
> > Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > ---
> > Changelog
> > v3:
> > * Reworked the first line of the added commets.
> > v2: https://lore.kernel.org/all/20231023035733.27378-1-nicolinc@nvidia.com/
> > * Dropped "fixes" tags and merged two patches into one (Jason)
> > * Added comments to the remaining enforce_cache_coherency call (Jason)
> > [Please feel free to rephrase, or let me know what to change.]
> > * Replace "num_devices++" with list_for_each_entry (Baolu)
> > v1: https://lore.kernel.org/all/cover.1697848510.git.nicolinc@nvidia.com/
> >
> > drivers/iommu/iommufd/device.c | 20 ++------------------
> > drivers/iommu/iommufd/hw_pagetable.c | 9 ++++++++-
> > drivers/iommu/iommufd/iommufd_private.h | 1 -
> > 3 files changed, 10 insertions(+), 20 deletions(-)
>
> This looks OK to me, Kevin is it what you think it should be now?
I will respin a v4 after Kevin's reply. And I can attach his
"Reviewed-by" if everything looks good to him too.
Thanks
Nic
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, October 24, 2023 7:05 AM
>
> On Mon, Oct 23, 2023 at 11:57:00AM -0700, Nicolin Chen wrote:
> > According to the conversion in the following link:
> > https://lore.kernel.org/linux-
> iommu/20231020135501.GG3952@nvidia.com/
> >
> > The enforce_cache_coherency should be set/enforced in the hwpt
> allocation
> > routine. The iommu driver in its attach_dev() op should decide whether to
> > reject or not a device that doesn't match with the configuration of cache
> > coherency. Drop the enforce_cache_coherency piece in the attach/replace()
> > and move the remaining "num_devices" piece closer to the refcount that is
> > using it.
> >
> > Accordingly drop its function prototype in the header and mark it static.
> > Also add some extra comments to clarify the expected behaviors.
> >
> > Suggested-by: Kevin Tian <kevin.tian@intel.com>
> > Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > ---
> > Changelog
> > v3:
> > * Reworked the first line of the added commets.
> > v2: https://lore.kernel.org/all/20231023035733.27378-1-
> nicolinc@nvidia.com/
> > * Dropped "fixes" tags and merged two patches into one (Jason)
> > * Added comments to the remaining enforce_cache_coherency call (Jason)
> > [Please feel free to rephrase, or let me know what to change.]
> > * Replace "num_devices++" with list_for_each_entry (Baolu)
> > v1: https://lore.kernel.org/all/cover.1697848510.git.nicolinc@nvidia.com/
> >
> > drivers/iommu/iommufd/device.c | 20 ++------------------
> > drivers/iommu/iommufd/hw_pagetable.c | 9 ++++++++-
> > drivers/iommu/iommufd/iommufd_private.h | 1 -
> > 3 files changed, 10 insertions(+), 20 deletions(-)
>
> This looks OK to me, Kevin is it what you think it should be now?
>
This looks good.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
@@ -337,13 +337,6 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
goto err_unlock;
}
- /* Try to upgrade the domain we have */
- if (idev->enforce_cache_coherency) {
- rc = iommufd_hw_pagetable_enforce_cc(hwpt);
- if (rc)
- goto err_unlock;
- }
-
rc = iopt_table_enforce_dev_resv_regions(&hwpt->ioas->iopt, idev->dev,
&idev->igroup->sw_msi_start);
if (rc)
@@ -413,8 +406,8 @@ iommufd_device_do_replace(struct iommufd_device *idev,
{
struct iommufd_group *igroup = idev->igroup;
struct iommufd_hw_pagetable *old_hwpt;
- unsigned int num_devices = 0;
struct iommufd_device *cur;
+ unsigned int num_devices;
int rc;
mutex_lock(&idev->igroup->lock);
@@ -429,16 +422,6 @@ iommufd_device_do_replace(struct iommufd_device *idev,
return NULL;
}
- /* Try to upgrade the domain we have */
- list_for_each_entry(cur, &igroup->device_list, group_item) {
- num_devices++;
- if (cur->enforce_cache_coherency) {
- rc = iommufd_hw_pagetable_enforce_cc(hwpt);
- if (rc)
- goto err_unlock;
- }
- }
-
old_hwpt = igroup->hwpt;
if (hwpt->ioas != old_hwpt->ioas) {
list_for_each_entry(cur, &igroup->device_list, group_item) {
@@ -465,6 +448,7 @@ iommufd_device_do_replace(struct iommufd_device *idev,
igroup->hwpt = hwpt;
+ num_devices = list_count_nodes(&igroup->device_list);
/*
* Move the refcounts held by the device_list to the new hwpt. Retain a
* refcount for this thread as the caller will free it.
@@ -42,7 +42,7 @@ void iommufd_hw_pagetable_abort(struct iommufd_object *obj)
iommufd_hw_pagetable_destroy(obj);
}
-int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt)
+static int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt)
{
if (hwpt->enforce_cache_coherency)
return 0;
@@ -116,6 +116,13 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
* doing any maps. It is an iommu driver bug to report
* IOMMU_CAP_ENFORCE_CACHE_COHERENCY but fail enforce_cache_coherency on
* a new domain.
+ *
+ * The cache coherency mode must be configured here and unchanged later.
+ * Note that a HWPT (non-CC) created for a device (non-CC) can be later
+ * reused by another device (either non-CC or CC). However, A HWPT (CC)
+ * created for a device (CC) cannot be reused by another device (non-CC)
+ * but only devices (CC). Instead user space in this case would need to
+ * allocate a separate HWPT (non-CC).
*/
if (idev->enforce_cache_coherency) {
rc = iommufd_hw_pagetable_enforce_cc(hwpt);
@@ -266,7 +266,6 @@ struct iommufd_hw_pagetable *
iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
struct iommufd_device *idev, u32 flags,
bool immediate_attach);
-int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt);
int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
struct iommufd_device *idev);
struct iommufd_hw_pagetable *