[vhost,v3,0/6] vdpa/mlx5: Add support for resumable vqs

Message ID 20231215140146.95816-1-dtatulea@nvidia.com

Message

Dragos Tatulea Dec. 15, 2023, 2:01 p.m. UTC
Add support for resumable vqs in the driver. This is a firmware feature
that provides the following benefits:
- Full device .suspend/.resume.
- .set_map doesn't need to destroy and create new vqs anymore just to
  update the map. When resumable vqs are supported it is enough to
  suspend the vqs, set the new maps, and then resume the vqs.
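
The suspend / set new maps / resume sequence can be illustrated with a
small user-space model. This is only a sketch: the types, fields and
function names below (struct vq, struct dev, recreate_vq, set_map, the
"generation" counter) are hypothetical and are not taken from the mlx5
driver or the vdpa core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names come from the driver. */
enum vq_state { VQ_READY, VQ_SUSPENDED };

struct vq {
	enum vq_state state;
	unsigned int map_id;     /* mapping the vq currently uses */
	unsigned int generation; /* bumped each time the vq is re-created */
};

struct dev {
	struct vq vqs[4];
	size_t num_vqs;
	bool resumable;          /* models the firmware capability */
};

/* Old flow: destroy the vq and create a new one with the new mapping. */
static void recreate_vq(struct vq *vq, unsigned int map_id)
{
	vq->generation++;
	vq->map_id = map_id;
	vq->state = VQ_READY;
}

static void set_map(struct dev *d, unsigned int new_map_id)
{
	size_t i;

	if (!d->resumable) {
		/* Without resumable vqs every vq has to be re-created. */
		for (i = 0; i < d->num_vqs; i++)
			recreate_vq(&d->vqs[i], new_map_id);
		return;
	}

	/* Step 1: suspend all vqs. */
	for (i = 0; i < d->num_vqs; i++)
		d->vqs[i].state = VQ_SUSPENDED;

	/* Step 2: set the new map; only legal while suspended in this model. */
	for (i = 0; i < d->num_vqs; i++) {
		assert(d->vqs[i].state == VQ_SUSPENDED);
		d->vqs[i].map_id = new_map_id;
	}

	/* Step 3: resume all vqs. The vq objects themselves are kept. */
	for (i = 0; i < d->num_vqs; i++)
		d->vqs[i].state = VQ_READY;
}
```

In this model the generation counter stays unchanged on the resumable
path, which is the point of the feature: the map is updated without
destroying and re-creating the vqs.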

The first patch exposes the relevant bits in mlx5_ifc.h. That means it
needs to be applied to the mlx5-vhost tree [0] first. Once applied
there, the change has to be pulled from mlx5-vhost into the vhost tree
and only then can the remaining patches be applied. This is the same flow
as the vq descriptor mappings patchset [1].

The second part adds support for resumable vqs, both as a device .resume
operation and in the .set_map call (the device is suspended and resumed
instead of re-creating the vqs with new mappings).

The last part of the series introduces reference counting for mrs, which
is necessary to avoid freeing mkeys too early or leaking them.
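
As a rough illustration of why reference counting addresses both
problems, here is a minimal user-space sketch. All names (struct mr,
mr_init, mr_get, mr_put, leaked_mkeys) are hypothetical and chosen for
this example only; the actual driver code is in the patches themselves.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a memory region (mr) that owns one mkey. */
struct mr {
	unsigned int refcount;
	bool mkey_valid;
};

/* Number of mkeys created but not yet destroyed. */
static int live_mkeys;

/* Create the mr; the creator holds the first reference. */
static void mr_init(struct mr *mr)
{
	mr->refcount = 1;
	mr->mkey_valid = true;
	live_mkeys++;
}

/* Take an additional reference, e.g. for a vq that uses this mr. */
static void mr_get(struct mr *mr)
{
	assert(mr->refcount > 0);
	mr->refcount++;
}

/* Drop a reference; the mkey is destroyed only with the last one. */
static void mr_put(struct mr *mr)
{
	assert(mr->refcount > 0);
	if (--mr->refcount == 0) {
		mr->mkey_valid = false;
		live_mkeys--;
	}
}

/* Leak check for teardown: anything non-zero is a leaked mkey. */
static int leaked_mkeys(void)
{
	return live_mkeys;
}
```

The mkey stays valid as long as any user still holds a reference, so it
cannot be freed too early, and a non-zero count at teardown points to a
leak.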

* Changes in v3:
- Dropped the patches that allowed modifying vq state and addresses
  when state is DRIVER_OK. This is not allowed by the standard.
  They should be re-added under a vdpa feature flag.

* Changes in v2:
- Added mr refcounting patches.
- Deleted unnecessary patch: "vdpa/mlx5: Split function into locked and
  unlocked variants"
- Small print improvement in "Introduce per vq and device resume"
  patch.
- Patch 1/7 has been applied to mlx5-vhost branch.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
[1] https://lore.kernel.org/virtualization/20231018171456.1624030-2-dtatulea@nvidia.com/


Dragos Tatulea (6):
  vdpa/mlx5: Expose resumable vq capability
  vdpa/mlx5: Allow modifying multiple vq fields in one modify command
  vdpa/mlx5: Introduce per vq and device resume
  vdpa/mlx5: Use vq suspend/resume during .set_map
  vdpa/mlx5: Introduce reference counting to mrs
  vdpa/mlx5: Add mkey leak detection

 drivers/vdpa/mlx5/core/mlx5_vdpa.h |  10 +-
 drivers/vdpa/mlx5/core/mr.c        |  69 +++++++---
 drivers/vdpa/mlx5/net/mlx5_vnet.c  | 194 +++++++++++++++++++++++++----
 include/linux/mlx5/mlx5_ifc.h      |   3 +-
 include/linux/mlx5/mlx5_ifc_vdpa.h |   1 +
 5 files changed, 239 insertions(+), 38 deletions(-)
  

Comments

Dragos Tatulea Dec. 15, 2023, 2:10 p.m. UTC | #1
Please disregard this version. I will send a v4. Sorry about the noise.