[RFC] vdpa/mlx5: preserve CVQ vringh index
Commit Message
mlx5_vdpa does not preserve userland's view of vring base for the control
queue in the following sequence:
  ioctl VHOST_SET_VRING_BASE
  ioctl VHOST_VDPA_SET_STATUS VIRTIO_CONFIG_S_DRIVER_OK
    mlx5_vdpa_set_status()
      setup_cvq_vring()
        vringh_init_iotlb()
          vringh_init_kern()
            vrh->last_avail_idx = 0;
  ioctl VHOST_GET_VRING_BASE
To fix, restore the value of cvq->vring.last_avail_idx after calling
vringh_init_iotlb.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
drivers/vdpa/mlx5/net/mlx5_vnet.c | 7 ++++++-
drivers/vhost/vringh.c | 30 ++++++++++++++++++++++++++++++
include/linux/vringh.h | 2 ++
3 files changed, 38 insertions(+), 1 deletion(-)
Comments
On 10/26/2023 4:11 PM, Steve Sistare wrote:
> mlx5_vdpa does not preserve userland's view of vring base for the control
> queue in the following sequence:
>
>   ioctl VHOST_SET_VRING_BASE
>   ioctl VHOST_VDPA_SET_STATUS VIRTIO_CONFIG_S_DRIVER_OK
>     mlx5_vdpa_set_status()
>       setup_cvq_vring()
>         vringh_init_iotlb()
>           vringh_init_kern()
>             vrh->last_avail_idx = 0;
>   ioctl VHOST_GET_VRING_BASE
>
> To fix, restore the value of cvq->vring.last_avail_idx after calling
> vringh_init_iotlb.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
This is a resend, I forgot to cc myself the first time.
I don't know if we expect vring_base to be preserved after reset, because the
uapi comments say nothing about it. mlx5 *does* preserve base across reset
for the data vq's, but perhaps that is an accident of the implementation.
I posted this patch to perhaps avoid future problems. The bug(?) bit me while
developing with an older version of qemu, and I can work around it in qemu
code. Further, the latest version of qemu always enables svq for the cvq
and is not affected by this behavior AFAICT.
- Steve
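
The behavior described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct mock_vringh` and every `mock_*` function are hypothetical stand-ins invented for illustration, not kernel or vhost APIs. The model only shows why re-initializing the ring state loses a base that was set earlier, and how saving and restoring the index around the init call avoids that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the cached indices kept in struct vringh. */
struct mock_vringh {
	uint16_t last_avail_idx;
	uint16_t last_used_idx;
};

/* Models the init path: every cached index is reset to zero. */
void mock_vringh_init(struct mock_vringh *vrh)
{
	assert(vrh != NULL);
	memset(vrh, 0, sizeof(*vrh));
}

/* Setup without the fix: the base set earlier by userspace is lost. */
void mock_setup_cvq_plain(struct mock_vringh *vrh)
{
	mock_vringh_init(vrh);
}

/* Setup with the fix: save the index, re-init, then restore it. */
void mock_setup_cvq_preserving(struct mock_vringh *vrh)
{
	uint16_t saved = vrh->last_avail_idx;

	mock_vringh_init(vrh);
	/* Like the helper in the patch, restore both cached indices. */
	vrh->last_avail_idx = saved;
	vrh->last_used_idx = saved;
}

/* Returns the base that userspace would read back after setup. */
uint16_t mock_base_after_setup(uint16_t base, int preserve)
{
	struct mock_vringh vrh;

	mock_vringh_init(&vrh);
	vrh.last_avail_idx = base;	/* models VHOST_SET_VRING_BASE */
	if (preserve)
		mock_setup_cvq_preserving(&vrh);
	else
		mock_setup_cvq_plain(&vrh);
	return vrh.last_avail_idx;	/* models VHOST_GET_VRING_BASE */
}
```

In this model, `mock_base_after_setup(42, 0)` reads back 0, while `mock_base_after_setup(42, 1)` reads back 42.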
Steve, I think this is a loose end that I am not sure is worth fixing;
copying Eugenio for his awareness. The reason is that when CVQ is in
place, it always has to cope with device state saving and restoration
using a shadow virtqueue in many cases, not just migration, and that is
why SVQ is always enabled for CVQ in the latest QEMU. But I agree this
is nice to have; there could be value in supporting vDPA VM instances
without depending solely on SVQ, e.g. for use cases like
memory-encrypted VMs. Thanks for posting the fix, and let's see what
other people think about it.
-Siwei
On 10/26/2023 1:13 PM, Steven Sistare wrote:
> On 10/26/2023 4:11 PM, Steve Sistare wrote:
>> mlx5_vdpa does not preserve userland's view of vring base for the control
>> queue in the following sequence:
>>
>>   ioctl VHOST_SET_VRING_BASE
>>   ioctl VHOST_VDPA_SET_STATUS VIRTIO_CONFIG_S_DRIVER_OK
>>     mlx5_vdpa_set_status()
>>       setup_cvq_vring()
>>         vringh_init_iotlb()
>>           vringh_init_kern()
>>             vrh->last_avail_idx = 0;
>>   ioctl VHOST_GET_VRING_BASE
>>
>> To fix, restore the value of cvq->vring.last_avail_idx after calling
>> vringh_init_iotlb.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> This is a resend, I forgot to cc myself the first time.
>
> I don't know if we expect vring_base to be preserved after reset, because the
> uapi comments say nothing about it. mlx5 *does* preserve base across reset
> for the data vq's, but perhaps that is an accident of the implementation.
>
> I posted this patch to perhaps avoid future problems. The bug(?) bit me while
> developing with an older version of qemu, and I can work around it in qemu
> code. Further, the latest version of qemu always enables svq for the cvq
> and is not affected by this behavior AFAICT.
>
> - Steve
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -2795,13 +2795,18 @@ static int setup_cvq_vring(struct mlx5_vdpa_dev *mvdev)
 	struct mlx5_control_vq *cvq = &mvdev->cvq;
 	int err = 0;
 
-	if (mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
+	if (mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ)) {
+		u16 last_avail_idx = cvq->vring.last_avail_idx;
+
 		err = vringh_init_iotlb(&cvq->vring, mvdev->actual_features,
 					MLX5_CVQ_MAX_ENT, false,
 					(struct vring_desc *)(uintptr_t)cvq->desc_addr,
 					(struct vring_avail *)(uintptr_t)cvq->driver_addr,
 					(struct vring_used *)(uintptr_t)cvq->device_addr);
+		if (!err)
+			vringh_set_base_iotlb(&cvq->vring, last_avail_idx);
+	}
 
 	return err;
 }
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -595,6 +595,24 @@ static inline void __vringh_notify_disable(struct vringh *vrh,
 	}
 }
 
+static inline int __vringh_set_base(struct vringh *vrh, u16 idx,
+				    int (*putu16)(const struct vringh *vrh,
+						  __virtio16 *p, u16 val))
+{
+	int ret;
+
+	ret = putu16(vrh, &vrh->vring.avail->idx, idx);
+	if (ret)
+		return ret;
+
+	ret = putu16(vrh, &vrh->vring.used->idx, idx);
+	if (ret)
+		return ret;
+
+	vrh->last_avail_idx = vrh->last_used_idx = idx;
+	return 0;
+}
+
 /* Userspace access helpers: in this case, addresses are really userspace. */
 static inline int getu16_user(const struct vringh *vrh, u16 *val, const __virtio16 *p)
 {
@@ -1456,6 +1474,18 @@ void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
 }
 EXPORT_SYMBOL(vringh_set_iotlb);
 
+/**
+ * vringh_set_base_iotlb - set avail_idx and used_idx
+ * @vrh: the vring
+ * @idx: the value to set
+ */
+int vringh_set_base_iotlb(struct vringh *vrh, u16 idx)
+{
+	return __vringh_set_base(vrh, idx, putu16_iotlb);
+}
+EXPORT_SYMBOL(vringh_set_base_iotlb);
+
+
 /**
  * vringh_getdesc_iotlb - get next available descriptor from ring with
  * IOTLB.
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -306,6 +306,8 @@ int vringh_init_iotlb_va(struct vringh *vrh, u64 features,
 			 struct vring_avail *avail,
 			 struct vring_used *used);
 
+int vringh_set_base_iotlb(struct vringh *vrh, u16 idx);
+
 int vringh_getdesc_iotlb(struct vringh *vrh,
 			 struct vringh_kiov *riov,
 			 struct vringh_kiov *wiov,
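
For readers who want to experiment with the control flow of the `__vringh_set_base()` helper outside the kernel, the following is a minimal user-space sketch. It assumes a simplified ring: `struct mock_ring`, `mock_putu16()` and `mock_set_base()` are hypothetical names made up for this illustration, and the accessor merely simulates a write that can fail. It mirrors the order of operations in the patch: write the avail index, write the used index, and update the cached indices only if both writes succeed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; the real struct vringh has more fields. */
struct mock_ring {
	uint16_t avail_idx;	/* models vring.avail->idx in ring memory */
	uint16_t used_idx;	/* models vring.used->idx in ring memory */
	uint16_t last_avail_idx;	/* cached copy */
	uint16_t last_used_idx;		/* cached copy */
	int fail_writes;	/* when nonzero, the accessor returns -EFAULT */
};

typedef int (*mock_putu16_fn)(const struct mock_ring *r,
			      uint16_t *p, uint16_t val);

/* Simulated accessor: stores val at p unless writes are set to fail. */
int mock_putu16(const struct mock_ring *r, uint16_t *p, uint16_t val)
{
	if (r->fail_writes)
		return -EFAULT;
	*p = val;
	return 0;
}

/* Same order of operations as the helper in the patch. */
int mock_set_base(struct mock_ring *r, uint16_t idx, mock_putu16_fn putu16)
{
	int ret;

	assert(r != NULL && putu16 != NULL);

	ret = putu16(r, &r->avail_idx, idx);
	if (ret)
		return ret;

	ret = putu16(r, &r->used_idx, idx);
	if (ret)
		return ret;

	/* Cached indices change only after both ring writes succeeded. */
	r->last_avail_idx = r->last_used_idx = idx;
	return 0;
}
```

In this sketch, a failing accessor leaves the cached indices untouched, so the caller can treat a nonzero return as "base not applied".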