From patchwork Tue Oct 17 00:39:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Si-Wei Liu X-Patchwork-Id: 153837 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2908:b0:403:3b70:6f57 with SMTP id ib8csp3816235vqb; Mon, 16 Oct 2023 17:43:31 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHe5sLhrv864w+yW2yNDeaOhG9cl8f4j5K24SfBX6JiyVJQTQhR1f2Fj8O/4Q1Jw0XvmRZA X-Received: by 2002:a17:902:f30a:b0:1ca:2ec4:7f4d with SMTP id c10-20020a170902f30a00b001ca2ec47f4dmr875298ple.3.1697503410951; Mon, 16 Oct 2023 17:43:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697503410; cv=none; d=google.com; s=arc-20160816; b=XyFX/XTpVQvD164UIfIFEt/1lohZ2FIsVCPYz9px9FxW+1aJi1XbqdWYP9XrT7vWPc 0Oitz2MDWi3kZGCwlrTfkgzUAx1G+yjuQyYyHmwNhBUYXczPMPKlQGdRoJi55f5qy4pm 53JANS7C9OnAKxA5BgSejTo50edZWQngkM0zPQ6WLXVNsnvDtlXMv/HhDL5GJQoneEch HOBYxO5LSbkWgoUUrz9hojWI1U+cKghYEElIqD4WOV1AT4hMGE0+W09hAvLgmtRbf9Gr 2raQcvI4Vn/D0m3a2gWoxVECAGMUoJ5CXf0SA68PIFkH5c74y9d/kVK12jFS2ztd7t33 suHg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=KrpapIJ3BkNAXiuqRX7bRQLsn2/RQH2+BsvlPVXHXNA=; fh=oDOqP7sFVfy8hE2JW+RNYDdlQLyeIHOxyj9Z/wRrGAI=; b=hJ1vuEcKYBWjhdeHcOz/GFt3wu5prMOHUoABL5d4NX0pnrSMNr/KdS5g7MwJrCwWGp 8fEuGLNSk75Ksscc9+e548cXlo4Tg+oARn+Yl+/8mDNMsSQzPvPsHP87jVxiDQMOabth D6mgui1Q0urdCSWG0CZ0TVIvHR6dnVxiNakKPA1ZbgNuKpMbfMEuj7t7xv0pVOGDp6wV Z1xgmnWV20JD+3cMH8KU69gHnvFkk762PEthqeY+hkXwRYEkOmkvRvAKkpNWVC52F23i N0hQJOegnCwygMA1VrVeKZnclY6lK47joJeATu8shMVunP/+nnZdhL0rISBmNtIFuyJM aO5w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2023-03-30 header.b=glPMvXzH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: from pete.vger.email (pete.vger.email. [23.128.96.36]) by mx.google.com with ESMTPS id 21-20020a170902c11500b001bdca6456c3si503990pli.46.2023.10.16.17.43.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Oct 2023 17:43:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) client-ip=23.128.96.36; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2023-03-30 header.b=glPMvXzH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id A506180C5CA6; Mon, 16 Oct 2023 17:43:28 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234302AbjJQAmz (ORCPT + 18 others); Mon, 16 Oct 2023 20:42:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42460 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234245AbjJQAmu (ORCPT ); Mon, 16 Oct 2023 20:42:50 -0400 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 77FC09B for ; Mon, 16 Oct 2023 17:42:48 -0700 (PDT) Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39GKO8bX014418; Tue, 17 Oct 2023 00:42:41 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2023-03-30; bh=KrpapIJ3BkNAXiuqRX7bRQLsn2/RQH2+BsvlPVXHXNA=; b=glPMvXzHavU3wJJTbXGlGb60s2ADRLv3I12Yb/1hQXGoSjh88YfThCFrU0jPLObxTdTb ZR9KV2ISQvZkCxJXcIy5NxXGbT2WGQZnX3a4ZlG7/WbLd6CqVJAYE3YsuEmTUq5zpSpN jCdDZpiTkM3S4/ry17/uCTR6xBTjkaR/IId/4NYNalFV9W8qAgplwjS7+BehB55vHFM2 QeZ2l7rJdAcvLMjRUxb+ssuDQcH7lCm39mUYluBVbO6WjoX8MuJlx5qn3XDRGZEdEbxH LAP6H8E7v07CtmMNZ+Y/y22q0hWdckxF00RiTqEs9Ogft57DkDMX+3VV4EFIgvt3L8xq Zg== Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.appoci.oracle.com [138.1.37.129]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3tqk1bkxsn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 17 Oct 2023 00:42:40 +0000 Received: from pps.filterd (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 39GNB27K027141; Tue, 17 Oct 2023 00:42:39 GMT Received: from ban25x6uut24.us.oracle.com (ban25x6uut24.us.oracle.com [10.153.73.24]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 3trg535bja-4; Tue, 17 Oct 2023 00:42:39 +0000 From: Si-Wei Liu To: jasowang@redhat.com, mst@redhat.com, eperezma@redhat.com, xuanzhuo@linux.alibaba.com, dtatulea@nvidia.com, sgarzare@redhat.com Cc: virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 3/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit Date: Mon, 16 Oct 2023 17:39:56 -0700 Message-Id: <1697503197-15935-4-git-send-email-si-wei.liu@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1697503197-15935-1-git-send-email-si-wei.liu@oracle.com> References: <1697503197-15935-1-git-send-email-si-wei.liu@oracle.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-16_13,2023-10-12_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 mlxscore=0 phishscore=0 malwarescore=0 suspectscore=0 mlxlogscore=999 spamscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2309180000 definitions=main-2310170003 X-Proofpoint-GUID: Evj-k7e-lxAQDDNY1nc-XlkIMSEyxO8b X-Proofpoint-ORIG-GUID: Evj-k7e-lxAQDDNY1nc-XlkIMSEyxO8b X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Mon, 16 Oct 2023 17:43:28 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1779961336875578781 X-GMAIL-MSGID: 1779961336875578781 Userspace needs this feature flag to distinguish if vhost-vdpa iotlb in the kernel can be trusted to persist IOTLB mapping across vDPA reset. Without it, userspace has no way to tell apart if it's running on an older kernel, which could silently drop all iotlb mapping across vDPA reset, especially with broken parent driver implementation for the .reset driver op. The broken driver may incorrectly drop all mappings of its own as part of .reset, which inadvertently ends up with corrupted mapping state between vhost-vdpa userspace and the kernel. As a workaround, to make the mapping behaviour predictable across reset, userspace has to pro-actively remove all mappings before vDPA reset, and then restore all the mappings afterwards. This workaround is done unconditionally on top of all parent drivers today, due to the parent driver implementation issue and no means to differentiate. This workaround had been utilized in QEMU since day one when the corresponding vhost-vdpa userspace backend came to the world. There are 3 cases that backend may claim this feature bit on for: - parent device that has to work with platform IOMMU - parent device with on-chip IOMMU that has the expected .reset_map support in driver - parent device with vendor specific IOMMU implementation with persistent IOTLB mapping already that has to specifically declare this backend feature The reason why .reset_map is being one of the pre-condition for persistent iotlb is because without it, vhost-vdpa can't switch back iotlb to the initial state later on, especially for the on-chip IOMMU case which starts with identity mapping at device creation. virtio-vdpa requires on-chip IOMMU to perform 1:1 passthrough translation from PA to IOVA as-is to begin with, and .reset_map is the only means to turn back iotlb to the identity mapping mode after vhost-vdpa is gone. The difference in behavior did not matter as QEMU unmaps all the memory unregistering the memory listener at vhost_vdpa_dev_start( started = false), but the backend acknowledging this feature flag allows QEMU to make sure it is safe to skip this unmap & map in the case of vhost stop & start cycle. In that sense, this feature flag is actually a signal for userspace to know that the driver bug has been solved. Not offering it indicates that userspace cannot trust the kernel will retain the maps. Signed-off-by: Si-Wei Liu Acked-by: Eugenio PĂ©rez --- drivers/vhost/vdpa.c | 15 +++++++++++++++ include/uapi/linux/vhost_types.h | 2 ++ 2 files changed, 17 insertions(+) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index a3f8160c9807..9202986a7d81 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -438,6 +438,15 @@ static u64 vhost_vdpa_get_backend_features(const struct vhost_vdpa *v) return ops->get_backend_features(vdpa); } +static bool vhost_vdpa_has_persistent_map(const struct vhost_vdpa *v) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + + return (!ops->set_map && !ops->dma_map) || ops->reset_map || + vhost_vdpa_get_backend_features(v) & BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST); +} + static long vhost_vdpa_set_features(struct vhost_vdpa *v, u64 __user *featurep) { struct vdpa_device *vdpa = v->vdpa; @@ -725,6 +734,7 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, return -EFAULT; if (features & ~(VHOST_VDPA_BACKEND_FEATURES | BIT_ULL(VHOST_BACKEND_F_DESC_ASID) | + BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST) | BIT_ULL(VHOST_BACKEND_F_SUSPEND) | BIT_ULL(VHOST_BACKEND_F_RESUME) | BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK))) @@ -741,6 +751,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, if ((features & BIT_ULL(VHOST_BACKEND_F_DESC_ASID)) && !vhost_vdpa_has_desc_group(v)) return -EOPNOTSUPP; + if ((features & BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST)) && + !vhost_vdpa_has_persistent_map(v)) + return -EOPNOTSUPP; vhost_set_backend_features(&v->vdev, features); return 0; } @@ -796,6 +809,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, features |= BIT_ULL(VHOST_BACKEND_F_RESUME); if (vhost_vdpa_has_desc_group(v)) features |= BIT_ULL(VHOST_BACKEND_F_DESC_ASID); + if (vhost_vdpa_has_persistent_map(v)) + features |= BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST); features |= vhost_vdpa_get_backend_features(v); if (copy_to_user(featurep, &features, sizeof(features))) r = -EFAULT; diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h index 18ad6ae7ab5c..d7656908f730 100644 --- a/include/uapi/linux/vhost_types.h +++ b/include/uapi/linux/vhost_types.h @@ -190,5 +190,7 @@ struct vhost_vdpa_iova_range { * buffers may reside. Requires VHOST_BACKEND_F_IOTLB_ASID. */ #define VHOST_BACKEND_F_DESC_ASID 0x7 +/* IOTLB don't flush memory mapping across device reset */ +#define VHOST_BACKEND_F_IOTLB_PERSIST 0x8 #endif