Message ID | 1697880319-4937-1-git-send-email-si-wei.liu@oracle.com |
---|---|
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp199285vqx; Sat, 21 Oct 2023 02:29:30 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE8JsawLhcQDDc5YRFJ3ieHhKgce/9rJHpO+9vxvzmVtz61L3XwO3NdPLqhYss5pA/hgR0q X-Received: by 2002:a17:90b:1490:b0:27d:e1c:5345 with SMTP id js16-20020a17090b149000b0027d0e1c5345mr4282981pjb.15.1697880569916; Sat, 21 Oct 2023 02:29:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697880569; cv=none; d=google.com; s=arc-20160816; b=xD7xr9Xsa4w+GZeS7DAV+BBq3EMKKGPHtUr1RwmRdsVGgAmKjCFhDf9lgx/br8uzl1 dgZu0rPTy27CDBmE20Ier8HVgdA0Xd/Df2hJD2YqEmSZz7PgDI/SAgfPwQuWS6ky6HHG 155En/VnMRuJng3gS+vb+e8Yje2VNaNG5DcyXeDEHT48DPqiiwaY1O+6L3RmYbF6IfJ5 jNzDlcpMdisTQPIXyIlxt2FSl5Hi3oKMAt57NAh+668w5LlN+pho0JdSbqOEXaJ0ezc/ Ttvx6ljRCyfLSrdj+PsOJS0A1Bg/z6o4UpNzlfsmwjP2pR8WPif7oKplzo/VkR9eVfbI vo1g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=vXJHNbSFXSISHev3zlx79ZCGYnD++jfw2/gKRC3j7mM=; fh=ozy2CkLebIOzbpPF40kmeuF6jbAmH8vZa/HD/EKOE2c=; b=GKgBHBfCQybtg6rB19lOK0dhZ9J5xSsUq+foSY1HJex1udlC/8VaVi/U2i8JVwyGqS nwwoA/QxgPbev+yaKsjkB2a9JayhOKbaMZsojqvO1oNUYk0I4EOH1lbYKtnMHnvzFVby dQYHZvjXWTvQ42QFnBernrXkBr52tYi3bQq7GbkMm8UVu4eIHeT845Xi3+qDi8QAx+/Y fWR/9vz8iUNDqEYEVDSN9ZPqlAndzULSxNApNV74J6ZoicsjU8Q2bDwxB0oJ/3iKo5tW UsOrwBrlR4quX/xJoxw2SNt9mg3n9bVcZnwF2gDEZhv7Q/N6JRnqulYelLJEtEicsn9K PAAg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2023-03-30 header.b=UoCbAFz1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: from pete.vger.email (pete.vger.email. [2620:137:e000::3:6]) by mx.google.com with ESMTPS id bb20-20020a17090b009400b0027367e0c931si5754628pjb.129.2023.10.21.02.29.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 21 Oct 2023 02:29:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) client-ip=2620:137:e000::3:6; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2023-03-30 header.b=UoCbAFz1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id CA6E6805D2BF; Sat, 21 Oct 2023 02:29:15 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229765AbjJUJ2b (ORCPT <rfc822;a1648639935@gmail.com> + 26 others); Sat, 21 Oct 2023 05:28:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37940 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229472AbjJUJ22 (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Sat, 21 Oct 2023 05:28:28 -0400 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A0DD1B0 for <linux-kernel@vger.kernel.org>; Sat, 21 Oct 2023 02:28:23 -0700 (PDT) Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39L5eAj6008978; Sat, 21 Oct 2023 09:28:19 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : mime-version : content-type : content-transfer-encoding; s=corp-2023-03-30; bh=vXJHNbSFXSISHev3zlx79ZCGYnD++jfw2/gKRC3j7mM=; b=UoCbAFz1PpfdKCiAzA5FgvmBx74jQ8ve93i8OtBeiTJX6NeOT1L075fbUaZPSDr/pjih rbIEoG+PEw8t7a68TM8uLY9TqEj83aGybAtxTlBgUgt5Z7zKUFTG7rmU5K5PT3AjJIca rXjHxKqtshKtVaFHb7dNiw5LuMjNgAByjEEG1LRAK/sDCYl1csODGKk3iL0dz26meJvb Xehrn7Tx0cM7pFK23+f51PNzXxNkhj5WQK4N4xO9c3rRJaU3nGSYr8QPEqKNHBJKMdjL HX+CwmNBS0eHJ+xuYg9FMwxGaJLO7N05PAE8ahHAgxjUkcy8N5hDqoyNjnep4ACCvUcT Bg== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3tv68t8c2a-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sat, 21 Oct 2023 09:28:18 +0000 Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 39L6FU68019120; Sat, 21 Oct 2023 09:28:18 GMT Received: from ban25x6uut24.us.oracle.com (ban25x6uut24.us.oracle.com [10.153.73.24]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 3tv532gf44-1; Sat, 21 Oct 2023 09:28:18 +0000 From: Si-Wei Liu <si-wei.liu@oracle.com> To: jasowang@redhat.com, mst@redhat.com, eperezma@redhat.com, sgarzare@redhat.com, dtatulea@nvidia.com Cc: virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 0/7] vdpa: decouple reset of iotlb mapping from device reset Date: Sat, 21 Oct 2023 02:25:12 -0700 Message-Id: <1697880319-4937-1-git-send-email-si-wei.liu@oracle.com> X-Mailer: git-send-email 1.8.3.1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-20_10,2023-10-19_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 malwarescore=0 bulkscore=0 mlxscore=0 suspectscore=0 phishscore=0 adultscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2310170001 definitions=main-2310210086 X-Proofpoint-ORIG-GUID: OcUBGy1t-QNztFdDrpMctnzrON55byEh X-Proofpoint-GUID: OcUBGy1t-QNztFdDrpMctnzrON55byEh X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Sat, 21 Oct 2023 02:29:16 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780356816648969055 X-GMAIL-MSGID: 1780356816648969055 |
Series |
vdpa: decouple reset of iotlb mapping from device reset
|
|
Message
Si-Wei Liu
Oct. 21, 2023, 9:25 a.m. UTC
In order to reduce needlessly high setup and teardown cost of iotlb mapping during live migration, it's crucial to decouple the vhost-vdpa iotlb abstraction from the virtio device life cycle, i.e. iotlb mappings should be left intact across virtio device reset [1]. For it to work, the on-chip IOMMU parent device could implement a separate .reset_map() operation callback to restore 1:1 DMA mapping without having to resort to the .reset() callback, the latter of which is mainly used to reset virtio device state. This new .reset_map() callback will be invoked only before the vhost-vdpa driver is to be removed and detached from the vdpa bus, such that other vdpa bus drivers, e.g. virtio-vdpa, can start with 1:1 DMA mapping when they are attached. For the context, those on-chip IOMMU parent devices, create the 1:1 DMA mapping at vdpa device creation, and they would implicitly destroy the 1:1 mapping when the first .set_map or .dma_map callback is invoked. This patchset is rebased on top of the latest vhost tree. [1] Reducing vdpa migration downtime because of memory pin / maps https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html --- v4: - Rework compatibility using new .compat_reset driver op v3: - add .reset_map support to vdpa_sim - introduce module parameter to provide bug-for-bug compatibility with older userspace v2: - improved commit message to clarify the intended csope of .reset_map API - improved commit messages to clarify no breakage on older userspace v1: - rewrote commit messages to include more detailed description and background - reword to vendor specific IOMMU implementation from on-chip IOMMU - include parent device backend feautres to persistent iotlb precondition - reimplement mlx5_vdpa patch on top of descriptor group series RFC v3: - fix missing return due to merge error in patch #4 RFC v2: - rebased on top of the "[PATCH RFC v2 0/3] vdpa: dedicated descriptor table group" series: https://lore.kernel.org/virtualization/1694248959-13369-1-git-send-email-si-wei.liu@oracle.com/ --- Si-Wei Liu (7): vdpa: introduce .reset_map operation callback vhost-vdpa: reset vendor specific mapping to initial state in .release vhost-vdpa: introduce IOTLB_PERSIST backend feature bit vdpa: introduce .compat_reset operation callback vhost-vdpa: clean iotlb map during reset for older userspace vdpa/mlx5: implement .reset_map driver op vdpa_sim: implement .reset_map support drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + drivers/vdpa/mlx5/core/mr.c | 17 ++++++++++ drivers/vdpa/mlx5/net/mlx5_vnet.c | 27 ++++++++++++++-- drivers/vdpa/vdpa_sim/vdpa_sim.c | 52 ++++++++++++++++++++++++------ drivers/vhost/vdpa.c | 49 +++++++++++++++++++++++++--- drivers/virtio/virtio_vdpa.c | 2 +- include/linux/vdpa.h | 30 +++++++++++++++-- include/uapi/linux/vhost_types.h | 2 ++ 8 files changed, 161 insertions(+), 19 deletions(-)
Comments
Hi Si-Wei: On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > In order to reduce needlessly high setup and teardown cost > of iotlb mapping during live migration, it's crucial to > decouple the vhost-vdpa iotlb abstraction from the virtio > device life cycle, i.e. iotlb mappings should be left > intact across virtio device reset [1]. For it to work, the > on-chip IOMMU parent device could implement a separate > .reset_map() operation callback to restore 1:1 DMA mapping > without having to resort to the .reset() callback, the > latter of which is mainly used to reset virtio device state. > This new .reset_map() callback will be invoked only before > the vhost-vdpa driver is to be removed and detached from > the vdpa bus, such that other vdpa bus drivers, e.g. > virtio-vdpa, can start with 1:1 DMA mapping when they > are attached. For the context, those on-chip IOMMU parent > devices, create the 1:1 DMA mapping at vdpa device creation, > and they would implicitly destroy the 1:1 mapping when > the first .set_map or .dma_map callback is invoked. > > This patchset is rebased on top of the latest vhost tree. > > [1] Reducing vdpa migration downtime because of memory pin / maps > https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html > > --- > v4: > - Rework compatibility using new .compat_reset driver op I still think having a set_backend_feature() or reset_map(clean=true) might be better. As it tries hard to not introduce new stuff on the bus. But we can listen to others for sure. Thanks
On 10/22/2023 8:51 PM, Jason Wang wrote: > Hi Si-Wei: > > On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >> In order to reduce needlessly high setup and teardown cost >> of iotlb mapping during live migration, it's crucial to >> decouple the vhost-vdpa iotlb abstraction from the virtio >> device life cycle, i.e. iotlb mappings should be left >> intact across virtio device reset [1]. For it to work, the >> on-chip IOMMU parent device could implement a separate >> .reset_map() operation callback to restore 1:1 DMA mapping >> without having to resort to the .reset() callback, the >> latter of which is mainly used to reset virtio device state. >> This new .reset_map() callback will be invoked only before >> the vhost-vdpa driver is to be removed and detached from >> the vdpa bus, such that other vdpa bus drivers, e.g. >> virtio-vdpa, can start with 1:1 DMA mapping when they >> are attached. For the context, those on-chip IOMMU parent >> devices, create the 1:1 DMA mapping at vdpa device creation, >> and they would implicitly destroy the 1:1 mapping when >> the first .set_map or .dma_map callback is invoked. >> >> This patchset is rebased on top of the latest vhost tree. >> >> [1] Reducing vdpa migration downtime because of memory pin / maps >> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html >> >> --- >> v4: >> - Rework compatibility using new .compat_reset driver op > I still think having a set_backend_feature() This will overload backend features with the role of carrying over compatibility quirks, which I tried to avoid from. While I think the .compat_reset from the v4 code just works with the backend features acknowledgement (and maybe others as well) to determine, but not directly tie it to backend features itself. These two have different implications in terms of requirement, scope and maintaining/deprecation, better to cope with compat quirks in explicit and driver visible way. > or reset_map(clean=true) might be better. An explicit op might be marginally better in driver writer's point of view. Compliant driver doesn't have to bother asserting clean_map never be true so their code would never bother dealing with this case, as explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map during reset for older userspace": " The separation of .compat_reset from the regular .reset allows vhost-vdpa able to know which driver had broken behavior before, so it can apply the corresponding compatibility quirk to the individual driver whenever needed. Compared to overloading the existing .reset with flags, .compat_reset won't cause any extra burden to the implementation of every compliant driver. " > As it tries hard to not introduce new stuff on the bus. Honestly I don't see substantial difference between these other than the color. There's no single best solution that stands out among the 3. And I assume you already noticed it from all the above 3 approaches will have to go with backend features negotiation, that the 1st vdpa reset before backend feature negotiation will use the compliant version of .reset that doesn't clean up the map. While I don't think this nuance matters much to existing older userspace apps, as the maps should already get cleaned by previous process in vhost_vdpa_cleanup(), but if bug-for-bug behavioral compatibility is what you want, module parameter will be the single best answer. Regards, -Siwei > But we can listen to others for sure. > > Thanks >
QE tested this series v4 with regression testing on real nic, there is no new regression bug. Tested-by: Lei Yang <leiyang@redhat.com> On Tue, Oct 24, 2023 at 6:02 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > On 10/22/2023 8:51 PM, Jason Wang wrote: > > Hi Si-Wei: > > > > On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >> In order to reduce needlessly high setup and teardown cost > >> of iotlb mapping during live migration, it's crucial to > >> decouple the vhost-vdpa iotlb abstraction from the virtio > >> device life cycle, i.e. iotlb mappings should be left > >> intact across virtio device reset [1]. For it to work, the > >> on-chip IOMMU parent device could implement a separate > >> .reset_map() operation callback to restore 1:1 DMA mapping > >> without having to resort to the .reset() callback, the > >> latter of which is mainly used to reset virtio device state. > >> This new .reset_map() callback will be invoked only before > >> the vhost-vdpa driver is to be removed and detached from > >> the vdpa bus, such that other vdpa bus drivers, e.g. > >> virtio-vdpa, can start with 1:1 DMA mapping when they > >> are attached. For the context, those on-chip IOMMU parent > >> devices, create the 1:1 DMA mapping at vdpa device creation, > >> and they would implicitly destroy the 1:1 mapping when > >> the first .set_map or .dma_map callback is invoked. > >> > >> This patchset is rebased on top of the latest vhost tree. > >> > >> [1] Reducing vdpa migration downtime because of memory pin / maps > >> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html > >> > >> --- > >> v4: > >> - Rework compatibility using new .compat_reset driver op > > I still think having a set_backend_feature() > This will overload backend features with the role of carrying over > compatibility quirks, which I tried to avoid from. While I think the > .compat_reset from the v4 code just works with the backend features > acknowledgement (and maybe others as well) to determine, but not > directly tie it to backend features itself. These two have different > implications in terms of requirement, scope and maintaining/deprecation, > better to cope with compat quirks in explicit and driver visible way. > > > or reset_map(clean=true) might be better. > An explicit op might be marginally better in driver writer's point of > view. Compliant driver doesn't have to bother asserting clean_map never > be true so their code would never bother dealing with this case, as > explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map > during reset for older userspace": > > " > The separation of .compat_reset from the regular .reset allows > vhost-vdpa able to know which driver had broken behavior before, so it > can apply the corresponding compatibility quirk to the individual > driver > whenever needed. Compared to overloading the existing .reset with > flags, .compat_reset won't cause any extra burden to the implementation > of every compliant driver. > " > > > As it tries hard to not introduce new stuff on the bus. > Honestly I don't see substantial difference between these other than the > color. There's no single best solution that stands out among the 3. And > I assume you already noticed it from all the above 3 approaches will > have to go with backend features negotiation, that the 1st vdpa reset > before backend feature negotiation will use the compliant version of > .reset that doesn't clean up the map. While I don't think this nuance > matters much to existing older userspace apps, as the maps should > already get cleaned by previous process in vhost_vdpa_cleanup(), but if > bug-for-bug behavioral compatibility is what you want, module parameter > will be the single best answer. > > Regards, > -Siwei > > > But we can listen to others for sure. > > > > Thanks > > >
Thanks a lot for testing! Please be aware that there's a follow-up fix for a potential oops in this v4 series: https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@oracle.com/ Would be nice to have it applied for any tests. Thanks, -Siwei On 10/23/2023 11:51 PM, Lei Yang wrote: > QE tested this series v4 with regression testing on real nic, there is > no new regression bug. > > Tested-by: Lei Yang <leiyang@redhat.com> > > On Tue, Oct 24, 2023 at 6:02 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >> >> >> On 10/22/2023 8:51 PM, Jason Wang wrote: >>> Hi Si-Wei: >>> >>> On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >>>> In order to reduce needlessly high setup and teardown cost >>>> of iotlb mapping during live migration, it's crucial to >>>> decouple the vhost-vdpa iotlb abstraction from the virtio >>>> device life cycle, i.e. iotlb mappings should be left >>>> intact across virtio device reset [1]. For it to work, the >>>> on-chip IOMMU parent device could implement a separate >>>> .reset_map() operation callback to restore 1:1 DMA mapping >>>> without having to resort to the .reset() callback, the >>>> latter of which is mainly used to reset virtio device state. >>>> This new .reset_map() callback will be invoked only before >>>> the vhost-vdpa driver is to be removed and detached from >>>> the vdpa bus, such that other vdpa bus drivers, e.g. >>>> virtio-vdpa, can start with 1:1 DMA mapping when they >>>> are attached. For the context, those on-chip IOMMU parent >>>> devices, create the 1:1 DMA mapping at vdpa device creation, >>>> and they would implicitly destroy the 1:1 mapping when >>>> the first .set_map or .dma_map callback is invoked. >>>> >>>> This patchset is rebased on top of the latest vhost tree. >>>> >>>> [1] Reducing vdpa migration downtime because of memory pin / maps >>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html >>>> >>>> --- >>>> v4: >>>> - Rework compatibility using new .compat_reset driver op >>> I still think having a set_backend_feature() >> This will overload backend features with the role of carrying over >> compatibility quirks, which I tried to avoid from. While I think the >> .compat_reset from the v4 code just works with the backend features >> acknowledgement (and maybe others as well) to determine, but not >> directly tie it to backend features itself. These two have different >> implications in terms of requirement, scope and maintaining/deprecation, >> better to cope with compat quirks in explicit and driver visible way. >> >>> or reset_map(clean=true) might be better. >> An explicit op might be marginally better in driver writer's point of >> view. Compliant driver doesn't have to bother asserting clean_map never >> be true so their code would never bother dealing with this case, as >> explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map >> during reset for older userspace": >> >> " >> The separation of .compat_reset from the regular .reset allows >> vhost-vdpa able to know which driver had broken behavior before, so it >> can apply the corresponding compatibility quirk to the individual >> driver >> whenever needed. Compared to overloading the existing .reset with >> flags, .compat_reset won't cause any extra burden to the implementation >> of every compliant driver. >> " >> >>> As it tries hard to not introduce new stuff on the bus. >> Honestly I don't see substantial difference between these other than the >> color. There's no single best solution that stands out among the 3. And >> I assume you already noticed it from all the above 3 approaches will >> have to go with backend features negotiation, that the 1st vdpa reset >> before backend feature negotiation will use the compliant version of >> .reset that doesn't clean up the map. While I don't think this nuance >> matters much to existing older userspace apps, as the maps should >> already get cleaned by previous process in vhost_vdpa_cleanup(), but if >> bug-for-bug behavioral compatibility is what you want, module parameter >> will be the single best answer. >> >> Regards, >> -Siwei >> >>> But we can listen to others for sure. >>> >>> Thanks >>>
On Wed, Oct 25, 2023 at 1:27 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > Hello Si-Wei > Thanks a lot for testing! Please be aware that there's a follow-up fix > for a potential oops in this v4 series: > The first, when I did not apply this patch [1], I will also hit this patch mentioned problem. After I applied this patch, this problem will no longer to hit again. But I hit another issues, about the error messages please review the attached file. [1] https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@oracle.com/ My test steps: git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git cd linux/ b4 am 1697880319-4937-1-git-send-email-si-wei.liu@oracle.com b4 am 20231018171456.1624030-2-dtatulea@nvidia.com b4 am 1698102863-21122-1-git-send-email-si-wei.liu@oracle.com git am ./v4_20231018_dtatulea_vdpa_add_support_for_vq_descriptor_mappings.mbx git am ./v4_20231021_si_wei_liu_vdpa_decouple_reset_of_iotlb_mapping_from_device_reset.mbx git am ./20231023_si_wei_liu_vhost_vdpa_fix_null_pointer_deref_in__compat_vdpa_reset.mbx cp /boot/config-5.14.0-377.el9.x86_64 .config make -j 32 make modules_install make install Thanks Lei > https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@oracle.com/ > > Would be nice to have it applied for any tests. > > Thanks, > -Siwei > > On 10/23/2023 11:51 PM, Lei Yang wrote: > > QE tested this series v4 with regression testing on real nic, there is > > no new regression bug. > > > > Tested-by: Lei Yang <leiyang@redhat.com> > > > > On Tue, Oct 24, 2023 at 6:02 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >> > >> > >> On 10/22/2023 8:51 PM, Jason Wang wrote: > >>> Hi Si-Wei: > >>> > >>> On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >>>> In order to reduce needlessly high setup and teardown cost > >>>> of iotlb mapping during live migration, it's crucial to > >>>> decouple the vhost-vdpa iotlb abstraction from the virtio > >>>> device life cycle, i.e. iotlb mappings should be left > >>>> intact across virtio device reset [1]. For it to work, the > >>>> on-chip IOMMU parent device could implement a separate > >>>> .reset_map() operation callback to restore 1:1 DMA mapping > >>>> without having to resort to the .reset() callback, the > >>>> latter of which is mainly used to reset virtio device state. > >>>> This new .reset_map() callback will be invoked only before > >>>> the vhost-vdpa driver is to be removed and detached from > >>>> the vdpa bus, such that other vdpa bus drivers, e.g. > >>>> virtio-vdpa, can start with 1:1 DMA mapping when they > >>>> are attached. For the context, those on-chip IOMMU parent > >>>> devices, create the 1:1 DMA mapping at vdpa device creation, > >>>> and they would implicitly destroy the 1:1 mapping when > >>>> the first .set_map or .dma_map callback is invoked. > >>>> > >>>> This patchset is rebased on top of the latest vhost tree. > >>>> > >>>> [1] Reducing vdpa migration downtime because of memory pin / maps > >>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html > >>>> > >>>> --- > >>>> v4: > >>>> - Rework compatibility using new .compat_reset driver op > >>> I still think having a set_backend_feature() > >> This will overload backend features with the role of carrying over > >> compatibility quirks, which I tried to avoid from. While I think the > >> .compat_reset from the v4 code just works with the backend features > >> acknowledgement (and maybe others as well) to determine, but not > >> directly tie it to backend features itself. These two have different > >> implications in terms of requirement, scope and maintaining/deprecation, > >> better to cope with compat quirks in explicit and driver visible way. > >> > >>> or reset_map(clean=true) might be better. > >> An explicit op might be marginally better in driver writer's point of > >> view. Compliant driver doesn't have to bother asserting clean_map never > >> be true so their code would never bother dealing with this case, as > >> explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map > >> during reset for older userspace": > >> > >> " > >> The separation of .compat_reset from the regular .reset allows > >> vhost-vdpa able to know which driver had broken behavior before, so it > >> can apply the corresponding compatibility quirk to the individual > >> driver > >> whenever needed. Compared to overloading the existing .reset with > >> flags, .compat_reset won't cause any extra burden to the implementation > >> of every compliant driver. > >> " > >> > >>> As it tries hard to not introduce new stuff on the bus. > >> Honestly I don't see substantial difference between these other than the > >> color. There's no single best solution that stands out among the 3. And > >> I assume you already noticed it from all the above 3 approaches will > >> have to go with backend features negotiation, that the 1st vdpa reset > >> before backend feature negotiation will use the compliant version of > >> .reset that doesn't clean up the map. While I don't think this nuance > >> matters much to existing older userspace apps, as the maps should > >> already get cleaned by previous process in vhost_vdpa_cleanup(), but if > >> bug-for-bug behavioral compatibility is what you want, module parameter > >> will be the single best answer. > >> > >> Regards, > >> -Siwei > >> > >>> But we can listen to others for sure. > >>> > >>> Thanks > >>> >
Hi Yang Lei, Thanks for testing my patches and reporting! As for the issue, could you please try what I posted in: https://lore.kernel.org/virtualization/1698275594-19204-1-git-send-email-si-wei.liu@oracle.com/ and let me know how it goes? Thank you very much! Thanks, -Siwei On 10/25/2023 2:41 AM, Lei Yang wrote: > On Wed, Oct 25, 2023 at 1:27 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > Hello Si-Wei >> Thanks a lot for testing! Please be aware that there's a follow-up fix >> for a potential oops in this v4 series: >> > The first, when I did not apply this patch [1], I will also hit this > patch mentioned problem. After I applied this patch, this problem will > no longer to hit again. But I hit another issues, about the error > messages please review the attached file. > [1] https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@oracle.com/ > > My test steps: > git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > cd linux/ > b4 am 1697880319-4937-1-git-send-email-si-wei.liu@oracle.com > b4 am 20231018171456.1624030-2-dtatulea@nvidia.com > b4 am 1698102863-21122-1-git-send-email-si-wei.liu@oracle.com > git am ./v4_20231018_dtatulea_vdpa_add_support_for_vq_descriptor_mappings.mbx > git am ./v4_20231021_si_wei_liu_vdpa_decouple_reset_of_iotlb_mapping_from_device_reset.mbx > git am ./20231023_si_wei_liu_vhost_vdpa_fix_null_pointer_deref_in__compat_vdpa_reset.mbx > cp /boot/config-5.14.0-377.el9.x86_64 .config > make -j 32 > make modules_install > make install > > Thanks > > Lei >> https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@oracle.com/ >> >> Would be nice to have it applied for any tests. >> >> Thanks, >> -Siwei >> >> On 10/23/2023 11:51 PM, Lei Yang wrote: >>> QE tested this series v4 with regression testing on real nic, there is >>> no new regression bug. >>> >>> Tested-by: Lei Yang <leiyang@redhat.com> >>> >>> On Tue, Oct 24, 2023 at 6:02 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >>>> >>>> On 10/22/2023 8:51 PM, Jason Wang wrote: >>>>> Hi Si-Wei: >>>>> >>>>> On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >>>>>> In order to reduce needlessly high setup and teardown cost >>>>>> of iotlb mapping during live migration, it's crucial to >>>>>> decouple the vhost-vdpa iotlb abstraction from the virtio >>>>>> device life cycle, i.e. iotlb mappings should be left >>>>>> intact across virtio device reset [1]. For it to work, the >>>>>> on-chip IOMMU parent device could implement a separate >>>>>> .reset_map() operation callback to restore 1:1 DMA mapping >>>>>> without having to resort to the .reset() callback, the >>>>>> latter of which is mainly used to reset virtio device state. >>>>>> This new .reset_map() callback will be invoked only before >>>>>> the vhost-vdpa driver is to be removed and detached from >>>>>> the vdpa bus, such that other vdpa bus drivers, e.g. >>>>>> virtio-vdpa, can start with 1:1 DMA mapping when they >>>>>> are attached. For the context, those on-chip IOMMU parent >>>>>> devices, create the 1:1 DMA mapping at vdpa device creation, >>>>>> and they would implicitly destroy the 1:1 mapping when >>>>>> the first .set_map or .dma_map callback is invoked. >>>>>> >>>>>> This patchset is rebased on top of the latest vhost tree. >>>>>> >>>>>> [1] Reducing vdpa migration downtime because of memory pin / maps >>>>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html >>>>>> >>>>>> --- >>>>>> v4: >>>>>> - Rework compatibility using new .compat_reset driver op >>>>> I still think having a set_backend_feature() >>>> This will overload backend features with the role of carrying over >>>> compatibility quirks, which I tried to avoid from. While I think the >>>> .compat_reset from the v4 code just works with the backend features >>>> acknowledgement (and maybe others as well) to determine, but not >>>> directly tie it to backend features itself. These two have different >>>> implications in terms of requirement, scope and maintaining/deprecation, >>>> better to cope with compat quirks in explicit and driver visible way. >>>> >>>>> or reset_map(clean=true) might be better. >>>> An explicit op might be marginally better in driver writer's point of >>>> view. Compliant driver doesn't have to bother asserting clean_map never >>>> be true so their code would never bother dealing with this case, as >>>> explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map >>>> during reset for older userspace": >>>> >>>> " >>>> The separation of .compat_reset from the regular .reset allows >>>> vhost-vdpa able to know which driver had broken behavior before, so it >>>> can apply the corresponding compatibility quirk to the individual >>>> driver >>>> whenever needed. Compared to overloading the existing .reset with >>>> flags, .compat_reset won't cause any extra burden to the implementation >>>> of every compliant driver. >>>> " >>>> >>>>> As it tries hard to not introduce new stuff on the bus. >>>> Honestly I don't see substantial difference between these other than the >>>> color. There's no single best solution that stands out among the 3. And >>>> I assume you already noticed it from all the above 3 approaches will >>>> have to go with backend features negotiation, that the 1st vdpa reset >>>> before backend feature negotiation will use the compliant version of >>>> .reset that doesn't clean up the map. While I don't think this nuance >>>> matters much to existing older userspace apps, as the maps should >>>> already get cleaned by previous process in vhost_vdpa_cleanup(), but if >>>> bug-for-bug behavioral compatibility is what you want, module parameter >>>> will be the single best answer. >>>> >>>> Regards, >>>> -Siwei >>>> >>>>> But we can listen to others for sure. >>>>> >>>>> Thanks >>>>>
On Thu, Oct 26, 2023 at 7:32 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > Hi Yang Lei, > > Thanks for testing my patches and reporting! As for the issue, could you > please try what I posted in: > > https://lore.kernel.org/virtualization/1698275594-19204-1-git-send-email-si-wei.liu@oracle.com/ > HI Si-Wei > and let me know how it goes? Thank you very much! This problem has gone after applying this patch [1]. [1] https://lore.kernel.org/virtualization/1698275594-19204-1-git-send-email-si-wei.liu@oracle.com/ Thanks Lei > > Thanks, > -Siwei > > On 10/25/2023 2:41 AM, Lei Yang wrote: > > On Wed, Oct 25, 2023 at 1:27 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > Hello Si-Wei > >> Thanks a lot for testing! Please be aware that there's a follow-up fix > >> for a potential oops in this v4 series: > >> > > The first, when I did not apply this patch [1], I will also hit this > > patch mentioned problem. After I applied this patch, this problem will > > no longer to hit again. But I hit another issues, about the error > > messages please review the attached file. > > [1] https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@oracle.com/ > > > > My test steps: > > git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > > cd linux/ > > b4 am 1697880319-4937-1-git-send-email-si-wei.liu@oracle.com > > b4 am 20231018171456.1624030-2-dtatulea@nvidia.com > > b4 am 1698102863-21122-1-git-send-email-si-wei.liu@oracle.com > > git am ./v4_20231018_dtatulea_vdpa_add_support_for_vq_descriptor_mappings.mbx > > git am ./v4_20231021_si_wei_liu_vdpa_decouple_reset_of_iotlb_mapping_from_device_reset.mbx > > git am ./20231023_si_wei_liu_vhost_vdpa_fix_null_pointer_deref_in__compat_vdpa_reset.mbx > > cp /boot/config-5.14.0-377.el9.x86_64 .config > > make -j 32 > > make modules_install > > make install > > > > Thanks > > > > Lei > >> https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@oracle.com/ > >> > >> Would be nice to have it applied for any tests. > >> > >> Thanks, > >> -Siwei > >> > >> On 10/23/2023 11:51 PM, Lei Yang wrote: > >>> QE tested this series v4 with regression testing on real nic, there is > >>> no new regression bug. > >>> > >>> Tested-by: Lei Yang <leiyang@redhat.com> > >>> > >>> On Tue, Oct 24, 2023 at 6:02 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >>>> > >>>> On 10/22/2023 8:51 PM, Jason Wang wrote: > >>>>> Hi Si-Wei: > >>>>> > >>>>> On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >>>>>> In order to reduce needlessly high setup and teardown cost > >>>>>> of iotlb mapping during live migration, it's crucial to > >>>>>> decouple the vhost-vdpa iotlb abstraction from the virtio > >>>>>> device life cycle, i.e. iotlb mappings should be left > >>>>>> intact across virtio device reset [1]. For it to work, the > >>>>>> on-chip IOMMU parent device could implement a separate > >>>>>> .reset_map() operation callback to restore 1:1 DMA mapping > >>>>>> without having to resort to the .reset() callback, the > >>>>>> latter of which is mainly used to reset virtio device state. > >>>>>> This new .reset_map() callback will be invoked only before > >>>>>> the vhost-vdpa driver is to be removed and detached from > >>>>>> the vdpa bus, such that other vdpa bus drivers, e.g. > >>>>>> virtio-vdpa, can start with 1:1 DMA mapping when they > >>>>>> are attached. For the context, those on-chip IOMMU parent > >>>>>> devices, create the 1:1 DMA mapping at vdpa device creation, > >>>>>> and they would implicitly destroy the 1:1 mapping when > >>>>>> the first .set_map or .dma_map callback is invoked. > >>>>>> > >>>>>> This patchset is rebased on top of the latest vhost tree. > >>>>>> > >>>>>> [1] Reducing vdpa migration downtime because of memory pin / maps > >>>>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html > >>>>>> > >>>>>> --- > >>>>>> v4: > >>>>>> - Rework compatibility using new .compat_reset driver op > >>>>> I still think having a set_backend_feature() > >>>> This will overload backend features with the role of carrying over > >>>> compatibility quirks, which I tried to avoid from. While I think the > >>>> .compat_reset from the v4 code just works with the backend features > >>>> acknowledgement (and maybe others as well) to determine, but not > >>>> directly tie it to backend features itself. These two have different > >>>> implications in terms of requirement, scope and maintaining/deprecation, > >>>> better to cope with compat quirks in explicit and driver visible way. > >>>> > >>>>> or reset_map(clean=true) might be better. > >>>> An explicit op might be marginally better in driver writer's point of > >>>> view. Compliant driver doesn't have to bother asserting clean_map never > >>>> be true so their code would never bother dealing with this case, as > >>>> explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map > >>>> during reset for older userspace": > >>>> > >>>> " > >>>> The separation of .compat_reset from the regular .reset allows > >>>> vhost-vdpa able to know which driver had broken behavior before, so it > >>>> can apply the corresponding compatibility quirk to the individual > >>>> driver > >>>> whenever needed. Compared to overloading the existing .reset with > >>>> flags, .compat_reset won't cause any extra burden to the implementation > >>>> of every compliant driver. > >>>> " > >>>> > >>>>> As it tries hard to not introduce new stuff on the bus. > >>>> Honestly I don't see substantial difference between these other than the > >>>> color. There's no single best solution that stands out among the 3. And > >>>> I assume you already noticed it from all the above 3 approaches will > >>>> have to go with backend features negotiation, that the 1st vdpa reset > >>>> before backend feature negotiation will use the compliant version of > >>>> .reset that doesn't clean up the map. While I don't think this nuance > >>>> matters much to existing older userspace apps, as the maps should > >>>> already get cleaned by previous process in vhost_vdpa_cleanup(), but if > >>>> bug-for-bug behavioral compatibility is what you want, module parameter > >>>> will be the single best answer. > >>>> > >>>> Regards, > >>>> -Siwei > >>>> > >>>>> But we can listen to others for sure. > >>>>> > >>>>> Thanks > >>>>> >