Message ID: 20230802171231.11001-3-dtatulea@nvidia.com
State: New
Headers:
From: Dragos Tatulea <dtatulea@nvidia.com>
To: "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: Dragos Tatulea <dtatulea@nvidia.com>, Eugenio Pérez <eperezma@redhat.com>,
	Gal Pressman <gal@nvidia.com>, virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] vdpa/mlx5: Fix mr->initialized semantics
Date: Wed, 2 Aug 2023 20:12:18 +0300
Message-ID: <20230802171231.11001-3-dtatulea@nvidia.com>
In-Reply-To: <20230802171231.11001-1-dtatulea@nvidia.com>
References: <20230802171231.11001-1-dtatulea@nvidia.com>
Series: vdpa/mlx5: Fixes for ASID handling
Commit Message
Dragos Tatulea
Aug. 2, 2023, 5:12 p.m. UTC
The mr->initialized flag is shared between the control vq and data vq
part of the mr init/uninit. But if the control vq and data vq get placed
in different ASIDs, it can happen that initializing the control vq will
prevent the data vq mr from being initialized.

This patch consolidates the control and data vq init parts into their
own init functions. The mr->initialized will now be used for the data vq
only. The control vq currently doesn't need a flag.

The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got
split into data and control vq functions which are now also ASID aware.

Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
---
 drivers/vdpa/mlx5/core/mlx5_vdpa.h |  1 +
 drivers/vdpa/mlx5/core/mr.c        | 97 +++++++++++++++++++++---------
 2 files changed, 71 insertions(+), 27 deletions(-)
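The race the commit message describes can be sketched in a few lines of C. This is a simplified model, not driver code: `mr_state`, `create_mr_old`, `create_mr_new` and the group constants are stand-ins for the mlx5_vdpa structures. With one shared `initialized` flag, whichever ASID is processed first blocks the other group's setup; checking each group's ASID separately fixes it.

```c
#include <assert.h>
#include <stdbool.h>

enum { CVQ_GROUP, DATAVQ_GROUP, NGROUPS };

struct mr_state {
	unsigned int group2asid[NGROUPS];
	bool initialized;	/* after the fix: tracks the data vq mr only */
	bool dvq_mr_created;
	bool cvq_iotlb_duped;
};

/* Old behavior (simplified): one flag guards both cvq and dvq parts. */
static void create_mr_old(struct mr_state *s, unsigned int asid)
{
	if (s->initialized)
		return;		/* the second ASID is wrongly skipped here */
	if (s->group2asid[DATAVQ_GROUP] == asid)
		s->dvq_mr_created = true;
	if (s->group2asid[CVQ_GROUP] == asid)
		s->cvq_iotlb_duped = true;
	s->initialized = true;
}

/* Fixed behavior: each part checks its own group's ASID, and the flag
 * only guards the dvq mr. */
static void create_mr_new(struct mr_state *s, unsigned int asid)
{
	if (s->group2asid[DATAVQ_GROUP] == asid && !s->initialized) {
		s->dvq_mr_created = true;
		s->initialized = true;
	}
	if (s->group2asid[CVQ_GROUP] == asid)
		s->cvq_iotlb_duped = true;
}
```

With cvq in ASID 0 and dvq in ASID 1, calling the old path for ASID 0 first sets `initialized` and the dvq mr is never created; the new path initializes both.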
Comments
On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> The mr->initialized flag is shared between the control vq and data vq
> part of the mr init/uninit. But if the control vq and data vq get placed
> in different ASIDs, it can happen that initializing the control vq will
> prevent the data vq mr from being initialized.
>
> This patch consolidates the control and data vq init parts into their
> own init functions. The mr->initialized will now be used for the data vq
> only. The control vq currently doesn't need a flag.
>
> The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got
> split into data and control vq functions which are now also ASID aware.
>
> Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data")
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
> Reviewed-by: Gal Pressman <gal@nvidia.com>
> ---
>  drivers/vdpa/mlx5/core/mlx5_vdpa.h |  1 +
>  drivers/vdpa/mlx5/core/mr.c        | 97 +++++++++++++++++++++---------
>  2 files changed, 71 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> index 25fc4120b618..a0420be5059f 100644
> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr {
>         struct list_head head;
>         unsigned long num_directs;
>         unsigned long num_klms;
> +       /* state of dvq mr */
>         bool initialized;
>
>         /* serialize mkey creation and destruction */
> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
> index 03e543229791..4ae14a248a4b 100644
> --- a/drivers/vdpa/mlx5/core/mr.c
> +++ b/drivers/vdpa/mlx5/core/mr.c
> @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr
>         }
>  }
>
> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev)
> +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
> +{
> +       if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
> +               return;
> +
> +       prune_iotlb(mvdev);
> +}
> +
> +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
>  {
>         struct mlx5_vdpa_mr *mr = &mvdev->mr;
>
> -       mutex_lock(&mr->mkey_mtx);
> +       if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid)
> +               return;
> +
>         if (!mr->initialized)
> -               goto out;
> +               return;
>
> -       prune_iotlb(mvdev);
>         if (mr->user_mr)
>                 destroy_user_mr(mvdev, mr);
>         else
>                 destroy_dma_mr(mvdev, mr);
>
>         mr->initialized = false;
> -out:
> +}
> +
> +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
> +{
> +       struct mlx5_vdpa_mr *mr = &mvdev->mr;
> +
> +       mutex_lock(&mr->mkey_mtx);
> +
> +       _mlx5_vdpa_destroy_dvq_mr(mvdev, asid);
> +       _mlx5_vdpa_destroy_cvq_mr(mvdev, asid);
> +
>         mutex_unlock(&mr->mkey_mtx);
>  }
>
> -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
> -                               struct vhost_iotlb *iotlb, unsigned int asid)
> +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev)
> +{
> +       mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]);
> +       mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]);
> +}
> +
> +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev,
> +                                   struct vhost_iotlb *iotlb,
> +                                   unsigned int asid)
> +{
> +       if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
> +               return 0;
> +
> +       return dup_iotlb(mvdev, iotlb);

This worries me as conceptually, there should be no difference between
dvq mr and cvq mr. The virtqueue should be loosely coupled with mr.

One example is that, if we only do dup_iotlb() but not try to create
dma mr here, we will break virtio-vdpa:

commit 6f5312f801836e6af9bcbb0bdb44dc423e129206
Author: Eli Cohen <elic@nvidia.com>
Date:   Wed Jun 2 11:58:54 2021 +0300

    vdpa/mlx5: Add support for running with virtio_vdpa

    In order to support running vdpa using vritio_vdpa driver, we need to
    create a different kind of MR, one that has 1:1 mapping, since the
    addresses referring to virtqueues are dma addresses.

    We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware
    supports the general capability umem_uid_0. The reason for that is that
    1:1 MRs must be created with uid == 0 while virtqueue objects can be
    created with uid == 0 only when the firmware capability is on.

    If the set_map() callback is called with new translations provided
    through iotlb, the driver will destroy the 1:1 MR and create a regular
    one.

    Signed-off-by: Eli Cohen <elic@nvidia.com>
    Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Acked-by: Jason Wang <jasowang@redhat.com>

Thanks

> +}
> +
> +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev,
> +                                   struct vhost_iotlb *iotlb,
> +                                   unsigned int asid)
>  {
>         struct mlx5_vdpa_mr *mr = &mvdev->mr;
>         int err;
>
> -       if (mr->initialized)
> +       if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid)
>                 return 0;
>
> -       if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) {
> -               if (iotlb)
> -                       err = create_user_mr(mvdev, iotlb);
> -               else
> -                       err = create_dma_mr(mvdev, mr);
> +       if (mr->initialized)
> +               return 0;
>
> -               if (err)
> -                       return err;
> -       }
> +       if (iotlb)
> +               err = create_user_mr(mvdev, iotlb);
> +       else
> +               err = create_dma_mr(mvdev, mr);
>
> -       if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) {
> -               err = dup_iotlb(mvdev, iotlb);
> -               if (err)
> -                       goto out_err;
> -       }
> +       if (err)
> +               return err;
>
>         mr->initialized = true;
> +
> +       return 0;
> +}
> +
> +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
> +                               struct vhost_iotlb *iotlb, unsigned int asid)
> +{
> +       int err;
> +
> +       err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid);
> +       if (err)
> +               return err;
> +
> +       err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid);
> +       if (err)
> +               goto out_err;
> +
>         return 0;
>
> out_err:
> -       if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) {
> -               if (iotlb)
> -                       destroy_user_mr(mvdev, mr);
> -               else
> -                       destroy_dma_mr(mvdev, mr);
> -       }
> +       _mlx5_vdpa_destroy_dvq_mr(mvdev, asid);
>
>         return err;
>  }
> --
> 2.41.0
>
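The split create path in the patch unwinds only the dvq part when the cvq part fails. That control flow can be modeled compactly; this is a hypothetical stand-alone sketch, with stubbed helpers and an injected failure flag in place of a real dup_iotlb() error, not the driver code itself.

```c
#include <assert.h>
#include <stdbool.h>

struct dev {
	bool dvq_mr;	/* data vq mr exists */
	bool cvq_dup;	/* cvq iotlb duplicated */
	bool cvq_fail;	/* inject a cvq setup failure */
};

static int create_dvq_mr(struct dev *d)   { d->dvq_mr = true; return 0; }
static void destroy_dvq_mr(struct dev *d) { d->dvq_mr = false; }

static int create_cvq_mr(struct dev *d)
{
	if (d->cvq_fail)
		return -1;	/* stands in for a dup_iotlb() error */
	d->cvq_dup = true;
	return 0;
}

/* Mirrors the shape of _mlx5_vdpa_create_mr(): dvq first, then cvq;
 * on cvq failure, tear down only what this call created. */
static int create_mr(struct dev *d)
{
	int err;

	err = create_dvq_mr(d);
	if (err)
		return err;

	err = create_cvq_mr(d);
	if (err)
		destroy_dvq_mr(d);

	return err;
}
```

The point of the unwind is that a failed call leaves the device exactly as it found it, so a later retry with a corrected iotlb starts from a clean state.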
On Thu, 2023-08-03 at 16:03 +0800, Jason Wang wrote:
> On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> >
> > The mr->initialized flag is shared between the control vq and data vq
> > part of the mr init/uninit. But if the control vq and data vq get placed
> > in different ASIDs, it can happen that initializing the control vq will
> > prevent the data vq mr from being initialized.
[...]
> > +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev,
> > +                                   struct vhost_iotlb *iotlb,
> > +                                   unsigned int asid)
> > +{
> > +       if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
> > +               return 0;
> > +
> > +       return dup_iotlb(mvdev, iotlb);
>
> This worries me as conceptually, there should be no difference between
> dvq mr and cvq mr. The virtqueue should be loosely coupled with mr.
>

Are you worried by the changes in this patch or about the possibility of having

The reason for this change is that I noticed if you create one mr in one asid
you could be blocked out from creating another one in a different asid due to
mr->initialized being true. To me that seemed problematic. Is it not?

> One example is that, if we only do dup_iotlb() but not try to create
> dma mr here, we will break virtio-vdpa:
>

How will that be possible? _mlx5_vdpa_create_mr calls _mlx5_vdpa_create_dvq_mr
and _mlx5_vdpa_create_cvq_mr. The only thing that is different in this patch
is that the cvq is not protected by an init flag. My understanding was that it
would be ok to dup_iotlb again. Is it not? If not I could add an additional
initialized flag for the cvq mr.

Thanks,
Dragos

> commit 6f5312f801836e6af9bcbb0bdb44dc423e129206
> Author: Eli Cohen <elic@nvidia.com>
> Date:   Wed Jun 2 11:58:54 2021 +0300
>
>     vdpa/mlx5: Add support for running with virtio_vdpa
> [...]
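Dragos' question — whether re-running dup_iotlb() without an init flag is safe — comes down to making the cvq setup idempotent. One hypothetical way to get that property without a flag is to prune the previous copy before duplicating; the sketch below models the iotlb as a plain entry counter, so `prune_iotlb`/`dup_iotlb` here are simplified stand-ins, not the driver functions.

```c
#include <assert.h>

struct dev {
	int cvq_entries;	/* stand-in for the duplicated cvq iotlb */
};

static void prune_iotlb(struct dev *d)
{
	d->cvq_entries = 0;	/* drop any previously duplicated entries */
}

static int dup_iotlb(struct dev *d, int nentries)
{
	d->cvq_entries += nentries;
	return 0;
}

/* Idempotent cvq setup: prune first, so calling this twice leaves
 * exactly one copy of the mappings rather than accumulating duplicates. */
static int setup_cvq_iotlb(struct dev *d, int nentries)
{
	prune_iotlb(d);
	return dup_iotlb(d, nentries);
}
```

Whether the real driver needs this, an extra `initialized` flag for the cvq mr, or neither depends on whether dup_iotlb() already tolerates being called over an existing copy, which is exactly what the thread is debating.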
On 8/3/2023 1:03 AM, Jason Wang wrote:
> On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>> [...]
>> +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev,
>> +                                   struct vhost_iotlb *iotlb,
>> +                                   unsigned int asid)
>> +{
>> +       if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
>> +               return 0;
>> +
>> +       return dup_iotlb(mvdev, iotlb);
> This worries me as conceptually, there should be no difference between
> dvq mr and cvq mr. The virtqueue should be loosely coupled with mr.
>
> One example is that, if we only do dup_iotlb() but not try to create
> dma mr here, we will break virtio-vdpa:

For this case, I guess we may need another way to support virtio-vdpa 1:1
mapping rather than overloading virtio device reset semantics, see:

https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html

> Conceptually, the address mapping is not a part of the abstraction for
> a virtio device now. So resetting the memory mapping during virtio
> device reset seems wrong.

where we want to keep memory mapping intact across virtio device reset for
best live migration latency/downtime.

I wonder would it work to reset the mapping in the vhost-vdpa life cycle out
of virtio reset, say introduce a .reset_map() op to restore the 1:1 mapping
within vhost_vdpa_remove_as() right after vhost_vdpa_iotlb_unmap()? Then we
can move the iotlb reset logic there without worrying about breaking
virtio-vdpa.

Thanks,
-Siwei

> commit 6f5312f801836e6af9bcbb0bdb44dc423e129206
> Author: Eli Cohen <elic@nvidia.com>
> Date:   Wed Jun 2 11:58:54 2021 +0300
>
>     vdpa/mlx5: Add support for running with virtio_vdpa
> [...]
>
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
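Siwei's .reset_map() idea can be sketched as an optional driver op invoked when an address space is removed. This is purely a hypothetical shape of the proposal as described in the mail, not a merged kernel API at the time of this thread; the struct and function names are simplified stand-ins, and the boolean models whether the device currently holds the 1:1 DMA mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vdpa_dev {
	bool identity_map;	/* true while the 1:1 DMA MR is in place */
};

/* Hypothetical ops extension: .reset_map() restores the 1:1 mapping for
 * one ASID, decoupled from virtio device reset. */
struct vdpa_map_ops {
	int (*set_map)(struct vdpa_dev *v, unsigned int asid);
	int (*reset_map)(struct vdpa_dev *v, unsigned int asid);
};

static int toy_set_map(struct vdpa_dev *v, unsigned int asid)
{
	(void)asid;
	v->identity_map = false;	/* user translations installed */
	return 0;
}

static int toy_reset_map(struct vdpa_dev *v, unsigned int asid)
{
	(void)asid;
	v->identity_map = true;		/* back to the 1:1 DMA MR */
	return 0;
}

static const struct vdpa_map_ops ops = { toy_set_map, toy_reset_map };

/* What vhost_vdpa_remove_as() might do under this proposal: unmap the
 * ASID's iotlb entries, then let the driver restore its 1:1 mapping. */
static void remove_as(struct vdpa_dev *v, unsigned int asid)
{
	/* ...vhost_vdpa_iotlb_unmap() would run here... */
	if (ops.reset_map)
		ops.reset_map(v, asid);
}
```

The design point is that virtio-vdpa keeps working (the identity mapping comes back through the op, not through device reset), while vhost-vdpa can keep mappings intact across virtio reset for lower live-migration downtime.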
On Thu, Aug 3, 2023 at 7:40 PM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > On Thu, 2023-08-03 at 16:03 +0800, Jason Wang wrote: > > On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > > > The mr->initialized flag is shared between the control vq and data vq > > > part of the mr init/uninit. But if the control vq and data vq get placed > > > in different ASIDs, it can happen that initializing the control vq will > > > prevent the data vq mr from being initialized. > > > > > > This patch consolidates the control and data vq init parts into their > > > own init functions. The mr->initialized will now be used for the data vq > > > only. The control vq currently doesn't need a flag. > > > > > > The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got > > > split into data and control vq functions which are now also ASID aware. > > > > > > Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for > > > control and data") > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > > > Reviewed-by: Eugenio Pérez <eperezma@redhat.com> > > > Reviewed-by: Gal Pressman <gal@nvidia.com> > > > --- > > > drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + > > > drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- > > > 2 files changed, 71 insertions(+), 27 deletions(-) > > > > > > diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > index 25fc4120b618..a0420be5059f 100644 > > > --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { > > > struct list_head head; > > > unsigned long num_directs; > > > unsigned long num_klms; > > > + /* state of dvq mr */ > > > bool initialized; > > > > > > /* serialize mkey creation and destruction */ > > > diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c > > > index 03e543229791..4ae14a248a4b 100644 > > > --- a/drivers/vdpa/mlx5/core/mr.c > > > 
+++ b/drivers/vdpa/mlx5/core/mr.c > > > @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev > > > *mvdev, struct mlx5_vdpa_mr *mr > > > } > > > } > > > > > > -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > > > +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned > > > int asid) > > > +{ > > > + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > > > + return; > > > + > > > + prune_iotlb(mvdev); > > > +} > > > + > > > +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned > > > int asid) > > > { > > > struct mlx5_vdpa_mr *mr = &mvdev->mr; > > > > > > - mutex_lock(&mr->mkey_mtx); > > > + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > > > + return; > > > + > > > if (!mr->initialized) > > > - goto out; > > > + return; > > > > > > - prune_iotlb(mvdev); > > > if (mr->user_mr) > > > destroy_user_mr(mvdev, mr); > > > else > > > destroy_dma_mr(mvdev, mr); > > > > > > mr->initialized = false; > > > -out: > > > +} > > > + > > > +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned > > > int asid) > > > +{ > > > + struct mlx5_vdpa_mr *mr = &mvdev->mr; > > > + > > > + mutex_lock(&mr->mkey_mtx); > > > + > > > + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > > > + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); > > > + > > > mutex_unlock(&mr->mkey_mtx); > > > } > > > > > > -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > > > - struct vhost_iotlb *iotlb, unsigned int > > > asid) > > > +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > > > +{ > > > + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev- > > > >group2asid[MLX5_VDPA_CVQ_GROUP]); > > > + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev- > > > >group2asid[MLX5_VDPA_DATAVQ_GROUP]); > > > +} > > > + > > > +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, > > > + struct vhost_iotlb *iotlb, > > > + unsigned int asid) > > > +{ > > > + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > > > + return 0; > > > + > 
> > + return dup_iotlb(mvdev, iotlb); > > > > This worries me as conceptually, there should be no difference between > > dvq mr and cvq mr. The virtqueue should be loosely coupled with mr. > > > Are you worried by the changes in this patch or about the possibility of having > > The reason for this change is that I noticed if you create one mr in one asid > you could be blocked out from creating another one in a different asid due to > mr->initialized being true. To me that seemed problematic. Is it not? My feeling is that mr.c should be device agnostic. It needs to know nothing about the device details to work. But this patch seems to break the layer. > > > One example is that, if we only do dup_iotlb() but not try to create > > dma mr here, we will break virtio-vdpa: > > > How will that be possible? _mlx5_vdpa_create_mr calls _mlx5_vdpa_create_dvq_mr > and _mlx5_vdpa_create_cvq_mr. The only thing that is different in this patch is > that the cvq is not protected by an init flag. My understanding was that it > would be ok to dup_iotlb again. Is it not? If not I could add an additional > initialized flag for the cvq mr. You are right here. Thanks > > Thanks, > Dragos > > > commit 6f5312f801836e6af9bcbb0bdb44dc423e129206 > > Author: Eli Cohen <elic@nvidia.com> > > Date: Wed Jun 2 11:58:54 2021 +0300 > > > > vdpa/mlx5: Add support for running with virtio_vdpa > > > > In order to support running vdpa using vritio_vdpa driver, we need to > > create a different kind of MR, one that has 1:1 mapping, since the > > addresses referring to virtqueues are dma addresses. > > > > We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware > > supports the general capability umem_uid_0. The reason for that is that > > 1:1 MRs must be created with uid == 0 while virtqueue objects can be > > created with uid == 0 only when the firmware capability is on. 
> > > > If the set_map() callback is called with new translations provided > > through iotlb, the driver will destroy the 1:1 MR and create a regular > > one. > > > > Signed-off-by: Eli Cohen <elic@nvidia.com> > > Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com> > > Acked-by: Jason Wang <jasowang@redhat.com> > > > > > > > Thanks > > > > > > > > +} > > > + > > > +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev, > > > + struct vhost_iotlb *iotlb, > > > + unsigned int asid) > > > { > > > struct mlx5_vdpa_mr *mr = &mvdev->mr; > > > int err; > > > > > > - if (mr->initialized) > > > + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > > > return 0; > > > > > > - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > > > - if (iotlb) > > > - err = create_user_mr(mvdev, iotlb); > > > - else > > > - err = create_dma_mr(mvdev, mr); > > > + if (mr->initialized) > > > + return 0; > > > > > > - if (err) > > > - return err; > > > - } > > > + if (iotlb) > > > + err = create_user_mr(mvdev, iotlb); > > > + else > > > + err = create_dma_mr(mvdev, mr); > > > > > > - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { > > > - err = dup_iotlb(mvdev, iotlb); > > > - if (err) > > > - goto out_err; > > > - } > > > + if (err) > > > + return err; > > > > > > mr->initialized = true; > > > + > > > + return 0; > > > +} > > > + > > > +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > > > + struct vhost_iotlb *iotlb, unsigned int > > > asid) > > > +{ > > > + int err; > > > + > > > + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); > > > + if (err) > > > + return err; > > > + > > > + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); > > > + if (err) > > > + goto out_err; > > > + > > > return 0; > > > > > > out_err: > > > - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > > > - if (iotlb) > > > - destroy_user_mr(mvdev, mr); > > > - else > > > - destroy_dma_mr(mvdev, 
mr); > > > - } > > > + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > > > > > > return err; > > > } > > > -- > > > 2.41.0 > > > > > >
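The destroy-side split quoted above can be sketched the same way: each helper no-ops unless its group is bound to the given ASID, so tearing down one ASID leaves the other group's state intact. In the real driver both helpers run under mr->mkey_mtx; this illustrative model leaves the locking out.

```c
#include <stdbool.h>

enum { CVQ_GROUP, DVQ_GROUP, NGROUPS };

struct mr_state {
	unsigned group2asid[NGROUPS];
	bool dvq_initialized;	/* the DVQ MR exists */
	bool cvq_iotlb_present;	/* the duplicated CVQ iotlb exists */
};

static void destroy_dvq_mr(struct mr_state *m, unsigned asid)
{
	if (m->group2asid[DVQ_GROUP] != asid)
		return;
	if (!m->dvq_initialized)
		return;
	m->dvq_initialized = false;	/* destroy_user_mr()/destroy_dma_mr() */
}

static void destroy_cvq_mr(struct mr_state *m, unsigned asid)
{
	if (m->group2asid[CVQ_GROUP] != asid)
		return;
	m->cvq_iotlb_present = false;	/* prune_iotlb() */
}

/* Models mlx5_vdpa_destroy_mr_asid(); the driver holds mkey_mtx here. */
static void destroy_mr_asid(struct mr_state *m, unsigned asid)
{
	destroy_dvq_mr(m, asid);
	destroy_cvq_mr(m, asid);
}
```

Destroying the CVQ's ASID prunes only the duplicated iotlb; the DVQ MR survives until its own ASID is destroyed.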
On Fri, Aug 4, 2023 at 1:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > On 8/3/2023 1:03 AM, Jason Wang wrote: > > On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > >> The mr->initialized flag is shared between the control vq and data vq > >> part of the mr init/uninit. But if the control vq and data vq get placed > >> in different ASIDs, it can happen that initializing the control vq will > >> prevent the data vq mr from being initialized. > >> > >> This patch consolidates the control and data vq init parts into their > >> own init functions. The mr->initialized will now be used for the data vq > >> only. The control vq currently doesn't need a flag. > >> > >> The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got > >> split into data and control vq functions which are now also ASID aware. > >> > >> Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data") > >> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > >> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> > >> Reviewed-by: Gal Pressman <gal@nvidia.com> > >> --- > >> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + > >> drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- > >> 2 files changed, 71 insertions(+), 27 deletions(-) > >> > >> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >> index 25fc4120b618..a0420be5059f 100644 > >> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >> @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { > >> struct list_head head; > >> unsigned long num_directs; > >> unsigned long num_klms; > >> + /* state of dvq mr */ > >> bool initialized; > >> > >> /* serialize mkey creation and destruction */ > >> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c > >> index 03e543229791..4ae14a248a4b 100644 > >> --- a/drivers/vdpa/mlx5/core/mr.c > >> +++ b/drivers/vdpa/mlx5/core/mr.c > >> @@ -489,60 +489,103 @@ 
static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr > >> } > >> } > >> > >> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > >> +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >> +{ > >> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > >> + return; > >> + > >> + prune_iotlb(mvdev); > >> +} > >> + > >> +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >> { > >> struct mlx5_vdpa_mr *mr = &mvdev->mr; > >> > >> - mutex_lock(&mr->mkey_mtx); > >> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > >> + return; > >> + > >> if (!mr->initialized) > >> - goto out; > >> + return; > >> > >> - prune_iotlb(mvdev); > >> if (mr->user_mr) > >> destroy_user_mr(mvdev, mr); > >> else > >> destroy_dma_mr(mvdev, mr); > >> > >> mr->initialized = false; > >> -out: > >> +} > >> + > >> +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >> +{ > >> + struct mlx5_vdpa_mr *mr = &mvdev->mr; > >> + > >> + mutex_lock(&mr->mkey_mtx); > >> + > >> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > >> + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); > >> + > >> mutex_unlock(&mr->mkey_mtx); > >> } > >> > >> -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > >> - struct vhost_iotlb *iotlb, unsigned int asid) > >> +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > >> +{ > >> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]); > >> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]); > >> +} > >> + > >> +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, > >> + struct vhost_iotlb *iotlb, > >> + unsigned int asid) > >> +{ > >> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > >> + return 0; > >> + > >> + return dup_iotlb(mvdev, iotlb); > > This worries me as conceptually, there should be no difference between > > dvq mr and cvq mr. 
The virtqueue should be loosely coupled with mr. > > > > One example is that, if we only do dup_iotlb() but not try to create > > dma mr here, we will break virtio-vdpa: > For this case, I guess we may need another way to support virtio-vdpa > 1:1 mapping rather than overloading virtio device reset semantics, see: > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html > > > Conceptually, the address mapping is not a part of the abstraction for > > a virtio device now. So resetting the memory mapping during virtio > > device reset seems wrong. > > where we want to keep memory mapping intact across virtio device reset > for best live migration latency/downtime. I wonder would it work to > reset the mapping in vhost-vdpa life cycle out of virtio reset, say > introduce a .reset_map() op to restore 1:1 mapping within > vhost_vdpa_remove_as() right after vhost_vdpa_iotlb_unmap()? Then we can > move the iotlb reset logic to there without worry breaking virtio-vdpa. It looks to me we don't need a new ops. We can simply do set_map() twice or do you mean it would be faster? Thanks > > Thanks, > -Siwei > > > > > commit 6f5312f801836e6af9bcbb0bdb44dc423e129206 > > Author: Eli Cohen <elic@nvidia.com> > > Date: Wed Jun 2 11:58:54 2021 +0300 > > > > vdpa/mlx5: Add support for running with virtio_vdpa > > > > In order to support running vdpa using vritio_vdpa driver, we need to > > create a different kind of MR, one that has 1:1 mapping, since the > > addresses referring to virtqueues are dma addresses. > > > > We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware > > supports the general capability umem_uid_0. The reason for that is that > > 1:1 MRs must be created with uid == 0 while virtqueue objects can be > > created with uid == 0 only when the firmware capability is on. > > > > If the set_map() callback is called with new translations provided > > through iotlb, the driver will destroy the 1:1 MR and create a regular > > one. 
> > > > Signed-off-by: Eli Cohen <elic@nvidia.com> > > Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com> > > Acked-by: Jason Wang <jasowang@redhat.com> > > > > Thanks > > > > > >> +} > >> + > >> +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev, > >> + struct vhost_iotlb *iotlb, > >> + unsigned int asid) > >> { > >> struct mlx5_vdpa_mr *mr = &mvdev->mr; > >> int err; > >> > >> - if (mr->initialized) > >> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > >> return 0; > >> > >> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > >> - if (iotlb) > >> - err = create_user_mr(mvdev, iotlb); > >> - else > >> - err = create_dma_mr(mvdev, mr); > >> + if (mr->initialized) > >> + return 0; > >> > >> - if (err) > >> - return err; > >> - } > >> + if (iotlb) > >> + err = create_user_mr(mvdev, iotlb); > >> + else > >> + err = create_dma_mr(mvdev, mr); > >> > >> - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { > >> - err = dup_iotlb(mvdev, iotlb); > >> - if (err) > >> - goto out_err; > >> - } > >> + if (err) > >> + return err; > >> > >> mr->initialized = true; > >> + > >> + return 0; > >> +} > >> + > >> +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > >> + struct vhost_iotlb *iotlb, unsigned int asid) > >> +{ > >> + int err; > >> + > >> + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); > >> + if (err) > >> + return err; > >> + > >> + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); > >> + if (err) > >> + goto out_err; > >> + > >> return 0; > >> > >> out_err: > >> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > >> - if (iotlb) > >> - destroy_user_mr(mvdev, mr); > >> - else > >> - destroy_dma_mr(mvdev, mr); > >> - } > >> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > >> > >> return err; > >> } > >> -- > >> 2.41.0 > >> > > _______________________________________________ > > Virtualization mailing list > > 
Virtualization@lists.linux-foundation.org > > https://lists.linuxfoundation.org/mailman/listinfo/virtualization >
On Tue, 2023-08-08 at 10:57 +0800, Jason Wang wrote: > On Thu, Aug 3, 2023 at 7:40 PM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > On Thu, 2023-08-03 at 16:03 +0800, Jason Wang wrote: > > > On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > > > > > The mr->initialized flag is shared between the control vq and data vq > > > > part of the mr init/uninit. But if the control vq and data vq get placed > > > > in different ASIDs, it can happen that initializing the control vq will > > > > prevent the data vq mr from being initialized. > > > > > > > > This patch consolidates the control and data vq init parts into their > > > > own init functions. The mr->initialized will now be used for the data vq > > > > only. The control vq currently doesn't need a flag. > > > > > > > > The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got > > > > split into data and control vq functions which are now also ASID aware. > > > > > > > > Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for > > > > control and data") > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > > > > Reviewed-by: Eugenio Pérez <eperezma@redhat.com> > > > > Reviewed-by: Gal Pressman <gal@nvidia.com> > > > > --- > > > > drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + > > > > drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- > > > > 2 files changed, 71 insertions(+), 27 deletions(-) > > > > > > > > diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > > b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > > index 25fc4120b618..a0420be5059f 100644 > > > > --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > > +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > > @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { > > > > struct list_head head; > > > > unsigned long num_directs; > > > > unsigned long num_klms; > > > > + /* state of dvq mr */ > > > > bool initialized; > > > > > > > > /* serialize mkey creation and destruction */ > > > > diff --git 
a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c > > > > index 03e543229791..4ae14a248a4b 100644 > > > > --- a/drivers/vdpa/mlx5/core/mr.c > > > > +++ b/drivers/vdpa/mlx5/core/mr.c > > > > @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev > > > > *mvdev, struct mlx5_vdpa_mr *mr > > > > } > > > > } > > > > > > > > -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > > > > +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, > > > > unsigned > > > > int asid) > > > > +{ > > > > + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > > > > + return; > > > > + > > > > + prune_iotlb(mvdev); > > > > +} > > > > + > > > > +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, > > > > unsigned > > > > int asid) > > > > { > > > > struct mlx5_vdpa_mr *mr = &mvdev->mr; > > > > > > > > - mutex_lock(&mr->mkey_mtx); > > > > + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > > > > + return; > > > > + > > > > if (!mr->initialized) > > > > - goto out; > > > > + return; > > > > > > > > - prune_iotlb(mvdev); > > > > if (mr->user_mr) > > > > destroy_user_mr(mvdev, mr); > > > > else > > > > destroy_dma_mr(mvdev, mr); > > > > > > > > mr->initialized = false; > > > > -out: > > > > +} > > > > + > > > > +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, > > > > unsigned > > > > int asid) > > > > +{ > > > > + struct mlx5_vdpa_mr *mr = &mvdev->mr; > > > > + > > > > + mutex_lock(&mr->mkey_mtx); > > > > + > > > > + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > > > > + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); > > > > + > > > > mutex_unlock(&mr->mkey_mtx); > > > > } > > > > > > > > -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > > > > - struct vhost_iotlb *iotlb, unsigned int > > > > asid) > > > > +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > > > > +{ > > > > + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev- > > > > > group2asid[MLX5_VDPA_CVQ_GROUP]); > > > > + 
mlx5_vdpa_destroy_mr_asid(mvdev, mvdev- > > > > > group2asid[MLX5_VDPA_DATAVQ_GROUP]); > > > > +} > > > > + > > > > +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, > > > > + struct vhost_iotlb *iotlb, > > > > + unsigned int asid) > > > > +{ > > > > + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > > > > + return 0; > > > > + > > > > + return dup_iotlb(mvdev, iotlb); > > > > > > This worries me as conceptually, there should be no difference between > > > dvq mr and cvq mr. The virtqueue should be loosely coupled with mr. > > > > > Are you worried by the changes in this patch or about the possibility of > > having > > > > The reason for this change is that I noticed if you create one mr in one > > asid > > you could be blocked out from creating another one in a different asid due > > to > > mr->initialized being true. To me that seemed problematic. Is it not? > > My feeling is that mr.c should be device agnostic. It needs to know > nothing about the device details to work. But this patch seems to > break the layer. > But the same logic was there before (with the exception of cvq not having an init flag anymore). So what am I missing here? > > > > > One example is that, if we only do dup_iotlb() but not try to create > > > dma mr here, we will break virtio-vdpa: > > > > > How will that be possible? _mlx5_vdpa_create_mr calls > > _mlx5_vdpa_create_dvq_mr > > and _mlx5_vdpa_create_cvq_mr. The only thing that is different in this patch > > is > > that the cvq is not protected by an init flag. My understanding was that it > > would be ok to dup_iotlb again. Is it not? If not I could add an additional > > initialized flag for the cvq mr. > > You are right here. 
> > Thanks > > > > > > Thanks, > > Dragos > > > > > commit 6f5312f801836e6af9bcbb0bdb44dc423e129206 > > > Author: Eli Cohen <elic@nvidia.com> > > > Date: Wed Jun 2 11:58:54 2021 +0300 > > > > > > vdpa/mlx5: Add support for running with virtio_vdpa > > > > > > In order to support running vdpa using vritio_vdpa driver, we need to > > > create a different kind of MR, one that has 1:1 mapping, since the > > > addresses referring to virtqueues are dma addresses. > > > > > > We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware > > > supports the general capability umem_uid_0. The reason for that is > > > that > > > 1:1 MRs must be created with uid == 0 while virtqueue objects can be > > > created with uid == 0 only when the firmware capability is on. > > > > > > If the set_map() callback is called with new translations provided > > > through iotlb, the driver will destroy the 1:1 MR and create a regular > > > one. > > > > > > Signed-off-by: Eli Cohen <elic@nvidia.com> > > > Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com > > > Signed-off-by: Michael S. 
Tsirkin <mst@redhat.com> > > > Acked-by: Jason Wang <jasowang@redhat.com> > > > > > > > > > > > Thanks > > > > > > > > > > > > +} > > > > + > > > > +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev, > > > > + struct vhost_iotlb *iotlb, > > > > + unsigned int asid) > > > > { > > > > struct mlx5_vdpa_mr *mr = &mvdev->mr; > > > > int err; > > > > > > > > - if (mr->initialized) > > > > + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > > > > return 0; > > > > > > > > - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > > > > - if (iotlb) > > > > - err = create_user_mr(mvdev, iotlb); > > > > - else > > > > - err = create_dma_mr(mvdev, mr); > > > > + if (mr->initialized) > > > > + return 0; > > > > > > > > - if (err) > > > > - return err; > > > > - } > > > > + if (iotlb) > > > > + err = create_user_mr(mvdev, iotlb); > > > > + else > > > > + err = create_dma_mr(mvdev, mr); > > > > > > > > - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { > > > > - err = dup_iotlb(mvdev, iotlb); > > > > - if (err) > > > > - goto out_err; > > > > - } > > > > + if (err) > > > > + return err; > > > > > > > > mr->initialized = true; > > > > + > > > > + return 0; > > > > +} > > > > + > > > > +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > > > > + struct vhost_iotlb *iotlb, unsigned int > > > > asid) > > > > +{ > > > > + int err; > > > > + > > > > + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); > > > > + if (err) > > > > + return err; > > > > + > > > > + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); > > > > + if (err) > > > > + goto out_err; > > > > + > > > > return 0; > > > > > > > > out_err: > > > > - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > > > > - if (iotlb) > > > > - destroy_user_mr(mvdev, mr); > > > > - else > > > > - destroy_dma_mr(mvdev, mr); > > > > - } > > > > + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > > > > > > > > return err; > > > > } > > > > -- > > > > 2.41.0 > > > > > > > > > >
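The create path and its error unwinding, as restructured by the patch, can also be modeled in a few lines: build the DVQ MR first, then duplicate the CVQ iotlb, and on a dup_iotlb() failure roll back only the DVQ part. This is an illustrative sketch of the flow in _mlx5_vdpa_create_mr(), not the driver code itself.

```c
#include <stdbool.h>

enum { CVQ_GROUP, DVQ_GROUP, NGROUPS };

struct mr_state {
	unsigned group2asid[NGROUPS];
	bool dvq_initialized;
	bool cvq_iotlb_present;
};

/* Models _mlx5_vdpa_create_mr() after the patch; the dup_iotlb_fails
 * parameter stands in for a dup_iotlb() error. */
static int create_mr(struct mr_state *m, unsigned asid, bool dup_iotlb_fails)
{
	if (m->group2asid[DVQ_GROUP] == asid && !m->dvq_initialized)
		m->dvq_initialized = true;	/* create_user_mr()/create_dma_mr() */

	if (m->group2asid[CVQ_GROUP] == asid) {
		if (dup_iotlb_fails)
			goto out_err;		/* dup_iotlb() returned an error */
		m->cvq_iotlb_present = true;
	}
	return 0;

out_err:
	if (m->group2asid[DVQ_GROUP] == asid && m->dvq_initialized)
		m->dvq_initialized = false;	/* _mlx5_vdpa_destroy_dvq_mr() */
	return -1;
}
```

When both groups share an ASID, a CVQ failure leaves no half-built DVQ MR behind, and a retry starts clean.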
On 8/7/2023 8:00 PM, Jason Wang wrote: > On Fri, Aug 4, 2023 at 1:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >> >> >> On 8/3/2023 1:03 AM, Jason Wang wrote: >>> On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: >>>> The mr->initialized flag is shared between the control vq and data vq >>>> part of the mr init/uninit. But if the control vq and data vq get placed >>>> in different ASIDs, it can happen that initializing the control vq will >>>> prevent the data vq mr from being initialized. >>>> >>>> This patch consolidates the control and data vq init parts into their >>>> own init functions. The mr->initialized will now be used for the data vq >>>> only. The control vq currently doesn't need a flag. >>>> >>>> The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got >>>> split into data and control vq functions which are now also ASID aware. >>>> >>>> Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data") >>>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> >>>> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> >>>> Reviewed-by: Gal Pressman <gal@nvidia.com> >>>> --- >>>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + >>>> drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- >>>> 2 files changed, 71 insertions(+), 27 deletions(-) >>>> >>>> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>> index 25fc4120b618..a0420be5059f 100644 >>>> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>> @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { >>>> struct list_head head; >>>> unsigned long num_directs; >>>> unsigned long num_klms; >>>> + /* state of dvq mr */ >>>> bool initialized; >>>> >>>> /* serialize mkey creation and destruction */ >>>> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c >>>> index 03e543229791..4ae14a248a4b 100644 >>>> --- a/drivers/vdpa/mlx5/core/mr.c >>>> +++ 
b/drivers/vdpa/mlx5/core/mr.c >>>> @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr >>>> } >>>> } >>>> >>>> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) >>>> +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>> +{ >>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) >>>> + return; >>>> + >>>> + prune_iotlb(mvdev); >>>> +} >>>> + >>>> +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>> { >>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>> >>>> - mutex_lock(&mr->mkey_mtx); >>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) >>>> + return; >>>> + >>>> if (!mr->initialized) >>>> - goto out; >>>> + return; >>>> >>>> - prune_iotlb(mvdev); >>>> if (mr->user_mr) >>>> destroy_user_mr(mvdev, mr); >>>> else >>>> destroy_dma_mr(mvdev, mr); >>>> >>>> mr->initialized = false; >>>> -out: >>>> +} >>>> + >>>> +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>> +{ >>>> + struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>> + >>>> + mutex_lock(&mr->mkey_mtx); >>>> + >>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); >>>> + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); >>>> + >>>> mutex_unlock(&mr->mkey_mtx); >>>> } >>>> >>>> -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, >>>> - struct vhost_iotlb *iotlb, unsigned int asid) >>>> +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) >>>> +{ >>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]); >>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]); >>>> +} >>>> + >>>> +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, >>>> + struct vhost_iotlb *iotlb, >>>> + unsigned int asid) >>>> +{ >>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) >>>> + return 0; >>>> + >>>> + return dup_iotlb(mvdev, iotlb); >>> This worries me as conceptually, there should be no difference 
between >>> dvq mr and cvq mr. The virtqueue should be loosely coupled with mr. >>> >>> One example is that, if we only do dup_iotlb() but not try to create >>> dma mr here, we will break virtio-vdpa: >> For this case, I guess we may need another way to support virtio-vdpa >> 1:1 mapping rather than overloading virtio device reset semantics, see: >> >> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html >> >> > Conceptually, the address mapping is not a part of the abstraction for >> > a virtio device now. So resetting the memory mapping during virtio >> > device reset seems wrong. >> >> where we want to keep memory mapping intact across virtio device reset >> for best live migration latency/downtime. I wonder would it work to >> reset the mapping in vhost-vdpa life cycle out of virtio reset, say >> introduce a .reset_map() op to restore 1:1 mapping within >> vhost_vdpa_remove_as() right after vhost_vdpa_iotlb_unmap()? Then we can >> move the iotlb reset logic to there without worry breaking virtio-vdpa. > It looks to me we don't need a new ops. We can simply do set_map() > twice What does it mean, first set_map(0, -1ULL) with zero iotlb entry passed in to destroy all iotlb mappings previously added, and second set_map(0, -1ULL) to restore 1:1 DMA MR? But userspace (maybe a buggy one but doesn't do harm) apart from vhost-vdpa itself can do unmap twice anyway, this is supported today I think. Why would there be such an obscure distinction, or what's the benefit of treating the second .set_map() as recreating the 1:1 mapping? > or do you mean it would be faster? I think with .reset_map() we at least can avoid an indefinite latency hiccup from destroying and recreating the 1:1 mapping with the unwarranted 2nd unmap call. And .reset_map() should work with both .dma_map() and .set_map() APIs with clear semantics.
Regards, -Siwei > > Thanks > >> Thanks, >> -Siwei >> >>> commit 6f5312f801836e6af9bcbb0bdb44dc423e129206 >>> Author: Eli Cohen <elic@nvidia.com> >>> Date: Wed Jun 2 11:58:54 2021 +0300 >>> >>> vdpa/mlx5: Add support for running with virtio_vdpa >>> >>> In order to support running vdpa using vritio_vdpa driver, we need to >>> create a different kind of MR, one that has 1:1 mapping, since the >>> addresses referring to virtqueues are dma addresses. >>> >>> We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware >>> supports the general capability umem_uid_0. The reason for that is that >>> 1:1 MRs must be created with uid == 0 while virtqueue objects can be >>> created with uid == 0 only when the firmware capability is on. >>> >>> If the set_map() callback is called with new translations provided >>> through iotlb, the driver will destroy the 1:1 MR and create a regular >>> one. >>> >>> Signed-off-by: Eli Cohen <elic@nvidia.com> >>> Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com >>> Signed-off-by: Michael S. 
Tsirkin <mst@redhat.com> >>> Acked-by: Jason Wang <jasowang@redhat.com> >>> >>> Thanks >>> >>> >>>> +} >>>> + >>>> +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev, >>>> + struct vhost_iotlb *iotlb, >>>> + unsigned int asid) >>>> { >>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>> int err; >>>> >>>> - if (mr->initialized) >>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) >>>> return 0; >>>> >>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { >>>> - if (iotlb) >>>> - err = create_user_mr(mvdev, iotlb); >>>> - else >>>> - err = create_dma_mr(mvdev, mr); >>>> + if (mr->initialized) >>>> + return 0; >>>> >>>> - if (err) >>>> - return err; >>>> - } >>>> + if (iotlb) >>>> + err = create_user_mr(mvdev, iotlb); >>>> + else >>>> + err = create_dma_mr(mvdev, mr); >>>> >>>> - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { >>>> - err = dup_iotlb(mvdev, iotlb); >>>> - if (err) >>>> - goto out_err; >>>> - } >>>> + if (err) >>>> + return err; >>>> >>>> mr->initialized = true; >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, >>>> + struct vhost_iotlb *iotlb, unsigned int asid) >>>> +{ >>>> + int err; >>>> + >>>> + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); >>>> + if (err) >>>> + return err; >>>> + >>>> + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); >>>> + if (err) >>>> + goto out_err; >>>> + >>>> return 0; >>>> >>>> out_err: >>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { >>>> - if (iotlb) >>>> - destroy_user_mr(mvdev, mr); >>>> - else >>>> - destroy_dma_mr(mvdev, mr); >>>> - } >>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); >>>> >>>> return err; >>>> } >>>> -- >>>> 2.41.0 >>>> >>> _______________________________________________ >>> Virtualization mailing list >>> Virtualization@lists.linux-foundation.org >>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
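The two teardown semantics being debated above (calling set_map() twice versus a dedicated .reset_map() op) can be contrasted with a small model. This is purely hypothetical: .reset_map() is only a proposal in this thread, and the rebuild counter stands in for the expensive MR destroy/create cycle whose latency Si-Wei is concerned about.

```c
#include <stdbool.h>

enum map_state { MAP_NONE, MAP_1TO1_DMA, MAP_USER_IOTLB };

struct vdpa_model {
	enum map_state state;
	int mr_rebuilds;	/* counts expensive MR destroy/create cycles */
};

/* Models .set_map(): every call tears down and rebuilds the MR, ending
 * either on the user-supplied iotlb or back on the 1:1 DMA MR. */
static void set_map(struct vdpa_model *d, bool have_iotlb)
{
	d->state = have_iotlb ? MAP_USER_IOTLB : MAP_1TO1_DMA;
	d->mr_rebuilds++;
}

/* Models the proposed .reset_map(): restore the 1:1 DMA MR in one step,
 * and as a no-op when the device is already there. */
static void reset_map(struct vdpa_model *d)
{
	if (d->state != MAP_1TO1_DMA) {
		d->state = MAP_1TO1_DMA;
		d->mr_rebuilds++;
	}
}
```

Under this model a reset via .reset_map() costs one rebuild and is idempotent, whereas driving the restore through extra .set_map() calls pays a rebuild per call, which is the latency argument made above.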
On Tue, Aug 8, 2023 at 3:24 PM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > On Tue, 2023-08-08 at 10:57 +0800, Jason Wang wrote: > > On Thu, Aug 3, 2023 at 7:40 PM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > > > On Thu, 2023-08-03 at 16:03 +0800, Jason Wang wrote: > > > > On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > > > > > > > The mr->initialized flag is shared between the control vq and data vq > > > > > part of the mr init/uninit. But if the control vq and data vq get placed > > > > > in different ASIDs, it can happen that initializing the control vq will > > > > > prevent the data vq mr from being initialized. > > > > > > > > > > This patch consolidates the control and data vq init parts into their > > > > > own init functions. The mr->initialized will now be used for the data vq > > > > > only. The control vq currently doesn't need a flag. > > > > > > > > > > The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got > > > > > split into data and control vq functions which are now also ASID aware. 
> > > > > > > > > > Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for > > > > > control and data") > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > > > > > Reviewed-by: Eugenio Pérez <eperezma@redhat.com> > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com> > > > > > --- > > > > > drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + > > > > > drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- > > > > > 2 files changed, 71 insertions(+), 27 deletions(-) > > > > > > > > > > diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > > > b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > > > index 25fc4120b618..a0420be5059f 100644 > > > > > --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > > > +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > > > @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { > > > > > struct list_head head; > > > > > unsigned long num_directs; > > > > > unsigned long num_klms; > > > > > + /* state of dvq mr */ > > > > > bool initialized; > > > > > > > > > > /* serialize mkey creation and destruction */ > > > > > diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c > > > > > index 03e543229791..4ae14a248a4b 100644 > > > > > --- a/drivers/vdpa/mlx5/core/mr.c > > > > > +++ b/drivers/vdpa/mlx5/core/mr.c > > > > > @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev > > > > > *mvdev, struct mlx5_vdpa_mr *mr > > > > > } > > > > > } > > > > > > > > > > -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > > > > > +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, > > > > > unsigned > > > > > int asid) > > > > > +{ > > > > > + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > > > > > + return; > > > > > + > > > > > + prune_iotlb(mvdev); > > > > > +} > > > > > + > > > > > +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, > > > > > unsigned > > > > > int asid) > > > > > { > > > > > struct mlx5_vdpa_mr *mr = &mvdev->mr; > > > > > > > > > > - mutex_lock(&mr->mkey_mtx); > > > 
> > + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > > > > > + return; > > > > > + > > > > > if (!mr->initialized) > > > > > - goto out; > > > > > + return; > > > > > > > > > > - prune_iotlb(mvdev); > > > > > if (mr->user_mr) > > > > > destroy_user_mr(mvdev, mr); > > > > > else > > > > > destroy_dma_mr(mvdev, mr); > > > > > > > > > > mr->initialized = false; > > > > > -out: > > > > > +} > > > > > + > > > > > +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, > > > > > unsigned > > > > > int asid) > > > > > +{ > > > > > + struct mlx5_vdpa_mr *mr = &mvdev->mr; > > > > > + > > > > > + mutex_lock(&mr->mkey_mtx); > > > > > + > > > > > + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > > > > > + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); > > > > > + > > > > > mutex_unlock(&mr->mkey_mtx); > > > > > } > > > > > > > > > > -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > > > > > - struct vhost_iotlb *iotlb, unsigned int > > > > > asid) > > > > > +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > > > > > +{ > > > > > + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev- > > > > > > group2asid[MLX5_VDPA_CVQ_GROUP]); > > > > > + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev- > > > > > > group2asid[MLX5_VDPA_DATAVQ_GROUP]); > > > > > +} > > > > > + > > > > > +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, > > > > > + struct vhost_iotlb *iotlb, > > > > > + unsigned int asid) > > > > > +{ > > > > > + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > > > > > + return 0; > > > > > + > > > > > + return dup_iotlb(mvdev, iotlb); > > > > > > > > This worries me as conceptually, there should be no difference between > > > > dvq mr and cvq mr. The virtqueue should be loosely coupled with mr. 
> > > > > > > > > Are you worried by the changes in this patch or about the possibility of > > > having > > > > > > The reason for this change is that I noticed if you create one mr in one > > > asid > > > you could be blocked out from creating another one in a different asid due > > > to > > > mr->initialized being true. To me that seemed problematic. Is it not? > > > > My feeling is that mr.c should be device-agnostic. It needs to know > > nothing about the device details to work. But this patch seems to > > break the layer. > > > But the same logic was there before (with the exception of cvq not having an > init flag anymore). So what am I missing here? Nothing, I think you're right. I think we can have this patch go first and tweak on top by moving the CVQ-aware logic into the net-specific code. Thanks
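[Editor's note: a stand-alone model of the control flow under discussion, not the driver code. The structs and booleans below are mocks that stand in for the real mkey/iotlb state; only the ASID-gating logic mirrors the patch. It shows why splitting the init into per-group helpers fixes the bug: with the old shared `mr->initialized` flag, initializing in the CVQ's ASID first would have blocked the data-vq MR.]

```c
#include <assert.h>
#include <stdbool.h>

enum { MLX5_VDPA_DATAVQ_GROUP, MLX5_VDPA_CVQ_GROUP, MLX5_VDPA_NUMVQ_GROUPS };

/* Mock of mlx5_vdpa_dev: two booleans stand in for the real MR state. */
struct mvdev {
	unsigned int group2asid[MLX5_VDPA_NUMVQ_GROUPS];
	bool dvq_mr_initialized;	/* mr->initialized, now dvq-only */
	bool cvq_iotlb_dup;		/* stands in for dup_iotlb() state */
};

/* Post-patch shape: each helper checks only its own group's ASID. */
static int create_dvq_mr(struct mvdev *m, unsigned int asid)
{
	if (m->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid)
		return 0;
	if (m->dvq_mr_initialized)
		return 0;
	/* create_user_mr()/create_dma_mr() would run here */
	m->dvq_mr_initialized = true;
	return 0;
}

static int create_cvq_mr(struct mvdev *m, unsigned int asid)
{
	if (m->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
		return 0;
	m->cvq_iotlb_dup = true;	/* dup_iotlb() would run here */
	return 0;
}

static int create_mr(struct mvdev *m, unsigned int asid)
{
	int err = create_dvq_mr(m, asid);

	if (err)
		return err;
	return create_cvq_mr(m, asid);
}
```

With the CVQ group in ASID 1 and the data-vq group in ASID 0, a create in ASID 1 now touches only the CVQ iotlb, and a later create in ASID 0 is no longer short-circuited by the flag.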
On Wed, Aug 9, 2023 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > On 8/7/2023 8:00 PM, Jason Wang wrote: > > On Fri, Aug 4, 2023 at 1:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >> > >> > >> On 8/3/2023 1:03 AM, Jason Wang wrote: > >>> On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > >>>> The mr->initialized flag is shared between the control vq and data vq > >>>> part of the mr init/uninit. But if the control vq and data vq get placed > >>>> in different ASIDs, it can happen that initializing the control vq will > >>>> prevent the data vq mr from being initialized. > >>>> > >>>> This patch consolidates the control and data vq init parts into their > >>>> own init functions. The mr->initialized will now be used for the data vq > >>>> only. The control vq currently doesn't need a flag. > >>>> > >>>> The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got > >>>> split into data and control vq functions which are now also ASID aware. 
> >>>> > >>>> Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data") > >>>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > >>>> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> > >>>> Reviewed-by: Gal Pressman <gal@nvidia.com> > >>>> --- > >>>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + > >>>> drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- > >>>> 2 files changed, 71 insertions(+), 27 deletions(-) > >>>> > >>>> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>> index 25fc4120b618..a0420be5059f 100644 > >>>> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>> @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { > >>>> struct list_head head; > >>>> unsigned long num_directs; > >>>> unsigned long num_klms; > >>>> + /* state of dvq mr */ > >>>> bool initialized; > >>>> > >>>> /* serialize mkey creation and destruction */ > >>>> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c > >>>> index 03e543229791..4ae14a248a4b 100644 > >>>> --- a/drivers/vdpa/mlx5/core/mr.c > >>>> +++ b/drivers/vdpa/mlx5/core/mr.c > >>>> @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr > >>>> } > >>>> } > >>>> > >>>> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > >>>> +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >>>> +{ > >>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > >>>> + return; > >>>> + > >>>> + prune_iotlb(mvdev); > >>>> +} > >>>> + > >>>> +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >>>> { > >>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; > >>>> > >>>> - mutex_lock(&mr->mkey_mtx); > >>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > >>>> + return; > >>>> + > >>>> if (!mr->initialized) > >>>> - goto out; > >>>> + return; > >>>> > >>>> - prune_iotlb(mvdev); > >>>> if 
(mr->user_mr) > >>>> destroy_user_mr(mvdev, mr); > >>>> else > >>>> destroy_dma_mr(mvdev, mr); > >>>> > >>>> mr->initialized = false; > >>>> -out: > >>>> +} > >>>> + > >>>> +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >>>> +{ > >>>> + struct mlx5_vdpa_mr *mr = &mvdev->mr; > >>>> + > >>>> + mutex_lock(&mr->mkey_mtx); > >>>> + > >>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > >>>> + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); > >>>> + > >>>> mutex_unlock(&mr->mkey_mtx); > >>>> } > >>>> > >>>> -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > >>>> - struct vhost_iotlb *iotlb, unsigned int asid) > >>>> +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > >>>> +{ > >>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]); > >>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]); > >>>> +} > >>>> + > >>>> +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, > >>>> + struct vhost_iotlb *iotlb, > >>>> + unsigned int asid) > >>>> +{ > >>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > >>>> + return 0; > >>>> + > >>>> + return dup_iotlb(mvdev, iotlb); > >>> This worries me as conceptually, there should be no difference between > >>> dvq mr and cvq mr. The virtqueue should be loosely coupled with mr. > >>> > >>> One example is that, if we only do dup_iotlb() but not try to create > >>> dma mr here, we will break virtio-vdpa: > >> For this case, I guess we may need another way to support virtio-vdpa > >> 1:1 mapping rather than overloading virtio device reset semantics, see: > >> > >> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html > >> > >> > Conceptually, the address mapping is not a part of the abstraction for > >> > a virtio device now. So resetting the memory mapping during virtio > >> > device reset seems wrong. 
> >> > >> where we want to keep memory mapping intact across virtio device reset > >> for best live migration latency/downtime. I wonder would it work to > >> reset the mapping in vhost-vdpa life cycle out of virtio reset, say > >> introduce a .reset_map() op to restore 1:1 mapping within > >> vhost_vdpa_remove_as() right after vhost_vdpa_iotlb_unmap()? Then we can > >> move the iotlb reset logic to there without worry breaking virtio-vdpa. > > It looks to me we don't need a new ops. We can simply do set_map() > > twice > What does it mean, first set_map(0, -1ULL) with zero iotlb entry passed > in to destroy all iotlb mappings previously added, and second set_map(0, > -1ULL) to restore 1:1 DMA MR? But userspace (maybe a buggy one but > doesn't do harm) apart from vhost-vdpa itself can do unmap twice anyway, > this is supported today I think. Why there'll be such obscure > distinction, or what's the benefit to treat second .set_map() as > recreating 1:1 mapping? Ok, I think I missed some context. I agree that it's better to decouple memory mappings from the virtio reset. It helps to reduce the unnecessary memory transactions. It might require a new feature flag. Regarding the method of restoring to 1:1 DMA MR, it might be dangerous for (buggy) vhost-vDPA devices. Since its userspace doesn't set up any mapping, it can explore the kernel with that via CVQ? Thanks > > > or do you mean it would be faster? > I think with .reset_map() we at least can avoid indefinite latency > hiccup from destroying and recreating 1:1 mapping with the unwarranted > 2nd unmap call. And .reset_map() should work with both .dma_map() and > .set_map() APIs with clear semantics.
> > Regards, > -Siwei > > > > Thanks > > > >> Thanks, > >> -Siwei > >> > >>> commit 6f5312f801836e6af9bcbb0bdb44dc423e129206 > >>> Author: Eli Cohen <elic@nvidia.com> > >>> Date: Wed Jun 2 11:58:54 2021 +0300 > >>> > >>> vdpa/mlx5: Add support for running with virtio_vdpa > >>> > >>> In order to support running vdpa using vritio_vdpa driver, we need to > >>> create a different kind of MR, one that has 1:1 mapping, since the > >>> addresses referring to virtqueues are dma addresses. > >>> > >>> We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware > >>> supports the general capability umem_uid_0. The reason for that is that > >>> 1:1 MRs must be created with uid == 0 while virtqueue objects can be > >>> created with uid == 0 only when the firmware capability is on. > >>> > >>> If the set_map() callback is called with new translations provided > >>> through iotlb, the driver will destroy the 1:1 MR and create a regular > >>> one. > >>> > >>> Signed-off-by: Eli Cohen <elic@nvidia.com> > >>> Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com > >>> Signed-off-by: Michael S. 
Tsirkin <mst@redhat.com> > >>> Acked-by: Jason Wang <jasowang@redhat.com> > >>> > >>> Thanks > >>> > >>> > >>>> +} > >>>> + > >>>> +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev, > >>>> + struct vhost_iotlb *iotlb, > >>>> + unsigned int asid) > >>>> { > >>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; > >>>> int err; > >>>> > >>>> - if (mr->initialized) > >>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > >>>> return 0; > >>>> > >>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > >>>> - if (iotlb) > >>>> - err = create_user_mr(mvdev, iotlb); > >>>> - else > >>>> - err = create_dma_mr(mvdev, mr); > >>>> + if (mr->initialized) > >>>> + return 0; > >>>> > >>>> - if (err) > >>>> - return err; > >>>> - } > >>>> + if (iotlb) > >>>> + err = create_user_mr(mvdev, iotlb); > >>>> + else > >>>> + err = create_dma_mr(mvdev, mr); > >>>> > >>>> - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { > >>>> - err = dup_iotlb(mvdev, iotlb); > >>>> - if (err) > >>>> - goto out_err; > >>>> - } > >>>> + if (err) > >>>> + return err; > >>>> > >>>> mr->initialized = true; > >>>> + > >>>> + return 0; > >>>> +} > >>>> + > >>>> +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > >>>> + struct vhost_iotlb *iotlb, unsigned int asid) > >>>> +{ > >>>> + int err; > >>>> + > >>>> + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); > >>>> + if (err) > >>>> + return err; > >>>> + > >>>> + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); > >>>> + if (err) > >>>> + goto out_err; > >>>> + > >>>> return 0; > >>>> > >>>> out_err: > >>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > >>>> - if (iotlb) > >>>> - destroy_user_mr(mvdev, mr); > >>>> - else > >>>> - destroy_dma_mr(mvdev, mr); > >>>> - } > >>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > >>>> > >>>> return err; > >>>> } > >>>> -- > >>>> 2.41.0 > >>>> > >>> _______________________________________________ > >>> Virtualization mailing list > >>> 
Virtualization@lists.linux-foundation.org > >>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization >
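[Editor's note: a matching stand-alone model of the teardown side of the patch, again with mock state rather than real mkey/iotlb objects. It shows how the split `_mlx5_vdpa_destroy_dvq_mr()`/`_mlx5_vdpa_destroy_cvq_mr()` helpers become ASID-aware, and how `mlx5_vdpa_destroy_mr()` covers both groups by walking each group's ASID.]

```c
#include <assert.h>
#include <stdbool.h>

enum { DVQ_GROUP, CVQ_GROUP, NGROUPS };

/* Mock device state: one flag per resource the real driver tears down. */
struct mv {
	unsigned int group2asid[NGROUPS];
	bool dvq_mr;	/* dvq MR exists (mr->initialized) */
	bool cvq_iotlb;	/* cvq iotlb copy exists */
};

static void destroy_dvq_mr(struct mv *m, unsigned int asid)
{
	if (m->group2asid[DVQ_GROUP] != asid || !m->dvq_mr)
		return;
	/* destroy_user_mr()/destroy_dma_mr() would run here */
	m->dvq_mr = false;
}

static void destroy_cvq_mr(struct mv *m, unsigned int asid)
{
	if (m->group2asid[CVQ_GROUP] != asid)
		return;
	m->cvq_iotlb = false;	/* prune_iotlb() would run here */
}

static void destroy_mr_asid(struct mv *m, unsigned int asid)
{
	destroy_dvq_mr(m, asid);
	destroy_cvq_mr(m, asid);
}

/* Full teardown walks both groups' ASIDs, as in the patch. */
static void destroy_mr(struct mv *m)
{
	destroy_mr_asid(m, m->group2asid[CVQ_GROUP]);
	destroy_mr_asid(m, m->group2asid[DVQ_GROUP]);
}
```

Destroying for one ASID leaves the other group's resources alone, which is the property the per-ASID split buys.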
On 8/8/2023 11:52 PM, Jason Wang wrote: > On Wed, Aug 9, 2023 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >> >> >> On 8/7/2023 8:00 PM, Jason Wang wrote: >>> On Fri, Aug 4, 2023 at 1:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >>>> >>>> On 8/3/2023 1:03 AM, Jason Wang wrote: >>>>> On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: >>>>>> The mr->initialized flag is shared between the control vq and data vq >>>>>> part of the mr init/uninit. But if the control vq and data vq get placed >>>>>> in different ASIDs, it can happen that initializing the control vq will >>>>>> prevent the data vq mr from being initialized. >>>>>> >>>>>> This patch consolidates the control and data vq init parts into their >>>>>> own init functions. The mr->initialized will now be used for the data vq >>>>>> only. The control vq currently doesn't need a flag. >>>>>> >>>>>> The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got >>>>>> split into data and control vq functions which are now also ASID aware. 
>>>>>> >>>>>> Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data") >>>>>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> >>>>>> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> >>>>>> Reviewed-by: Gal Pressman <gal@nvidia.com> >>>>>> --- >>>>>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + >>>>>> drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- >>>>>> 2 files changed, 71 insertions(+), 27 deletions(-) >>>>>> >>>>>> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>>>> index 25fc4120b618..a0420be5059f 100644 >>>>>> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>>>> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>>>> @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { >>>>>> struct list_head head; >>>>>> unsigned long num_directs; >>>>>> unsigned long num_klms; >>>>>> + /* state of dvq mr */ >>>>>> bool initialized; >>>>>> >>>>>> /* serialize mkey creation and destruction */ >>>>>> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c >>>>>> index 03e543229791..4ae14a248a4b 100644 >>>>>> --- a/drivers/vdpa/mlx5/core/mr.c >>>>>> +++ b/drivers/vdpa/mlx5/core/mr.c >>>>>> @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr >>>>>> } >>>>>> } >>>>>> >>>>>> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) >>>>>> +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>>>> +{ >>>>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) >>>>>> + return; >>>>>> + >>>>>> + prune_iotlb(mvdev); >>>>>> +} >>>>>> + >>>>>> +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>>>> { >>>>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>>>> >>>>>> - mutex_lock(&mr->mkey_mtx); >>>>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) >>>>>> + return; >>>>>> + >>>>>> if (!mr->initialized) >>>>>> - goto out; >>>>>> + return; >>>>>> >>>>>> - prune_iotlb(mvdev); >>>>>> if 
(mr->user_mr) >>>>>> destroy_user_mr(mvdev, mr); >>>>>> else >>>>>> destroy_dma_mr(mvdev, mr); >>>>>> >>>>>> mr->initialized = false; >>>>>> -out: >>>>>> +} >>>>>> + >>>>>> +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>>>> +{ >>>>>> + struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>>>> + >>>>>> + mutex_lock(&mr->mkey_mtx); >>>>>> + >>>>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); >>>>>> + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); >>>>>> + >>>>>> mutex_unlock(&mr->mkey_mtx); >>>>>> } >>>>>> >>>>>> -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, >>>>>> - struct vhost_iotlb *iotlb, unsigned int asid) >>>>>> +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) >>>>>> +{ >>>>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]); >>>>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]); >>>>>> +} >>>>>> + >>>>>> +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, >>>>>> + struct vhost_iotlb *iotlb, >>>>>> + unsigned int asid) >>>>>> +{ >>>>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) >>>>>> + return 0; >>>>>> + >>>>>> + return dup_iotlb(mvdev, iotlb); >>>>> This worries me as conceptually, there should be no difference between >>>>> dvq mr and cvq mr. The virtqueue should be loosely coupled with mr. >>>>> >>>>> One example is that, if we only do dup_iotlb() but not try to create >>>>> dma mr here, we will break virtio-vdpa: >>>> For this case, I guess we may need another way to support virtio-vdpa >>>> 1:1 mapping rather than overloading virtio device reset semantics, see: >>>> >>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html >>>> >>>> > Conceptually, the address mapping is not a part of the abstraction for >>>> > a virtio device now. So resetting the memory mapping during virtio >>>> > device reset seems wrong. 
>>>> >>>> where we want to keep memory mapping intact across virtio device reset >>>> for best live migration latency/downtime. I wonder would it work to >>>> reset the mapping in vhost-vdpa life cycle out of virtio reset, say >>>> introduce a .reset_map() op to restore 1:1 mapping within >>>> vhost_vdpa_remove_as() right after vhost_vdpa_iotlb_unmap()? Then we can >>>> move the iotlb reset logic to there without worry breaking virtio-vdpa. >>> It looks to me we don't need a new ops. We can simply do set_map() >>> twice >> What does it mean, first set_map(0, -1ULL) with zero iotlb entry passed >> in to destroy all iotlb mappings previously added, and second set_map(0, >> -1ULL) to restore 1:1 DMA MR? But userspace (maybe a buggy one but >> doesn't do harm) apart from vhost-vdpa itself can do unmap twice anyway, >> this is supported today I think. Why there'll be such obscure >> distinction, or what's the benefit to treat second .set_map() as >> recreating 1:1 mapping? > Ok, I think I miss some context. I agree that it's better to decouple > memory mappings from the virtio reset. It helps to reduce the > unnecessary memory transactions. It might require a new feature flag. This I agreed. AFAICT QEMU would need to check this new feature flag to make sure memory mappings are kept intact across reset; otherwise, to avoid breaking older kernels, it has to recreate all the mappings after reset the way it is done today. > Regarding the method of restoring to 1:1 DMA MR, it might be dangerous > for (buggy) vhost-vDPA devices. Since its userspace doesn't set up any > mapping it can explore the kernel with that via CVQ? Not sure I understand this proposal. The 1:1 DMA MR is first created at vdpa device add, and gets destroyed implicitly when the first .set_map or .dma_map call is made, which is only possible after the vhost-vdpa module is loaded and bound to vdpa devices. 
Naturally the DMA MR should be restored to how it was before when the vhost-vdpa module is unloaded, or if anything the 1:1 DMA MR creation can be deferred until virtio-vdpa is probed and bound to devices. Today vhost_vdpa_remove_as(), as part of the vhost-vdpa unload code path, already gets all mappings purged through vhost_vdpa_iotlb_unmap(0, -1ULL), and it should be pretty safe to restore the DMA MR via .reset_map() right after. Not sure what's the concern here with a buggy vhost-vdpa device? Note that when vhost-vdpa is being unloaded there's even no chance to probe the kernel through CVQ, as the virtio feature is not even negotiated at that point. And it is even trickier to wait for a CVQ response from the device indefinitely when trying to unload a module. Regards, -Siwei > > Thanks > >>> or do you mean it would be faster? >> I think with .reset_map() we at least can avoid indefinite latency >> hiccup from destroying and recreating 1:1 mapping with the unwarranted >> 2nd unmap call. And .reset_map() should work with both .dma_map() and >> .set_map() APIs with clear semantics. >> >> Regards, >> -Siwei >>> Thanks >>> >>>> Thanks, >>>> -Siwei >>>> >>>>> commit 6f5312f801836e6af9bcbb0bdb44dc423e129206 >>>>> Author: Eli Cohen <elic@nvidia.com> >>>>> Date: Wed Jun 2 11:58:54 2021 +0300 >>>>> >>>>> vdpa/mlx5: Add support for running with virtio_vdpa >>>>> >>>>> In order to support running vdpa using vritio_vdpa driver, we need to >>>>> create a different kind of MR, one that has 1:1 mapping, since the >>>>> addresses referring to virtqueues are dma addresses. >>>>> >>>>> We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware >>>>> supports the general capability umem_uid_0. The reason for that is that >>>>> 1:1 MRs must be created with uid == 0 while virtqueue objects can be >>>>> created with uid == 0 only when the firmware capability is on. 
>>>>> >>>>> If the set_map() callback is called with new translations provided >>>>> through iotlb, the driver will destroy the 1:1 MR and create a regular >>>>> one. >>>>> >>>>> Signed-off-by: Eli Cohen <elic@nvidia.com> >>>>> Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com >>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> >>>>> Acked-by: Jason Wang <jasowang@redhat.com> >>>>> >>>>> Thanks >>>>> >>>>> >>>>>> +} >>>>>> + >>>>>> +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev, >>>>>> + struct vhost_iotlb *iotlb, >>>>>> + unsigned int asid) >>>>>> { >>>>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>>>> int err; >>>>>> >>>>>> - if (mr->initialized) >>>>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) >>>>>> return 0; >>>>>> >>>>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { >>>>>> - if (iotlb) >>>>>> - err = create_user_mr(mvdev, iotlb); >>>>>> - else >>>>>> - err = create_dma_mr(mvdev, mr); >>>>>> + if (mr->initialized) >>>>>> + return 0; >>>>>> >>>>>> - if (err) >>>>>> - return err; >>>>>> - } >>>>>> + if (iotlb) >>>>>> + err = create_user_mr(mvdev, iotlb); >>>>>> + else >>>>>> + err = create_dma_mr(mvdev, mr); >>>>>> >>>>>> - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { >>>>>> - err = dup_iotlb(mvdev, iotlb); >>>>>> - if (err) >>>>>> - goto out_err; >>>>>> - } >>>>>> + if (err) >>>>>> + return err; >>>>>> >>>>>> mr->initialized = true; >>>>>> + >>>>>> + return 0; >>>>>> +} >>>>>> + >>>>>> +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, >>>>>> + struct vhost_iotlb *iotlb, unsigned int asid) >>>>>> +{ >>>>>> + int err; >>>>>> + >>>>>> + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); >>>>>> + if (err) >>>>>> + return err; >>>>>> + >>>>>> + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); >>>>>> + if (err) >>>>>> + goto out_err; >>>>>> + >>>>>> return 0; >>>>>> >>>>>> out_err: >>>>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { >>>>>> - if (iotlb) >>>>>> 
- destroy_user_mr(mvdev, mr); >>>>>> - else >>>>>> - destroy_dma_mr(mvdev, mr); >>>>>> - } >>>>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); >>>>>> >>>>>> return err; >>>>>> } >>>>>> -- >>>>>> 2.41.0 >>>>>> >>>>> _______________________________________________ >>>>> Virtualization mailing list >>>>> Virtualization@lists.linux-foundation.org >>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
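[Editor's note: the `.reset_map()` op debated above is only a proposal in this thread; nothing below exists upstream at the time of these messages. This is a hedged sketch, with mock types and invented names, of how such an op might slot into the teardown path instead of a second `set_map(0, -1ULL)` call: the first user `.set_map()` implicitly tears down the 1:1 DMA MR, and `.reset_map()` restores it in one step when the address space is removed.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device state. */
struct vdpa_dev {
	bool has_1to1_dma_mr;	/* 1:1 DMA MR present (virtio-vdpa style) */
	int  nr_user_maps;	/* user mappings installed via .set_map() */
};

/* Hypothetical ops table; .reset_map is the proposed new callback. */
struct vdpa_map_ops {
	int (*set_map)(struct vdpa_dev *d, int nr_entries);
	int (*reset_map)(struct vdpa_dev *d);
};

static int mock_set_map(struct vdpa_dev *d, int nr_entries)
{
	d->has_1to1_dma_mr = false;	/* first .set_map() destroys the 1:1 MR */
	d->nr_user_maps = nr_entries;
	return 0;
}

static int mock_reset_map(struct vdpa_dev *d)
{
	d->nr_user_maps = 0;
	d->has_1to1_dma_mr = true;	/* restore 1:1 DMA MR in a single call */
	return 0;
}

/* Rough analogue of vhost_vdpa_remove_as(): purge mappings, then restore. */
static void remove_as(struct vdpa_dev *d, const struct vdpa_map_ops *ops)
{
	if (ops->reset_map)
		ops->reset_map(d);	/* instead of a second set_map(0, -1ULL) */
	else
		ops->set_map(d, 0);	/* older kernels: empty map, no 1:1 MR */
}
```

The point of the op, per the thread, is clear semantics: teardown plus 1:1 restore in one call, usable by both `.set_map()` and `.dma_map()` style drivers, without overloading virtio device reset.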
On Thu, Aug 10, 2023 at 8:40 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > On 8/8/2023 11:52 PM, Jason Wang wrote: > > On Wed, Aug 9, 2023 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >> > >> > >> On 8/7/2023 8:00 PM, Jason Wang wrote: > >>> On Fri, Aug 4, 2023 at 1:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >>>> > >>>> On 8/3/2023 1:03 AM, Jason Wang wrote: > >>>>> On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > >>>>>> The mr->initialized flag is shared between the control vq and data vq > >>>>>> part of the mr init/uninit. But if the control vq and data vq get placed > >>>>>> in different ASIDs, it can happen that initializing the control vq will > >>>>>> prevent the data vq mr from being initialized. > >>>>>> > >>>>>> This patch consolidates the control and data vq init parts into their > >>>>>> own init functions. The mr->initialized will now be used for the data vq > >>>>>> only. The control vq currently doesn't need a flag. > >>>>>> > >>>>>> The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got > >>>>>> split into data and control vq functions which are now also ASID aware. 
> >>>>>> > >>>>>> Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data") > >>>>>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > >>>>>> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> > >>>>>> Reviewed-by: Gal Pressman <gal@nvidia.com> > >>>>>> --- > >>>>>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + > >>>>>> drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- > >>>>>> 2 files changed, 71 insertions(+), 27 deletions(-) > >>>>>> > >>>>>> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>>>> index 25fc4120b618..a0420be5059f 100644 > >>>>>> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>>>> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>>>> @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { > >>>>>> struct list_head head; > >>>>>> unsigned long num_directs; > >>>>>> unsigned long num_klms; > >>>>>> + /* state of dvq mr */ > >>>>>> bool initialized; > >>>>>> > >>>>>> /* serialize mkey creation and destruction */ > >>>>>> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c > >>>>>> index 03e543229791..4ae14a248a4b 100644 > >>>>>> --- a/drivers/vdpa/mlx5/core/mr.c > >>>>>> +++ b/drivers/vdpa/mlx5/core/mr.c > >>>>>> @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr > >>>>>> } > >>>>>> } > >>>>>> > >>>>>> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > >>>>>> +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >>>>>> +{ > >>>>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > >>>>>> + return; > >>>>>> + > >>>>>> + prune_iotlb(mvdev); > >>>>>> +} > >>>>>> + > >>>>>> +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >>>>>> { > >>>>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; > >>>>>> > >>>>>> - mutex_lock(&mr->mkey_mtx); > >>>>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > >>>>>> + return; > >>>>>> + > >>>>>> if 
(!mr->initialized) > >>>>>> - goto out; > >>>>>> + return; > >>>>>> > >>>>>> - prune_iotlb(mvdev); > >>>>>> if (mr->user_mr) > >>>>>> destroy_user_mr(mvdev, mr); > >>>>>> else > >>>>>> destroy_dma_mr(mvdev, mr); > >>>>>> > >>>>>> mr->initialized = false; > >>>>>> -out: > >>>>>> +} > >>>>>> + > >>>>>> +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >>>>>> +{ > >>>>>> + struct mlx5_vdpa_mr *mr = &mvdev->mr; > >>>>>> + > >>>>>> + mutex_lock(&mr->mkey_mtx); > >>>>>> + > >>>>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > >>>>>> + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); > >>>>>> + > >>>>>> mutex_unlock(&mr->mkey_mtx); > >>>>>> } > >>>>>> > >>>>>> -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > >>>>>> - struct vhost_iotlb *iotlb, unsigned int asid) > >>>>>> +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > >>>>>> +{ > >>>>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]); > >>>>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]); > >>>>>> +} > >>>>>> + > >>>>>> +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, > >>>>>> + struct vhost_iotlb *iotlb, > >>>>>> + unsigned int asid) > >>>>>> +{ > >>>>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > >>>>>> + return 0; > >>>>>> + > >>>>>> + return dup_iotlb(mvdev, iotlb); > >>>>> This worries me as conceptually, there should be no difference between > >>>>> dvq mr and cvq mr. The virtqueue should be loosely coupled with mr. 
> >>>>> > >>>>> One example is that, if we only do dup_iotlb() but not try to create > >>>>> dma mr here, we will break virtio-vdpa: > >>>> For this case, I guess we may need another way to support virtio-vdpa > >>>> 1:1 mapping rather than overloading virtio device reset semantics, see: > >>>> > >>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html > >>>> > >>>> > Conceptually, the address mapping is not a part of the abstraction for > >>>> > a virtio device now. So resetting the memory mapping during virtio > >>>> > device reset seems wrong. > >>>> > >>>> where we want to keep memory mapping intact across virtio device reset > >>>> for best live migration latency/downtime. I wonder would it work to > >>>> reset the mapping in vhost-vdpa life cycle out of virtio reset, say > >>>> introduce a .reset_map() op to restore 1:1 mapping within > >>>> vhost_vdpa_remove_as() right after vhost_vdpa_iotlb_unmap()? Then we can > >>>> move the iotlb reset logic to there without worry breaking virtio-vdpa. > >>> It looks to me we don't need a new ops. We can simply do set_map() > >>> twice > >> What does it mean, first set_map(0, -1ULL) with zero iotlb entry passed > >> in to destroy all iotlb mappings previously added, and second set_map(0, > >> -1ULL) to restore 1:1 DMA MR? But userspace (maybe a buggy one but > >> doesn't do harm) apart from vhost-vdpa itself can do unmap twice anyway, > >> this is supported today I think. Why there'll be such obscure > >> distinction, or what's the benefit to treat second .set_map() as > >> recreating 1:1 mapping? > > Ok, I think I miss some context. I agree that it's better to decouple > > memory mappings from the virtio reset. It helps to reduce the > > unnecessary memory transactions. It might require a new feature flag. > This I agreed. 
AFAICT QEMU would need to check this new feature flag to > make sure memory mappings are kept intact across reset, otherwise for > the sake of not breaking older kernels it has to recreate all the > mappings after reset like how it is done today. > > > Regarding the method of restoring to 1:1 DMA MR, it might be dangerous > > for (buggy) vhost-vDPA devices. Since its userspace doesn't set up any > > mapping it can explore the kernel with that via CVQ? > Not sure I understand this proposal. The 1:1 DMA MR is first created at > vdpa device add, and gets destroyed implicitly when the first .set_map > or .dma_map call is made, which is only possible after the vhost-vdpa > module is loaded and bound to vdpa devices. So what happens if there's a buggy userspace that doesn't do any IOTLB setup? Thanks > Naturally the DMA MR should > be restored to how it was before when vhost-vdpa module is unloaded, or > if anything the 1:1 DMA MR creation can be deferred to until virtio-vdpa > is probed and bound to devices. Today vhost_vdpa_remove_as() as part of > the vhost-vdpa unload code path already gets all mappings purged through > vhost_vdpa_iotlb_unmap(0, -1ULL), and it should be pretty safe to > restore DMA MR via .reset_map() right after. Not sure what's the concern > here with buggy vhost-vdpa device? > > Noted when vhost-vdpa is being unloaded there's even no chance to probe > kernel through CVQ, as the virtio feature is not even negotiated at that > point. And it is even trickier to wait for CVQ response from device > indefinitely when trying to unload a module. > > Regards, > -Siwei > > > > Thanks > > > >>> or do you mean it would be faster? > >> I think with .reset_map() we at least can avoid indefinite latency > >> hiccup from destroying and recreating 1:1 mapping with the unwarranted > >> 2nd unmap call. And .reset_map() should work with both .dma_map() and > >> .set_map() APIs with clear semantics.
> >> > >> Regards, > >> -Siwei > >>> Thanks > >>> > >>>> Thanks, > >>>> -Siwei > >>>> > >>>>> commit 6f5312f801836e6af9bcbb0bdb44dc423e129206 > >>>>> Author: Eli Cohen <elic@nvidia.com> > >>>>> Date: Wed Jun 2 11:58:54 2021 +0300 > >>>>> > >>>>> vdpa/mlx5: Add support for running with virtio_vdpa > >>>>> > >>>>> In order to support running vdpa using vritio_vdpa driver, we need to > >>>>> create a different kind of MR, one that has 1:1 mapping, since the > >>>>> addresses referring to virtqueues are dma addresses. > >>>>> > >>>>> We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware > >>>>> supports the general capability umem_uid_0. The reason for that is that > >>>>> 1:1 MRs must be created with uid == 0 while virtqueue objects can be > >>>>> created with uid == 0 only when the firmware capability is on. > >>>>> > >>>>> If the set_map() callback is called with new translations provided > >>>>> through iotlb, the driver will destroy the 1:1 MR and create a regular > >>>>> one. > >>>>> > >>>>> Signed-off-by: Eli Cohen <elic@nvidia.com> > >>>>> Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com > >>>>> Signed-off-by: Michael S. 
Tsirkin <mst@redhat.com> > >>>>> Acked-by: Jason Wang <jasowang@redhat.com> > >>>>> > >>>>> Thanks > >>>>> > >>>>> > >>>>>> +} > >>>>>> + > >>>>>> +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev, > >>>>>> + struct vhost_iotlb *iotlb, > >>>>>> + unsigned int asid) > >>>>>> { > >>>>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; > >>>>>> int err; > >>>>>> > >>>>>> - if (mr->initialized) > >>>>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > >>>>>> return 0; > >>>>>> > >>>>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > >>>>>> - if (iotlb) > >>>>>> - err = create_user_mr(mvdev, iotlb); > >>>>>> - else > >>>>>> - err = create_dma_mr(mvdev, mr); > >>>>>> + if (mr->initialized) > >>>>>> + return 0; > >>>>>> > >>>>>> - if (err) > >>>>>> - return err; > >>>>>> - } > >>>>>> + if (iotlb) > >>>>>> + err = create_user_mr(mvdev, iotlb); > >>>>>> + else > >>>>>> + err = create_dma_mr(mvdev, mr); > >>>>>> > >>>>>> - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { > >>>>>> - err = dup_iotlb(mvdev, iotlb); > >>>>>> - if (err) > >>>>>> - goto out_err; > >>>>>> - } > >>>>>> + if (err) > >>>>>> + return err; > >>>>>> > >>>>>> mr->initialized = true; > >>>>>> + > >>>>>> + return 0; > >>>>>> +} > >>>>>> + > >>>>>> +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > >>>>>> + struct vhost_iotlb *iotlb, unsigned int asid) > >>>>>> +{ > >>>>>> + int err; > >>>>>> + > >>>>>> + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); > >>>>>> + if (err) > >>>>>> + return err; > >>>>>> + > >>>>>> + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); > >>>>>> + if (err) > >>>>>> + goto out_err; > >>>>>> + > >>>>>> return 0; > >>>>>> > >>>>>> out_err: > >>>>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > >>>>>> - if (iotlb) > >>>>>> - destroy_user_mr(mvdev, mr); > >>>>>> - else > >>>>>> - destroy_dma_mr(mvdev, mr); > >>>>>> - } > >>>>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > >>>>>> > >>>>>> return err; > >>>>>> } > >>>>>> 
-- > >>>>>> 2.41.0 > >>>>>> > >>>>> _______________________________________________ > >>>>> Virtualization mailing list > >>>>> Virtualization@lists.linux-foundation.org > >>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization >
On 8/9/2023 8:10 PM, Jason Wang wrote: > On Thu, Aug 10, 2023 at 8:40 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >> >> >> On 8/8/2023 11:52 PM, Jason Wang wrote: >>> On Wed, Aug 9, 2023 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >>>> >>>> On 8/7/2023 8:00 PM, Jason Wang wrote: >>>>> On Fri, Aug 4, 2023 at 1:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >>>>>> On 8/3/2023 1:03 AM, Jason Wang wrote: >>>>>>> On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: >>>>>>>> The mr->initialized flag is shared between the control vq and data vq >>>>>>>> part of the mr init/uninit. But if the control vq and data vq get placed >>>>>>>> in different ASIDs, it can happen that initializing the control vq will >>>>>>>> prevent the data vq mr from being initialized. >>>>>>>> >>>>>>>> This patch consolidates the control and data vq init parts into their >>>>>>>> own init functions. The mr->initialized will now be used for the data vq >>>>>>>> only. The control vq currently doesn't need a flag. >>>>>>>> >>>>>>>> The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got >>>>>>>> split into data and control vq functions which are now also ASID aware. 
>>>>>>>> >>>>>>>> Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data") >>>>>>>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> >>>>>>>> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> >>>>>>>> Reviewed-by: Gal Pressman <gal@nvidia.com> >>>>>>>> --- >>>>>>>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + >>>>>>>> drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- >>>>>>>> 2 files changed, 71 insertions(+), 27 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>>>>>> index 25fc4120b618..a0420be5059f 100644 >>>>>>>> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>>>>>> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>>>>>> @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { >>>>>>>> struct list_head head; >>>>>>>> unsigned long num_directs; >>>>>>>> unsigned long num_klms; >>>>>>>> + /* state of dvq mr */ >>>>>>>> bool initialized; >>>>>>>> >>>>>>>> /* serialize mkey creation and destruction */ >>>>>>>> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c >>>>>>>> index 03e543229791..4ae14a248a4b 100644 >>>>>>>> --- a/drivers/vdpa/mlx5/core/mr.c >>>>>>>> +++ b/drivers/vdpa/mlx5/core/mr.c >>>>>>>> @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) >>>>>>>> +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>>>>>> +{ >>>>>>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) >>>>>>>> + return; >>>>>>>> + >>>>>>>> + prune_iotlb(mvdev); >>>>>>>> +} >>>>>>>> + >>>>>>>> +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>>>>>> { >>>>>>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>>>>>> >>>>>>>> - mutex_lock(&mr->mkey_mtx); >>>>>>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) >>>>>>>> + return; >>>>>>>> + >>>>>>>> if 
(!mr->initialized) >>>>>>>> - goto out; >>>>>>>> + return; >>>>>>>> >>>>>>>> - prune_iotlb(mvdev); >>>>>>>> if (mr->user_mr) >>>>>>>> destroy_user_mr(mvdev, mr); >>>>>>>> else >>>>>>>> destroy_dma_mr(mvdev, mr); >>>>>>>> >>>>>>>> mr->initialized = false; >>>>>>>> -out: >>>>>>>> +} >>>>>>>> + >>>>>>>> +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>>>>>> +{ >>>>>>>> + struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>>>>>> + >>>>>>>> + mutex_lock(&mr->mkey_mtx); >>>>>>>> + >>>>>>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); >>>>>>>> + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); >>>>>>>> + >>>>>>>> mutex_unlock(&mr->mkey_mtx); >>>>>>>> } >>>>>>>> >>>>>>>> -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, >>>>>>>> - struct vhost_iotlb *iotlb, unsigned int asid) >>>>>>>> +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) >>>>>>>> +{ >>>>>>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]); >>>>>>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]); >>>>>>>> +} >>>>>>>> + >>>>>>>> +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, >>>>>>>> + struct vhost_iotlb *iotlb, >>>>>>>> + unsigned int asid) >>>>>>>> +{ >>>>>>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) >>>>>>>> + return 0; >>>>>>>> + >>>>>>>> + return dup_iotlb(mvdev, iotlb); >>>>>>> This worries me as conceptually, there should be no difference between >>>>>>> dvq mr and cvq mr. The virtqueue should be loosely coupled with mr. 
>>>>>>> >>>>>>> One example is that, if we only do dup_iotlb() but not try to create >>>>>>> dma mr here, we will break virtio-vdpa: >>>>>> For this case, I guess we may need another way to support virtio-vdpa >>>>>> 1:1 mapping rather than overloading virtio device reset semantics, see: >>>>>> >>>>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html >>>>>> >>>>>> > Conceptually, the address mapping is not a part of the abstraction for >>>>>> > a virtio device now. So resetting the memory mapping during virtio >>>>>> > device reset seems wrong. >>>>>> >>>>>> where we want to keep memory mapping intact across virtio device reset >>>>>> for best live migration latency/downtime. I wonder would it work to >>>>>> reset the mapping in vhost-vdpa life cycle out of virtio reset, say >>>>>> introduce a .reset_map() op to restore 1:1 mapping within >>>>>> vhost_vdpa_remove_as() right after vhost_vdpa_iotlb_unmap()? Then we can >>>>>> move the iotlb reset logic to there without worry breaking virtio-vdpa. >>>>> It looks to me we don't need a new ops. We can simply do set_map() >>>>> twice >>>> What does it mean, first set_map(0, -1ULL) with zero iotlb entry passed >>>> in to destroy all iotlb mappings previously added, and second set_map(0, >>>> -1ULL) to restore 1:1 DMA MR? But userspace (maybe a buggy one but >>>> doesn't do harm) apart from vhost-vdpa itself can do unmap twice anyway, >>>> this is supported today I think. Why there'll be such obscure >>>> distinction, or what's the benefit to treat second .set_map() as >>>> recreating 1:1 mapping? >>> Ok, I think I miss some context. I agree that it's better to decouple >>> memory mappings from the virtio reset. It helps to reduce the >>> unnecessary memory transactions. It might require a new feature flag. >> This I agreed. 
AFAICT QEMU would need to check this new feature flag to >> make sure memory mappings are kept intact across reset, otherwise for >> the sake of not breaking older kernels it has to recreate all the >> mappings after reset like how it is done today. >> >>> Regarding the method of restoring to 1:1 DMA MR, it might be dangerous >>> for (buggy) vhost-vDPA devices. Since its userspace doesn't set up any >>> mapping it can explore the kernel with that via CVQ? >> Not sure I understand this proposal. The 1:1 DMA MR is first created at >> vdpa device add, and gets destroyed implicitly when the first .set_map >> or .dma_map call is made, which is only possible after the vhost-vdpa >> module is loaded and bound to vdpa devices. > So what happens if there's a buggy userspace that doesn't do any IOTLB setup? Then the parent driver doesn't do anything in .reset_map() - as the DMA MR is still there. The parent driver should be able to tell whether the DMA MR has been destroyed by checking its internal state. -Siwei > > Thanks > >> Naturally the DMA MR should >> be restored to how it was before when vhost-vdpa module is unloaded, or >> if anything the 1:1 DMA MR creation can be deferred to until virtio-vdpa >> is probed and bound to devices. Today vhost_vdpa_remove_as() as part of >> the vhost-vdpa unload code path already gets all mappings purged through >> vhost_vdpa_iotlb_unmap(0, -1ULL), and it should be pretty safe to >> restore DMA MR via .reset_map() right after. Not sure what's the concern >> here with buggy vhost-vdpa device? >> >> Noted when vhost-vdpa is being unloaded there's even no chance to probe >> kernel through CVQ, as the virtio feature is not even negotiated at that >> point. And it is even trickier to wait for CVQ response from device >> indefinitely when trying to unload a module. >> >> Regards, >> -Siwei >>> Thanks >>> >>>>> or do you mean it would be faster?
>>>> I think with .reset_map() we at least can avoid indefinite latency >>>> hiccup from destroying and recreating 1:1 mapping with the unwarranted >>>> 2nd unmap call. And .reset_map() should work with both .dma_map() and >>>> .set_map() APIs with clear semantics. >>>> >>>> Regards, >>>> -Siwei >>>>> Thanks >>>>> >>>>>> Thanks, >>>>>> -Siwei >>>>>> >>>>>>> commit 6f5312f801836e6af9bcbb0bdb44dc423e129206 >>>>>>> Author: Eli Cohen <elic@nvidia.com> >>>>>>> Date: Wed Jun 2 11:58:54 2021 +0300 >>>>>>> >>>>>>> vdpa/mlx5: Add support for running with virtio_vdpa >>>>>>> >>>>>>> In order to support running vdpa using vritio_vdpa driver, we need to >>>>>>> create a different kind of MR, one that has 1:1 mapping, since the >>>>>>> addresses referring to virtqueues are dma addresses. >>>>>>> >>>>>>> We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware >>>>>>> supports the general capability umem_uid_0. The reason for that is that >>>>>>> 1:1 MRs must be created with uid == 0 while virtqueue objects can be >>>>>>> created with uid == 0 only when the firmware capability is on. >>>>>>> >>>>>>> If the set_map() callback is called with new translations provided >>>>>>> through iotlb, the driver will destroy the 1:1 MR and create a regular >>>>>>> one. >>>>>>> >>>>>>> Signed-off-by: Eli Cohen <elic@nvidia.com> >>>>>>> Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com >>>>>>> Signed-off-by: Michael S.
Tsirkin <mst@redhat.com> >>>>>>> Acked-by: Jason Wang <jasowang@redhat.com> >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> >>>>>>>> +} >>>>>>>> + >>>>>>>> +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev, >>>>>>>> + struct vhost_iotlb *iotlb, >>>>>>>> + unsigned int asid) >>>>>>>> { >>>>>>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>>>>>> int err; >>>>>>>> >>>>>>>> - if (mr->initialized) >>>>>>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) >>>>>>>> return 0; >>>>>>>> >>>>>>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { >>>>>>>> - if (iotlb) >>>>>>>> - err = create_user_mr(mvdev, iotlb); >>>>>>>> - else >>>>>>>> - err = create_dma_mr(mvdev, mr); >>>>>>>> + if (mr->initialized) >>>>>>>> + return 0; >>>>>>>> >>>>>>>> - if (err) >>>>>>>> - return err; >>>>>>>> - } >>>>>>>> + if (iotlb) >>>>>>>> + err = create_user_mr(mvdev, iotlb); >>>>>>>> + else >>>>>>>> + err = create_dma_mr(mvdev, mr); >>>>>>>> >>>>>>>> - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { >>>>>>>> - err = dup_iotlb(mvdev, iotlb); >>>>>>>> - if (err) >>>>>>>> - goto out_err; >>>>>>>> - } >>>>>>>> + if (err) >>>>>>>> + return err; >>>>>>>> >>>>>>>> mr->initialized = true; >>>>>>>> + >>>>>>>> + return 0; >>>>>>>> +} >>>>>>>> + >>>>>>>> +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, >>>>>>>> + struct vhost_iotlb *iotlb, unsigned int asid) >>>>>>>> +{ >>>>>>>> + int err; >>>>>>>> + >>>>>>>> + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); >>>>>>>> + if (err) >>>>>>>> + return err; >>>>>>>> + >>>>>>>> + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); >>>>>>>> + if (err) >>>>>>>> + goto out_err; >>>>>>>> + >>>>>>>> return 0; >>>>>>>> >>>>>>>> out_err: >>>>>>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { >>>>>>>> - if (iotlb) >>>>>>>> - destroy_user_mr(mvdev, mr); >>>>>>>> - else >>>>>>>> - destroy_dma_mr(mvdev, mr); >>>>>>>> - } >>>>>>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); >>>>>>>> >>>>>>>> return err; >>>>>>>> } >>>>>>>> 
-- >>>>>>>> 2.41.0
On Fri, Aug 11, 2023 at 6:21 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > On 8/9/2023 8:10 PM, Jason Wang wrote: > > On Thu, Aug 10, 2023 at 8:40 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >> > >> > >> On 8/8/2023 11:52 PM, Jason Wang wrote: > >>> On Wed, Aug 9, 2023 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >>>> > >>>> On 8/7/2023 8:00 PM, Jason Wang wrote: > >>>>> On Fri, Aug 4, 2023 at 1:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >>>>>> On 8/3/2023 1:03 AM, Jason Wang wrote: > >>>>>>> On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > >>>>>>>> The mr->initialized flag is shared between the control vq and data vq > >>>>>>>> part of the mr init/uninit. But if the control vq and data vq get placed > >>>>>>>> in different ASIDs, it can happen that initializing the control vq will > >>>>>>>> prevent the data vq mr from being initialized. > >>>>>>>> > >>>>>>>> This patch consolidates the control and data vq init parts into their > >>>>>>>> own init functions. The mr->initialized will now be used for the data vq > >>>>>>>> only. The control vq currently doesn't need a flag. > >>>>>>>> > >>>>>>>> The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr got > >>>>>>>> split into data and control vq functions which are now also ASID aware. 
> >>>>>>>> > >>>>>>>> Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data") > >>>>>>>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > >>>>>>>> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> > >>>>>>>> Reviewed-by: Gal Pressman <gal@nvidia.com> > >>>>>>>> --- > >>>>>>>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + > >>>>>>>> drivers/vdpa/mlx5/core/mr.c | 97 +++++++++++++++++++++--------- > >>>>>>>> 2 files changed, 71 insertions(+), 27 deletions(-) > >>>>>>>> > >>>>>>>> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>>>>>> index 25fc4120b618..a0420be5059f 100644 > >>>>>>>> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>>>>>> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>>>>>> @@ -31,6 +31,7 @@ struct mlx5_vdpa_mr { > >>>>>>>> struct list_head head; > >>>>>>>> unsigned long num_directs; > >>>>>>>> unsigned long num_klms; > >>>>>>>> + /* state of dvq mr */ > >>>>>>>> bool initialized; > >>>>>>>> > >>>>>>>> /* serialize mkey creation and destruction */ > >>>>>>>> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c > >>>>>>>> index 03e543229791..4ae14a248a4b 100644 > >>>>>>>> --- a/drivers/vdpa/mlx5/core/mr.c > >>>>>>>> +++ b/drivers/vdpa/mlx5/core/mr.c > >>>>>>>> @@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr > >>>>>>>> } > >>>>>>>> } > >>>>>>>> > >>>>>>>> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > >>>>>>>> +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >>>>>>>> +{ > >>>>>>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > >>>>>>>> + return; > >>>>>>>> + > >>>>>>>> + prune_iotlb(mvdev); > >>>>>>>> +} > >>>>>>>> + > >>>>>>>> +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >>>>>>>> { > >>>>>>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; > >>>>>>>> > >>>>>>>> - mutex_lock(&mr->mkey_mtx); > >>>>>>>> + if 
(mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > >>>>>>>> + return; > >>>>>>>> + > >>>>>>>> if (!mr->initialized) > >>>>>>>> - goto out; > >>>>>>>> + return; > >>>>>>>> > >>>>>>>> - prune_iotlb(mvdev); > >>>>>>>> if (mr->user_mr) > >>>>>>>> destroy_user_mr(mvdev, mr); > >>>>>>>> else > >>>>>>>> destroy_dma_mr(mvdev, mr); > >>>>>>>> > >>>>>>>> mr->initialized = false; > >>>>>>>> -out: > >>>>>>>> +} > >>>>>>>> + > >>>>>>>> +static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >>>>>>>> +{ > >>>>>>>> + struct mlx5_vdpa_mr *mr = &mvdev->mr; > >>>>>>>> + > >>>>>>>> + mutex_lock(&mr->mkey_mtx); > >>>>>>>> + > >>>>>>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > >>>>>>>> + _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); > >>>>>>>> + > >>>>>>>> mutex_unlock(&mr->mkey_mtx); > >>>>>>>> } > >>>>>>>> > >>>>>>>> -static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > >>>>>>>> - struct vhost_iotlb *iotlb, unsigned int asid) > >>>>>>>> +void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) > >>>>>>>> +{ > >>>>>>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]); > >>>>>>>> + mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]); > >>>>>>>> +} > >>>>>>>> + > >>>>>>>> +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, > >>>>>>>> + struct vhost_iotlb *iotlb, > >>>>>>>> + unsigned int asid) > >>>>>>>> +{ > >>>>>>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) > >>>>>>>> + return 0; > >>>>>>>> + > >>>>>>>> + return dup_iotlb(mvdev, iotlb); > >>>>>>> This worries me as conceptually, there should be no difference between > >>>>>>> dvq mr and cvq mr. The virtqueue should be loosely coupled with mr. 
> >>>>>>> > >>>>>>> One example is that, if we only do dup_iotlb() but not try to create > >>>>>>> dma mr here, we will break virtio-vdpa: > >>>>>> For this case, I guess we may need another way to support virtio-vdpa > >>>>>> 1:1 mapping rather than overloading virtio device reset semantics, see: > >>>>>> > >>>>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html > >>>>>> > >>>>>> > Conceptually, the address mapping is not a part of the abstraction for > >>>>>> > a virtio device now. So resetting the memory mapping during virtio > >>>>>> > device reset seems wrong. > >>>>>> > >>>>>> where we want to keep memory mapping intact across virtio device reset > >>>>>> for best live migration latency/downtime. I wonder would it work to > >>>>>> reset the mapping in vhost-vdpa life cycle out of virtio reset, say > >>>>>> introduce a .reset_map() op to restore 1:1 mapping within > >>>>>> vhost_vdpa_remove_as() right after vhost_vdpa_iotlb_unmap()? Then we can > >>>>>> move the iotlb reset logic to there without worry breaking virtio-vdpa. > >>>>> It looks to me we don't need a new ops. We can simply do set_map() > >>>>> twice > >>>> What does it mean, first set_map(0, -1ULL) with zero iotlb entry passed > >>>> in to destroy all iotlb mappings previously added, and second set_map(0, > >>>> -1ULL) to restore 1:1 DMA MR? But userspace (maybe a buggy one but > >>>> doesn't do harm) apart from vhost-vdpa itself can do unmap twice anyway, > >>>> this is supported today I think. Why there'll be such obscure > >>>> distinction, or what's the benefit to treat second .set_map() as > >>>> recreating 1:1 mapping? > >>> Ok, I think I miss some context. I agree that it's better to decouple > >>> memory mappings from the virtio reset. It helps to reduce the > >>> unnecessary memory transactions. It might require a new feature flag. > >> This I agreed. 
AFAICT QEMU would need to check this new feature flag to > >> make sure memory mappings are kept intact across reset, otherwise for > >> the sake of not breaking older kernels it has to recreate all the > >> mappings after reset like how it is done today. > >> > >>> Regarding the method of restoring to 1:1 DMA MR, it might be dangerous > >>> for (buggy) vhost-vDPA devices. Since its userspace doesn't set up any > >>> mapping it can explore the kernel with that via CVQ? > >> Not sure I understand this proposal. The 1:1 DMA MR is first created at > >> vdpa device add, and gets destroyed implicitly when the first .set_map > >> or .dma_map call is made, which is only possible after the vhost-vdpa > >> module is loaded and bound to vdpa devices. > > So what happens if there's a buggy userspace that doesn't do any IOTLB setup? > Then the parent driver doesn't do anything in .reset_map() - as the DMA MR > is still there. The parent driver should be able to tell whether the DMA MR has > been destroyed by checking its internal state. Would you mind posting a patch to demonstrate this? Thanks > > -Siwei > > > > > > Thanks > > > >> Naturally the DMA MR should > >> be restored to how it was before when vhost-vdpa module is unloaded, or > >> if anything the 1:1 DMA MR creation can be deferred to until virtio-vdpa > >> is probed and bound to devices. Today vhost_vdpa_remove_as() as part of > >> the vhost-vdpa unload code path already gets all mappings purged through > >> vhost_vdpa_iotlb_unmap(0, -1ULL), and it should be pretty safe to > >> restore DMA MR via .reset_map() right after. Not sure what's the concern > >> here with buggy vhost-vdpa device? > >> > >> Noted when vhost-vdpa is being unloaded there's even no chance to probe > >> kernel through CVQ, as the virtio feature is not even negotiated at that > >> point. And it is even trickier to wait for CVQ response from device > >> indefinitely when trying to unload a module.
> >> > >> Regards, > >> -Siwei > >>> Thanks > >>> > >>>>> or do you mean it would be faster? > >>>> I think with .reset_map() we at least can avoid indefinite latency > >>>> hiccup from destroying and recreating 1:1 mapping with the unwarranted > >>>> 2nd unmap call. And .reset_map() should work with both .dma_map() and > >>>> .set_map() APIs with clear semantics. > >>>> > >>>> Regards, > >>>> -Siwei > >>>>> Thanks > >>>>> > >>>>>> Thanks, > >>>>>> -Siwei > >>>>>> > >>>>>>> commit 6f5312f801836e6af9bcbb0bdb44dc423e129206 > >>>>>>> Author: Eli Cohen <elic@nvidia.com> > >>>>>>> Date: Wed Jun 2 11:58:54 2021 +0300 > >>>>>>> > >>>>>>> vdpa/mlx5: Add support for running with virtio_vdpa > >>>>>>> > >>>>>>> In order to support running vdpa using vritio_vdpa driver, we need to > >>>>>>> create a different kind of MR, one that has 1:1 mapping, since the > >>>>>>> addresses referring to virtqueues are dma addresses. > >>>>>>> > >>>>>>> We create the 1:1 MR in mlx5_vdpa_dev_add() only in case firmware > >>>>>>> supports the general capability umem_uid_0. The reason for that is that > >>>>>>> 1:1 MRs must be created with uid == 0 while virtqueue objects can be > >>>>>>> created with uid == 0 only when the firmware capability is on. > >>>>>>> > >>>>>>> If the set_map() callback is called with new translations provided > >>>>>>> through iotlb, the driver will destroy the 1:1 MR and create a regular > >>>>>>> one. > >>>>>>> > >>>>>>> Signed-off-by: Eli Cohen <elic@nvidia.com> > >>>>>>> Link: https://lore.kernel.org/r/20210602085854.62690-1-elic@nvidia.com > >>>>>>> Signed-off-by: Michael S.
Tsirkin <mst@redhat.com> > >>>>>>> Acked-by: Jason Wang <jasowang@redhat.com> > >>>>>>> > >>>>>>> Thanks > >>>>>>> > >>>>>>> > >>>>>>>> +} > >>>>>>>> + > >>>>>>>> +static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev, > >>>>>>>> + struct vhost_iotlb *iotlb, > >>>>>>>> + unsigned int asid) > >>>>>>>> { > >>>>>>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; > >>>>>>>> int err; > >>>>>>>> > >>>>>>>> - if (mr->initialized) > >>>>>>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > >>>>>>>> return 0; > >>>>>>>> > >>>>>>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > >>>>>>>> - if (iotlb) > >>>>>>>> - err = create_user_mr(mvdev, iotlb); > >>>>>>>> - else > >>>>>>>> - err = create_dma_mr(mvdev, mr); > >>>>>>>> + if (mr->initialized) > >>>>>>>> + return 0; > >>>>>>>> > >>>>>>>> - if (err) > >>>>>>>> - return err; > >>>>>>>> - } > >>>>>>>> + if (iotlb) > >>>>>>>> + err = create_user_mr(mvdev, iotlb); > >>>>>>>> + else > >>>>>>>> + err = create_dma_mr(mvdev, mr); > >>>>>>>> > >>>>>>>> - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { > >>>>>>>> - err = dup_iotlb(mvdev, iotlb); > >>>>>>>> - if (err) > >>>>>>>> - goto out_err; > >>>>>>>> - } > >>>>>>>> + if (err) > >>>>>>>> + return err; > >>>>>>>> > >>>>>>>> mr->initialized = true; > >>>>>>>> + > >>>>>>>> + return 0; > >>>>>>>> +} > >>>>>>>> + > >>>>>>>> +static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > >>>>>>>> + struct vhost_iotlb *iotlb, unsigned int asid) > >>>>>>>> +{ > >>>>>>>> + int err; > >>>>>>>> + > >>>>>>>> + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); > >>>>>>>> + if (err) > >>>>>>>> + return err; > >>>>>>>> + > >>>>>>>> + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); > >>>>>>>> + if (err) > >>>>>>>> + goto out_err; > >>>>>>>> + > >>>>>>>> return 0; > >>>>>>>> > >>>>>>>> out_err: > >>>>>>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > >>>>>>>> - if (iotlb) > >>>>>>>> - destroy_user_mr(mvdev, mr); > >>>>>>>> - else > >>>>>>>> - 
destroy_dma_mr(mvdev, mr); > >>>>>>>> - } > >>>>>>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > >>>>>>>> > >>>>>>>> return err; > >>>>>>>> } > >>>>>>>> -- > >>>>>>>> 2.41.0 > >>>>>>>> >
On Wed, 2023-08-09 at 09:42 +0800, Jason Wang wrote: > On Tue, Aug 8, 2023 at 3:24 PM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > On Tue, 2023-08-08 at 10:57 +0800, Jason Wang wrote: > > > On Thu, Aug 3, 2023 at 7:40 PM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > > > > > On Thu, 2023-08-03 at 16:03 +0800, Jason Wang wrote: > > > > > On Thu, Aug 3, 2023 at 1:13 AM Dragos Tatulea <dtatulea@nvidia.com> > > > > > wrote: > > > > > > > > > > > > The mr->initialized flag is shared between the control vq and data > > > > > > vq > > > > > > part of the mr init/uninit. But if the control vq and data vq get > > > > > > placed > > > > > > in different ASIDs, it can happen that initializing the control vq > > > > > > will > > > > > > prevent the data vq mr from being initialized. > > > > > > > > > > > > This patch consolidates the control and data vq init parts into > > > > > > their > > > > > > own init functions. The mr->initialized will now be used for the > > > > > > data vq > > > > > > only. The control vq currently doesn't need a flag. > > > > > > > > > > > > The uninitializing part is also taken care of: mlx5_vdpa_destroy_mr > > > > > > got > > > > > > split into data and control vq functions which are now also ASID > > > > > > aware. 
> > > > > >
> > > > > > Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for
> > > > > > control and data")
> > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > ---
> > > > > >  drivers/vdpa/mlx5/core/mlx5_vdpa.h |  1 +
> > > > > >  drivers/vdpa/mlx5/core/mr.c        | 97 +++++++++++++++++++++---------
> > > > > >  2 files changed, 71 insertions(+), 27 deletions(-)
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > +static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev,
> > > > > > +                                   struct vhost_iotlb *iotlb,
> > > > > > +                                   unsigned int asid)
> > > > > > +{
> > > > > > +       if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
> > > > > > +               return 0;
> > > > > > +
> > > > > > +       return dup_iotlb(mvdev, iotlb);
> > > > >
> > > > > This worries me as conceptually, there should be no difference between
> > > > > dvq mr and cvq mr.
> > > > > The virtqueue should be loosely coupled with mr.
> > > > >
> > > > Are you worried by the changes in this patch or about the possibility of
> > > > having
> > > >
> > > > The reason for this change is that I noticed if you create one mr in one
> > > > asid you could be blocked out from creating another one in a different
> > > > asid due to mr->initialized being true. To me that seemed problematic.
> > > > Is it not?
> > >
> > > My feeling is that mr.c should be device agnostic. It needs to know
> > > nothing about the device details to work. But this patch seems to
> > > break the layer.
> > >
> > But the same logic was there before (with the exception of cvq not having an
> > init flag anymore). So what am I missing here?
>
> Nothing, I think you're right.
>
> I think we can have this patch go first and tweak on top by moving CVQ
> aware logic into the net specific codes.
>
Is this anything more than a re-org? My plan is to move the cvq mr part from
mlx5_vdpa_dev into mlx5_vdpa_net. Is there anything else that you were
expecting here?

Thanks,
Dragos
On Tue, Aug 15, 2023 at 9:45 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> ---
>  drivers/vhost/vdpa.c             | 16 +++++++++++++++-
>  include/uapi/linux/vhost_types.h |  2 ++
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 62b0a01..75092a7 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -406,6 +406,14 @@ static bool vhost_vdpa_can_resume(const struct vhost_vdpa *v)
>         return ops->resume;
>  }
>
> +static bool vhost_vdpa_has_persistent_map(const struct vhost_vdpa *v)
> +{
> +       struct vdpa_device *vdpa = v->vdpa;
> +       const struct vdpa_config_ops *ops = vdpa->config;
> +
> +       return (!ops->set_map && !ops->dma_map) || ops->reset_map;

So this means the IOTLB/IOMMU mappings have already been decoupled from the
vdpa reset. So it should have been noticed by the userspace. I guess we can
just fix the simulator and mlx5 then we are fine?

Thanks
diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
index 25fc4120b618..a0420be5059f 100644
--- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
@@ -31,6 +31,7 @@ struct mlx5_vdpa_mr {
 	struct list_head head;
 	unsigned long num_directs;
 	unsigned long num_klms;
+	/* state of dvq mr */
 	bool initialized;
 
 	/* serialize mkey creation and destruction */
diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
index 03e543229791..4ae14a248a4b 100644
--- a/drivers/vdpa/mlx5/core/mr.c
+++ b/drivers/vdpa/mlx5/core/mr.c
@@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr
 	}
 }
 
-void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev)
+static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
+{
+	if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
+		return;
+
+	prune_iotlb(mvdev);
+}
+
+static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
 {
 	struct mlx5_vdpa_mr *mr = &mvdev->mr;
 
-	mutex_lock(&mr->mkey_mtx);
+	if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid)
+		return;
+
 	if (!mr->initialized)
-		goto out;
+		return;
 
-	prune_iotlb(mvdev);
 	if (mr->user_mr)
 		destroy_user_mr(mvdev, mr);
 	else
 		destroy_dma_mr(mvdev, mr);
 
 	mr->initialized = false;
-out:
+}
+
+static void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
+{
+	struct mlx5_vdpa_mr *mr = &mvdev->mr;
+
+	mutex_lock(&mr->mkey_mtx);
+
+	_mlx5_vdpa_destroy_dvq_mr(mvdev, asid);
+	_mlx5_vdpa_destroy_cvq_mr(mvdev, asid);
+
 	mutex_unlock(&mr->mkey_mtx);
 }
 
-static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
-				struct vhost_iotlb *iotlb, unsigned int asid)
+void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev)
+{
+	mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]);
+	mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]);
+}
+
+static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev,
+				    struct vhost_iotlb *iotlb,
+				    unsigned int asid)
+{
+	if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
+		return 0;
+
+	return dup_iotlb(mvdev, iotlb);
+}
+
+static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev,
+				    struct vhost_iotlb *iotlb,
+				    unsigned int asid)
 {
 	struct mlx5_vdpa_mr *mr = &mvdev->mr;
 	int err;
 
-	if (mr->initialized)
+	if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid)
 		return 0;
 
-	if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) {
-		if (iotlb)
-			err = create_user_mr(mvdev, iotlb);
-		else
-			err = create_dma_mr(mvdev, mr);
+	if (mr->initialized)
+		return 0;
 
-		if (err)
-			return err;
-	}
+	if (iotlb)
+		err = create_user_mr(mvdev, iotlb);
+	else
+		err = create_dma_mr(mvdev, mr);
 
-	if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) {
-		err = dup_iotlb(mvdev, iotlb);
-		if (err)
-			goto out_err;
-	}
+	if (err)
+		return err;
 
 	mr->initialized = true;
+
+	return 0;
+}
+
+static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
+				struct vhost_iotlb *iotlb, unsigned int asid)
+{
+	int err;
+
+	err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid);
+	if (err)
+		return err;
+
+	err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid);
+	if (err)
+		goto out_err;
+
 	return 0;
 
 out_err:
-	if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) {
-		if (iotlb)
-			destroy_user_mr(mvdev, mr);
-		else
-			destroy_dma_mr(mvdev, mr);
-	}
+	_mlx5_vdpa_destroy_dvq_mr(mvdev, asid);
 
 	return err;
 }