[RFC,v2,1/7] IB/mlx5: Change ib_umem_odp_map_dma_single_page() to retain umem_mutex
Message ID | b9974985069900c80b8ff9e6b0b0b346c1592910.1668157436.git.matsuda-daisuke@fujitsu.com |
---|---|
State | New |
Headers |
From: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
To: linux-rdma@vger.kernel.org, leonro@nvidia.com, jgg@nvidia.com, zyjzyj2000@gmail.com
Cc: nvdimm@lists.linux.dev, linux-kernel@vger.kernel.org, rpearsonhpe@gmail.com, yangx.jy@fujitsu.com, lizhijian@fujitsu.com, y-goto@fujitsu.com, Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Subject: [RFC PATCH v2 1/7] IB/mlx5: Change ib_umem_odp_map_dma_single_page() to retain umem_mutex
Date: Fri, 11 Nov 2022 18:22:22 +0900
Message-Id: <b9974985069900c80b8ff9e6b0b0b346c1592910.1668157436.git.matsuda-daisuke@fujitsu.com>
In-Reply-To: <cover.1668157436.git.matsuda-daisuke@fujitsu.com>
References: <cover.1668157436.git.matsuda-daisuke@fujitsu.com>
Series | On-Demand Paging on SoftRoCE |
Commit Message
Daisuke Matsuda (Fujitsu)
Nov. 11, 2022, 9:22 a.m. UTC
ib_umem_odp_map_dma_single_page(), which has been used only by the mlx5
driver, holds umem_mutex on success and releases it on failure. This
behavior is not convenient for other drivers to use, so change the
function to always retain the mutex on return.
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
drivers/infiniband/core/umem_odp.c | 8 +++-----
drivers/infiniband/hw/mlx5/odp.c | 4 +++-
2 files changed, 6 insertions(+), 6 deletions(-)
Comments
On 11/11/2022 17:22, Daisuke Matsuda wrote:
> ib_umem_odp_map_dma_single_page(), which has been used only by the mlx5
> driver, holds umem_mutex on success and releases on failure. This
> behavior is not convenient for other drivers to use it, so change it to
> always retain mutex on return.
>
> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> ---
>  drivers/infiniband/core/umem_odp.c | 8 +++-----
>  drivers/infiniband/hw/mlx5/odp.c   | 4 +++-
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> index e9fa22d31c23..49da6735f7c8 100644
> --- a/drivers/infiniband/core/umem_odp.c
> +++ b/drivers/infiniband/core/umem_odp.c
> @@ -328,8 +328,8 @@ static int ib_umem_odp_map_dma_single_page(
>   *
>   * Maps the range passed in the argument to DMA addresses.
>   * The DMA addresses of the mapped pages is updated in umem_odp->dma_list.
> - * Upon success the ODP MR will be locked to let caller complete its device
> - * page table update.
> + * The umem mutex is locked in this function. Callers are responsible for
> + * releasing the lock.
>   *
>   * Returns the number of pages mapped in success, negative error code
>   * for failure.
> @@ -453,11 +453,9 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
>  			break;
>  		}
>  	}
> -	/* upon success lock should stay on hold for the callee */
> +
>  	if (!ret)
>  		ret = dma_index - start_idx;
> -	else
> -		mutex_unlock(&umem_odp->umem_mutex);
>
>  out_put_mm:
>  	mmput_async(owning_mm);
> diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
> index bc97958818bb..a0de27651586 100644
> --- a/drivers/infiniband/hw/mlx5/odp.c
> +++ b/drivers/infiniband/hw/mlx5/odp.c
> @@ -572,8 +572,10 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
>  		access_mask |= ODP_WRITE_ALLOWED_BIT;
>
>  	np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault);
> -	if (np < 0)
> +	if (np < 0) {
> +		mutex_unlock(&odp->umem_mutex);
>  		return np;
> +	}

Refer to the comments of ib_umem_odp_map_dma_and_lock:

334  * Returns the number of pages mapped in success, negative error code
335  * for failure.

I don't think it's correct to release the lock in all failure cases, for
example when it reaches the error paths below.

346 int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
347                                  u64 bcnt, u64 access_mask, bool fault)
348                 __acquires(&umem_odp->umem_mutex)
349 {
350         struct task_struct *owning_process = NULL;
351         struct mm_struct *owning_mm = umem_odp->umem.owning_mm;
352         int pfn_index, dma_index, ret = 0, start_idx;
353         unsigned int page_shift, hmm_order, pfn_start_idx;
354         unsigned long num_pfns, current_seq;
355         struct hmm_range range = {};
356         unsigned long timeout;
357
358         if (access_mask == 0)
359                 return -EINVAL;    <<<<< no lock is held yet
360
361         if (user_virt < ib_umem_start(umem_odp) ||
362             user_virt + bcnt > ib_umem_end(umem_odp))
363                 return -EFAULT;    <<<<< no lock is held yet

Furthermore, you changed a public API's behavior; does that matter for
other out-of-tree drivers which are using it? I'm not familiar with this;
maybe the kernel has no restriction on it?

>
> 	/*
> 	 * No need to check whether the MTTs really belong to this MR, since
On Thu, Nov 17, 2022 10:21 PM Li, Zhijian wrote:
> On 11/11/2022 17:22, Daisuke Matsuda wrote:
> > ib_umem_odp_map_dma_single_page(), which has been used only by the mlx5
> > driver, holds umem_mutex on success and releases on failure. This
> > behavior is not convenient for other drivers to use it, so change it to
> > always retain mutex on return.
[...]
> Refer to the comments of ib_umem_odp_map_dma_and_lock:
>
> 334  * Returns the number of pages mapped in success, negative error code
> 335  * for failure.
>
> I don't think it's correct to release the lock in all failure cases, for
> example when it reaches the error paths below.

Thank you. That's certainly true. I will fix this in v3. Probably I will
drop this patch and make changes on the rxe side instead.

> 346 int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
> 347                                  u64 bcnt, u64 access_mask, bool fault)
> 348                 __acquires(&umem_odp->umem_mutex)
> 349 {
> 350         struct task_struct *owning_process = NULL;
> 351         struct mm_struct *owning_mm = umem_odp->umem.owning_mm;
> 352         int pfn_index, dma_index, ret = 0, start_idx;
> 353         unsigned int page_shift, hmm_order, pfn_start_idx;
> 354         unsigned long num_pfns, current_seq;
> 355         struct hmm_range range = {};
> 356         unsigned long timeout;
> 357
> 358         if (access_mask == 0)
> 359                 return -EINVAL;    <<<<< no lock is held yet
> 360
> 361         if (user_virt < ib_umem_start(umem_odp) ||
> 362             user_virt + bcnt > ib_umem_end(umem_odp))
> 363                 return -EFAULT;    <<<<< no lock is held yet
>
> Furthermore, you changed a public API's behavior; does that matter for
> other out-of-tree drivers which are using it? I'm not familiar with this;
> maybe the kernel has no restriction on it?

Yes, they are just left behind. The developers have to modify the driver
by themselves if they want to keep it compatible with the upstream. I
think this is one of the general reasons why companies contribute their
work to OSS communities: they can lower the cost of maintaining their
software while getting the benefit of new features.

Cf. The Linux Kernel Driver Interface
https://www.kernel.org/doc/html/latest/process/stable-api-nonsense.html

Thanks,
Daisuke

> >
> > 	/*
> > 	 * No need to check whether the MTTs really belong to this MR, since
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index e9fa22d31c23..49da6735f7c8 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -328,8 +328,8 @@ static int ib_umem_odp_map_dma_single_page(
  *
  * Maps the range passed in the argument to DMA addresses.
  * The DMA addresses of the mapped pages is updated in umem_odp->dma_list.
- * Upon success the ODP MR will be locked to let caller complete its device
- * page table update.
+ * The umem mutex is locked in this function. Callers are responsible for
+ * releasing the lock.
  *
  * Returns the number of pages mapped in success, negative error code
  * for failure.
@@ -453,11 +453,9 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
 			break;
 		}
 	}
-	/* upon success lock should stay on hold for the callee */
+
 	if (!ret)
 		ret = dma_index - start_idx;
-	else
-		mutex_unlock(&umem_odp->umem_mutex);
 
 out_put_mm:
 	mmput_async(owning_mm);
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index bc97958818bb..a0de27651586 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -572,8 +572,10 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
 		access_mask |= ODP_WRITE_ALLOWED_BIT;
 
 	np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault);
-	if (np < 0)
+	if (np < 0) {
+		mutex_unlock(&odp->umem_mutex);
 		return np;
+	}
 
 	/*
 	 * No need to check whether the MTTs really belong to this MR, since