Message ID | 1668141030-2-5-git-send-email-lizhijian@fujitsu.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp536421wru; Thu, 10 Nov 2022 20:35:02 -0800 (PST) X-Google-Smtp-Source: AA0mqf7uyJzaBxuAlQzqbdtwom9Zc2N0/pWJ7+pd9gnPgVfkkcdDe59662hZ33CEVctdH4GRJ7E+ X-Received: by 2002:a17:906:fb81:b0:7ae:9187:eb70 with SMTP id lr1-20020a170906fb8100b007ae9187eb70mr468941ejb.533.1668141302587; Thu, 10 Nov 2022 20:35:02 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668141302; cv=none; d=google.com; s=arc-20160816; b=mFM0JhhG97S9/Fz84T9B+oCDD7dxNaJCWkydUhChwCNyie8NaQDaOg1loYk8JoPJWW K3wCIANN1baPAa269TfjmYO2U5NCCHMYRGS/MjioE4zP9vVfgqIMmLWbQTi5PCHLZhQf KnS2d1t/xs7pE3E8ch0T2dfB01JXN5EwjzwXbiwMzR7qdxm8kgwHfwyWezdEqCyb0T/i WxybQnAe3Qtqga4n8z2OPggr/3mf1U6Yk9jRaAVt2EXElgX9Dur10fLi9EF8KHcZVPgE MPteK6pXAkLwUUVhGwPN6S2qj0Stlq4TLpNhoMEuzcEtAp67SnkJeiFk/6f4piQVpb7i +2gw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=6BOQYMvWVSi2uBbZfL1Bzz07XDsCbIwy+Optf2MNYVA=; b=o9p+SdtGNauupIgiBBS4diuZg6Npo+6H3QYlyrvC2j/EwY0ukw4j8g5kRao0PRV7gq xyC9LfDuxR0JddPz326aKo5Lh6iJPEjDhml9/NOY0w7zZ08asF6dddHFqRatRERai0gE 3+bUi58WR0OaRo1pmW6T3lAs9YhBwuW/WXDOBrV2E3OKranmHr+hRi3zFfc5CTD44ya1 A8kKXBU3ZaLIwYSO/dZeP6ksPTICfu0dzxe6x+SlHi+dglNguhk4lDwlkTCojxaD6lc7 TJe1e45CylPmcfq0MUYyuCjGmP+6MMWtGCaCGF7qGEjk2KS12n91FgYhm8EV+5baDtJM qb/g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@fujitsu.com header.s=170520fj header.b=Wl7LeVr8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=fujitsu.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id g4-20020aa7c584000000b0045bd677b3d5si1137581edq.342.2022.11.10.20.34.16; Thu, 10 Nov 2022 20:35:02 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@fujitsu.com header.s=170520fj header.b=Wl7LeVr8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=fujitsu.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232594AbiKKEbS (ORCPT <rfc822;winker.wchi@gmail.com> + 99 others); Thu, 10 Nov 2022 23:31:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36896 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232117AbiKKEbC (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 10 Nov 2022 23:31:02 -0500 Received: from mail1.bemta37.messagelabs.com (mail1.bemta37.messagelabs.com [85.158.142.113]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 93FAE69DD2; Thu, 10 Nov 2022 20:31:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fujitsu.com; s=170520fj; t=1668141059; i=@fujitsu.com; bh=6BOQYMvWVSi2uBbZfL1Bzz07XDsCbIwy+Optf2MNYVA=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Wl7LeVr8srVymoGxdKgmQAMbHmM6+bwnmfzzLVYbzLZOCNZ7eMkzmAcCUP17WBfK6 2bAF/IG9DkwxOCM2lA2u8J5PiHCSB8QvaoP6QhDwyruzjpIogmV+erJ+U5MhMzKKHc I2W2UppcHCxJ80nhSlKL7lstMjHDY0s9gxN6P69gIxCagu/ift1fjA4QZ6S14k3APj PubLoCLVpB9FyvHKVpFwPf3IQYmPe3++EMraT/r/K5wiV3M0j8jecGg/iQnNN4Jc/A HP8IMX+l10b7mdhv5AWIBa/LrmR1VVTYOTOBdovSu9P7J6TcU/7XidRPWFcJrzPQvx 2Mhl+sZeUBoUg== X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFnrKIsWRWlGSWpSXmKPExsViZ8OxWZfpQm6 ywZtNphYzZ5xgtJjyaymzxeVdc9gsnh3qZbE4f6yf3YHVY+esu+wem1Z1snl83iTnsfXzbZYA lijWzLyk/IoE1oy1y/6xFJxTqtg0aS9jA2O7bBcjJ4eQwEZGiflnMiHsJUwSO8/LQdgHGCUe9 SiA2GwCGhL3Wm4ygtgiAp2MEtta00BsZgE3iU1vZrOD2MICdhIf7z9hAbFZBFQl3t9rYAOxeQ UcJRZ9PQ5WIyGgIDHl4XtmEJtTwEli+Z0vbBC7HCVOzbzDAlEvKHFyJsQcZgEJiYMvXgDVcwD 1KknM7I6HGFMhMWtWGxOErSZx9dwm5gmMgrOQdM9C0r2AkWkVo1lxalFZapGuqV5SUWZ6Rklu YmaOXmKVbqJeaqluXn5RSYauoV5iebFeanGxXnFlbnJOil5easkmRmDopxQnOO1gnLLsj94hR kkOJiVR3n02uclCfEn5KZUZicUZ8UWlOanFhxhlODiUJHj1zgLlBItS01Mr0jJzgHEIk5bg4F ES4RXeDpTmLS5IzC3OTIdInWJUlBLntToPlBAASWSU5sG1wWL/EqOslDAvIwMDgxBPQWpRbmY JqvwrRnEORiVh3rhzQFN4MvNK4Ka/AlrMBLTYLjULZHFJIkJKqoEpVy3ny/dFuypy52bn2Qgo Pt/88+kRjsksG68afykKOb0lUGmFYqLCkl093Zfj7JcI/HuYkbjh+9Qfcx9NeFS49OI9OZ6fL q9sPlZ/dg/Juarx0UXvWN+qTzv9wyQlVJ9wLxHsPlQXcWilj8dsvhsTFxbp/1LjuXE09uSU22 cMxD4EBPqfFZU4x3dKo1Qwebna95aqA1fc87m1xT4Y3ldtWexab81YFHfE5L/reaWdoQbvZ3h FhZkv3JnC9WlOenTYv+Ts0KsL6mz3SxTbuSUzdk1bZh4Vocj4UYN13q/c9Q8m2Tv2zG1YmHUv 8uBDcdHa3Uzlp+cr9W+ZYZb2/sbx9r5FEjzTl4d/YUpYZxKnxFKckWioxVxUnAgAXtqlj3gDA AA= X-Env-Sender: lizhijian@fujitsu.com X-Msg-Ref: server-15.tower-728.messagelabs.com!1668141058!608057!1 X-Originating-IP: [62.60.8.179] X-SYMC-ESS-Client-Auth: outbound-route-from=pass X-StarScan-Received: X-StarScan-Version: 9.100.1; banners=-,-,- X-VirusChecked: Checked Received: (qmail 14018 invoked from network); 11 Nov 2022 04:30:58 -0000 Received: from unknown (HELO n03ukasimr04.n03.fujitsu.local) (62.60.8.179) by server-15.tower-728.messagelabs.com with ECDHE-RSA-AES256-GCM-SHA384 encrypted SMTP; 11 Nov 2022 04:30:58 -0000 Received: from n03ukasimr04.n03.fujitsu.local (localhost [127.0.0.1]) by n03ukasimr04.n03.fujitsu.local (Postfix) with ESMTP id 41D42156; Fri, 11 Nov 2022 04:30:58 +0000 (GMT) Received: from R01UKEXCASM126.r01.fujitsu.local (R01UKEXCASM126 [10.183.43.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (No client certificate requested) by n03ukasimr04.n03.fujitsu.local (Postfix) with ESMTPS id 34D92153; Fri, 11 Nov 2022 04:30:58 +0000 (GMT) Received: from bc0da1a9c27e.localdomain (10.167.225.141) by R01UKEXCASM126.r01.fujitsu.local (10.183.43.178) with Microsoft SMTP Server (TLS) id 15.0.1497.32; Fri, 11 Nov 2022 04:30:55 +0000 From: Li Zhijian <lizhijian@fujitsu.com> To: Zhu Yanjun <zyjzyj2000@gmail.com>, Jason Gunthorpe <jgg@ziepe.ca>, "Leon Romanovsky" <leon@kernel.org>, <linux-rdma@vger.kernel.org> CC: <linux-kernel@vger.kernel.org>, Li Zhijian <lizhijian@fujitsu.com> Subject: [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr Date: Fri, 11 Nov 2022 04:30:29 +0000 Message-ID: <1668141030-2-5-git-send-email-lizhijian@fujitsu.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1668141030-2-1-git-send-email-lizhijian@fujitsu.com> References: <1668141030-2-1-git-send-email-lizhijian@fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.167.225.141] X-ClientProxiedBy: G08CNEXCHPEKD07.g08.fujitsu.local (10.167.33.80) To R01UKEXCASM126.r01.fujitsu.local (10.183.43.178) X-Virus-Scanned: ClamAV using ClamSMTP X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749172934348826498?= X-GMAIL-MSGID: =?utf-8?q?1749172934348826498?= |
Series |
iova_to_vaddr refactor
|
|
Commit Message
Zhijian Li (Fujitsu)
Nov. 11, 2022, 4:30 a.m. UTC
For IB_MR_TYPE_USER MR, iova_to_vaddr() will call kmap_local_page() to
map page.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_loc.h | 1 +
drivers/infiniband/sw/rxe/rxe_mr.c | 38 +++++++++++++++------------
drivers/infiniband/sw/rxe/rxe_resp.c | 1 +
drivers/infiniband/sw/rxe/rxe_verbs.h | 5 +++-
4 files changed, 27 insertions(+), 18 deletions(-)
Comments
On venerdì 11 novembre 2022 05:30:29 CET Li Zhijian wrote: > For IB_MR_TYPE_USER MR, iova_to_vaddr() will call kmap_local_page() to > map page. > > Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> > --- > drivers/infiniband/sw/rxe/rxe_loc.h | 1 + > drivers/infiniband/sw/rxe/rxe_mr.c | 38 +++++++++++++++------------ > drivers/infiniband/sw/rxe/rxe_resp.c | 1 + > drivers/infiniband/sw/rxe/rxe_verbs.h | 5 +++- > 4 files changed, 27 insertions(+), 18 deletions(-) > > diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h > b/drivers/infiniband/sw/rxe/rxe_loc.h index c2a5c8814a48..22a8c44d39c8 100644 > --- a/drivers/infiniband/sw/rxe/rxe_loc.h > +++ b/drivers/infiniband/sw/rxe/rxe_loc.h > @@ -73,6 +73,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int > length, int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info > *dma, void *addr, int length, enum rxe_mr_copy_dir dir); > void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length); > +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr); > struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, > enum rxe_mr_lookup_type type); > int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length); > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c > b/drivers/infiniband/sw/rxe/rxe_mr.c index a4e786b657b7..d26a4a33119c 100644 > --- a/drivers/infiniband/sw/rxe/rxe_mr.c > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c > @@ -120,9 +120,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 > length, u64 iova, struct ib_umem *umem; > struct sg_page_iter sg_iter; > int num_buf; > - void *vaddr; > int err; > - int i; > > umem = ib_umem_get(&rxe->ib_dev, start, length, access); > if (IS_ERR(umem)) { > @@ -159,18 +157,9 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 > length, u64 iova, num_buf = 0; > } > > - vaddr = page_address(sg_page_iter_page(&sg_iter)); > - if (!vaddr) { > - pr_warn("%s: Unable to get virtual address\n", > - __func__); > - err = -ENOMEM; > - goto err_cleanup_map; > - } > - > - buf->addr = (uintptr_t)vaddr; > + buf->page = sg_page_iter_page(&sg_iter); > num_buf++; > buf++; > - > } > } > > @@ -182,10 +171,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 > length, u64 iova, > > return 0; > > -err_cleanup_map: > - for (i = 0; i < mr->num_map; i++) > - kfree(mr->map[i]); > - kfree(mr->map); > err_release_umem: > ib_umem_release(umem); > err_out: > @@ -246,6 +231,12 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int > *m_out, int *n_out, } > } > > +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr) > +{ > + if (mr->ibmr.type == IB_MR_TYPE_USER) > + kunmap_local(vaddr); > +} > + > static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length) > { > size_t offset; > @@ -258,9 +249,21 @@ static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, > int length) return NULL; > } > > - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; > + if (mr->ibmr.type == IB_MR_TYPE_USER) { > + char *paddr; > + struct page *pg = mr->map[m]->buf[n].page; > + > + paddr = kmap_local_page(pg); > + if (paddr == NULL) { > + pr_warn("Failed to map page"); > + return NULL; > + } I know nothing about this code but I am here as a result of regular checks for changes to HIGHMEM mappings across the entire kernel. So please forgive me if I'm objecting to the correct changes. 1) It looks like this code had a call to page_address() and you converted it to mapping with kmap_local_page(). If page_address() is related and it used to work properly, the page you are mapping cannot come from ZONE_HIGHMEM. Therefore, kmap_local_page() looks like an overkill. I'm probably missing something... > + return paddr + offset; > + } else > + return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; > } > > +/* must call rxe_unmap_vaddr to unmap vaddr */ > void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length) > { > if (mr->state != RXE_MR_STATE_VALID) { > @@ -326,6 +329,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, > int length, dest = (dir == RXE_TO_MR_OBJ) ? va : addr; > > memcpy(dest, src, bytes); > + rxe_unmap_vaddr(mr, va); > > length -= bytes; > addr += bytes; > diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c > b/drivers/infiniband/sw/rxe/rxe_resp.c index 483043dc4e89..765cb9f8538a > 100644 > --- a/drivers/infiniband/sw/rxe/rxe_resp.c > +++ b/drivers/infiniband/sw/rxe/rxe_resp.c > @@ -636,6 +636,7 @@ static enum resp_states atomic_reply(struct rxe_qp *qp, > > *vaddr = value; > spin_unlock_bh(&atomic_ops_lock); > + rxe_unmap_vaddr(mr, vaddr); > > qp->resp.msn++; > > diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h > b/drivers/infiniband/sw/rxe/rxe_verbs.h index acab785ba7e2..6080a4b32f09 > 100644 > --- a/drivers/infiniband/sw/rxe/rxe_verbs.h > +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h > @@ -280,7 +280,10 @@ enum rxe_mr_lookup_type { > #define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct rxe_phys_buf)) > > struct rxe_phys_buf { > - u64 addr; > + union { > + u64 addr; /* IB_MR_TYPE_MEM_REG */ > + struct page *page; /* IB_MR_TYPE_USER */ > + }; > }; > > struct rxe_map { > -- > 2.31.1
On venerdì 11 novembre 2022 05:30:29 CET Li Zhijian wrote: > For IB_MR_TYPE_USER MR, iova_to_vaddr() will call kmap_local_page() to > map page. > > Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> > --- > drivers/infiniband/sw/rxe/rxe_loc.h | 1 + > drivers/infiniband/sw/rxe/rxe_mr.c | 38 +++++++++++++++------------ > drivers/infiniband/sw/rxe/rxe_resp.c | 1 + > drivers/infiniband/sw/rxe/rxe_verbs.h | 5 +++- > 4 files changed, 27 insertions(+), 18 deletions(-) > > diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h > b/drivers/infiniband/sw/rxe/rxe_loc.h index c2a5c8814a48..22a8c44d39c8 100644 > --- a/drivers/infiniband/sw/rxe/rxe_loc.h > +++ b/drivers/infiniband/sw/rxe/rxe_loc.h > @@ -73,6 +73,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int > length, int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info > *dma, void *addr, int length, enum rxe_mr_copy_dir dir); > void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length); > +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr); > struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, > enum rxe_mr_lookup_type type); > int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length); > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c > b/drivers/infiniband/sw/rxe/rxe_mr.c index a4e786b657b7..d26a4a33119c 100644 > --- a/drivers/infiniband/sw/rxe/rxe_mr.c > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c > @@ -120,9 +120,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 > length, u64 iova, struct ib_umem *umem; > struct sg_page_iter sg_iter; > int num_buf; > - void *vaddr; > int err; > - int i; > > umem = ib_umem_get(&rxe->ib_dev, start, length, access); > if (IS_ERR(umem)) { > @@ -159,18 +157,9 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 > length, u64 iova, num_buf = 0; > } > > - vaddr = page_address(sg_page_iter_page(&sg_iter)); > - if (!vaddr) { > - pr_warn("%s: Unable to get virtual address\n", > - __func__); > - err = -ENOMEM; > - goto err_cleanup_map; > - } > - > - buf->addr = (uintptr_t)vaddr; > + buf->page = sg_page_iter_page(&sg_iter); > num_buf++; > buf++; > - > } > } > > @@ -182,10 +171,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 > length, u64 iova, > > return 0; > > -err_cleanup_map: > - for (i = 0; i < mr->num_map; i++) > - kfree(mr->map[i]); > - kfree(mr->map); > err_release_umem: > ib_umem_release(umem); > err_out: > @@ -246,6 +231,12 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int > *m_out, int *n_out, } > } > > +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr) > +{ > + if (mr->ibmr.type == IB_MR_TYPE_USER) > + kunmap_local(vaddr); > +} > + > static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length) > { > size_t offset; > @@ -258,9 +249,21 @@ static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, > int length) return NULL; > } > > - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; > + if (mr->ibmr.type == IB_MR_TYPE_USER) { > + char *paddr; > + struct page *pg = mr->map[m]->buf[n].page; > + > + paddr = kmap_local_page(pg); > + if (paddr == NULL) { > + pr_warn("Failed to map page"); > + return NULL; > + } I know nothing about this code but I am here as a result of regular checks for changes to HIGHMEM mappings across the entire kernel. So please forgive me if I'm objecting on correct changes. 1) It looks like this code had a call to page_address() and you converted it to mapping with kmap_local_page(). If page_address() is related and it used to work properly, the page you are mapping cannot come from ZONE_HIGHMEM. Therefore, kmap_local_page() looks like an overkill. I'm probably missing something... 2) What made you think that kmap_local_page() might fail and return NULL? AFAIK, kmap_local_page() won't ever return NULL, therefore there is no need to check. Regards, Fabio P.S.: I just noticed that the second entry in my list was missing from the other email which should be discarded. > + return paddr + offset; > + } else > + return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; > } > > +/* must call rxe_unmap_vaddr to unmap vaddr */ > void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length) > { > if (mr->state != RXE_MR_STATE_VALID) { > @@ -326,6 +329,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, > int length, dest = (dir == RXE_TO_MR_OBJ) ? va : addr; > > memcpy(dest, src, bytes); > + rxe_unmap_vaddr(mr, va); > > length -= bytes; > addr += bytes; > diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c > b/drivers/infiniband/sw/rxe/rxe_resp.c index 483043dc4e89..765cb9f8538a > 100644 > --- a/drivers/infiniband/sw/rxe/rxe_resp.c > +++ b/drivers/infiniband/sw/rxe/rxe_resp.c > @@ -636,6 +636,7 @@ static enum resp_states atomic_reply(struct rxe_qp *qp, > > *vaddr = value; > spin_unlock_bh(&atomic_ops_lock); > + rxe_unmap_vaddr(mr, vaddr); > > qp->resp.msn++; > > diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h > b/drivers/infiniband/sw/rxe/rxe_verbs.h index acab785ba7e2..6080a4b32f09 > 100644 > --- a/drivers/infiniband/sw/rxe/rxe_verbs.h > +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h > @@ -280,7 +280,10 @@ enum rxe_mr_lookup_type { > #define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct rxe_phys_buf)) > > struct rxe_phys_buf { > - u64 addr; > + union { > + u64 addr; /* IB_MR_TYPE_MEM_REG */ > + struct page *page; /* IB_MR_TYPE_USER */ > + }; > }; > > struct rxe_map { > -- > 2.31.1
On 16/11/2022 20:37, Fabio M. De Francesco wrote: >> - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; >> + if (mr->ibmr.type == IB_MR_TYPE_USER) { >> + char *paddr; >> + struct page *pg = mr->map[m]->buf[n].page; >> + >> + paddr = kmap_local_page(pg); >> + if (paddr == NULL) { >> + pr_warn("Failed to map page"); >> + return NULL; >> + } > I know nothing about this code but I am here as a result of regular checks for > changes to HIGHMEM mappings across the entire kernel. So please forgive me if > I'm objecting to the correct changes. > > 1) It looks like this code had a call to page_address() and you converted it > to mapping with kmap_local_page(). > > If page_address() is related and it used to work properly, the page you are > mapping cannot come from ZONE_HIGHMEM. Yes, you are totally right. Therefore, kmap_local_page() looks like > an overkill. The confusion about the page_address() here has been raised for a long time[1][2]. https://www.spinics.net/lists/linux-rdma/msg113206.html https://lore.kernel.org/all/20220121160654.GC773547@iweiny-DESK2.sc.intel.com/ Thanks Zhijian > > I'm probably missing something... >
On Fri, Nov 11, 2022 at 04:30:29AM +0000, Li Zhijian wrote: > diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h > index acab785ba7e2..6080a4b32f09 100644 > --- a/drivers/infiniband/sw/rxe/rxe_verbs.h > +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h > @@ -280,7 +280,10 @@ enum rxe_mr_lookup_type { > #define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct rxe_phys_buf)) > > struct rxe_phys_buf { > - u64 addr; > + union { > + u64 addr; /* IB_MR_TYPE_MEM_REG */ > + struct page *page; /* IB_MR_TYPE_USER */ > + }; Everything rxe handles is a struct page, even the kernel registrations. Jason
On Wed, Nov 16, 2022 at 01:37:14PM +0100, Fabio M. De Francesco wrote: > > - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; > > + if (mr->ibmr.type == IB_MR_TYPE_USER) { > > + char *paddr; > > + struct page *pg = mr->map[m]->buf[n].page; > > + > > + paddr = kmap_local_page(pg); > > + if (paddr == NULL) { > > + pr_warn("Failed to map page"); > > + return NULL; > > + } > > I know nothing about this code but I am here as a result of regular checks for > changes to HIGHMEM mappings across the entire kernel. So please forgive me if > I'm objecting to the correct changes. > > 1) It looks like this code had a call to page_address() and you converted it > to mapping with kmap_local_page(). It only worked properly because the kconfig is blocking CONFIG_HIGHMEM so ZONE_HIGHMEM doesn't exist. These pages are obtained from GUP and could be anything. This is done indirectly through config INFINIBAND_VIRT_DMA def_bool !HIGHMEM But this patch is undoing parts of what are causing that restriction by using the proper APIs. Though I'm not sure if we can eventually get to remove it for RXE completely. Jason
On lunedì 21 novembre 2022 19:53:54 CET Jason Gunthorpe wrote: > On Wed, Nov 16, 2022 at 01:37:14PM +0100, Fabio M. De Francesco wrote: > > > - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; > > > + if (mr->ibmr.type == IB_MR_TYPE_USER) { > > > + char *paddr; > > > + struct page *pg = mr->map[m]->buf[n].page; > > > + > > > + paddr = kmap_local_page(pg); > > > + if (paddr == NULL) { > > > + pr_warn("Failed to map page"); > > > + return NULL; > > > + } > > > > I know nothing about this code but I am here as a result of regular checks > > for changes to HIGHMEM mappings across the entire kernel. So please forgive > > me if I'm objecting to the correct changes. > > > > 1) It looks like this code had a call to page_address() and you converted it > > to mapping with kmap_local_page(). > > It only worked properly because the kconfig is blocking CONFIG_HIGHMEM > so ZONE_HIGHMEM doesn't exist. These pages are obtained from GUP and > could be anything. > > This is done indirectly through > > config INFINIBAND_VIRT_DMA > def_bool !HIGHMEM > But this patch is undoing parts of what are causing that restriction > by using the proper APIs. Though I'm not sure if we can eventually get > to remove it for RXE completely. > Jason Ah, OK. This code was for !HIGHMEM kernels... I understand but, FWIW I doubt that the use of page_address(), instead of kmapping, should ever be made on the bases that kconfig is blocking HIGHMEM. However, if I understand correctly, that restriction (!HIGHMEM) is going to be removed. Therefore, page_address() will break on HIGHMEM and Jason is correctly converting to kmap_local_page(). There is only one thing left... I think that he missed another mail from me. The one you responded to was missing the final paragraph, so I sent another message few minutes later. kmap_local_page() returns always valid pointers to kernel virtual addresses. I can't see any ways for kmap_local_page() to return NULL. Therefore, I was asking for the reason to check for NULL. Thanks, Fabio
On mercoledì 23 novembre 2022 00:24:26 CET Fabio M. De Francesco wrote: > On lunedì 21 novembre 2022 19:53:54 CET Jason Gunthorpe wrote: > > On Wed, Nov 16, 2022 at 01:37:14PM +0100, Fabio M. De Francesco wrote: > > > > - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; > > > > + if (mr->ibmr.type == IB_MR_TYPE_USER) { > > > > + char *paddr; > > > > + struct page *pg = mr->map[m]->buf[n].page; > > > > + > > > > + paddr = kmap_local_page(pg); > > > > + if (paddr == NULL) { > > > > + pr_warn("Failed to map page"); > > > > + return NULL; > > > > + } > > > > > > I know nothing about this code but I am here as a result of regular checks > > > for changes to HIGHMEM mappings across the entire kernel. So please > > forgive > > > > me if I'm objecting to the correct changes. > > > > > > 1) It looks like this code had a call to page_address() and you converted > > it > > > > to mapping with kmap_local_page(). > > > > It only worked properly because the kconfig is blocking CONFIG_HIGHMEM > > so ZONE_HIGHMEM doesn't exist. These pages are obtained from GUP and > > could be anything. > > > > This is done indirectly through > > > > config INFINIBAND_VIRT_DMA > > > > def_bool !HIGHMEM > > > > But this patch is undoing parts of what are causing that restriction > > by using the proper APIs. Though I'm not sure if we can eventually get > > to remove it for RXE completely. > > Jason > > Ah, OK. This code was for !HIGHMEM kernels... > > I understand but, FWIW I doubt that the use of page_address(), instead of > kmapping, should ever be made on the bases that kconfig is blocking HIGHMEM. > > However, if I understand correctly, that restriction (!HIGHMEM) is going to be > removed. Therefore, page_address() will break on HIGHMEM and Jason is > correctly converting to kmap_local_page(). Jason, I'm sorry for the erroneous attribution :-( Fabio > There is only one thing left... I think that he missed another mail from me. > The one you responded to was missing the final paragraph, so I sent another > message few minutes later. > > kmap_local_page() returns always valid pointers to kernel virtual addresses. I > can't see any ways for kmap_local_page() to return NULL. Therefore, I was > asking for the reason to check for NULL. > > Thanks, > > Fabio
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h index c2a5c8814a48..22a8c44d39c8 100644 --- a/drivers/infiniband/sw/rxe/rxe_loc.h +++ b/drivers/infiniband/sw/rxe/rxe_loc.h @@ -73,6 +73,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length, int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info *dma, void *addr, int length, enum rxe_mr_copy_dir dir); void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length); +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr); struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, enum rxe_mr_lookup_type type); int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length); diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c index a4e786b657b7..d26a4a33119c 100644 --- a/drivers/infiniband/sw/rxe/rxe_mr.c +++ b/drivers/infiniband/sw/rxe/rxe_mr.c @@ -120,9 +120,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, struct ib_umem *umem; struct sg_page_iter sg_iter; int num_buf; - void *vaddr; int err; - int i; umem = ib_umem_get(&rxe->ib_dev, start, length, access); if (IS_ERR(umem)) { @@ -159,18 +157,9 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, num_buf = 0; } - vaddr = page_address(sg_page_iter_page(&sg_iter)); - if (!vaddr) { - pr_warn("%s: Unable to get virtual address\n", - __func__); - err = -ENOMEM; - goto err_cleanup_map; - } - - buf->addr = (uintptr_t)vaddr; + buf->page = sg_page_iter_page(&sg_iter); num_buf++; buf++; - } } @@ -182,10 +171,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, return 0; -err_cleanup_map: - for (i = 0; i < mr->num_map; i++) - kfree(mr->map[i]); - kfree(mr->map); err_release_umem: ib_umem_release(umem); err_out: @@ -246,6 +231,12 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out, } } +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr) +{ + if (mr->ibmr.type == IB_MR_TYPE_USER) + kunmap_local(vaddr); +} + static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length) { size_t offset; @@ -258,9 +249,21 @@ static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length) return NULL; } - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; + if (mr->ibmr.type == IB_MR_TYPE_USER) { + char *paddr; + struct page *pg = mr->map[m]->buf[n].page; + + paddr = kmap_local_page(pg); + if (paddr == NULL) { + pr_warn("Failed to map page"); + return NULL; + } + return paddr + offset; + } else + return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; } +/* must call rxe_unmap_vaddr to unmap vaddr */ void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length) { if (mr->state != RXE_MR_STATE_VALID) { @@ -326,6 +329,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length, dest = (dir == RXE_TO_MR_OBJ) ? va : addr; memcpy(dest, src, bytes); + rxe_unmap_vaddr(mr, va); length -= bytes; addr += bytes; diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c index 483043dc4e89..765cb9f8538a 100644 --- a/drivers/infiniband/sw/rxe/rxe_resp.c +++ b/drivers/infiniband/sw/rxe/rxe_resp.c @@ -636,6 +636,7 @@ static enum resp_states atomic_reply(struct rxe_qp *qp, *vaddr = value; spin_unlock_bh(&atomic_ops_lock); + rxe_unmap_vaddr(mr, vaddr); qp->resp.msn++; diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h index acab785ba7e2..6080a4b32f09 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.h +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h @@ -280,7 +280,10 @@ enum rxe_mr_lookup_type { #define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct rxe_phys_buf)) struct rxe_phys_buf { - u64 addr; + union { + u64 addr; /* IB_MR_TYPE_MEM_REG */ + struct page *page; /* IB_MR_TYPE_USER */ + }; }; struct rxe_map {