From patchwork Sat Jan 13 08:59:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 187876 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:693c:2614:b0:101:6a76:bbe3 with SMTP id mm20csp668095dyc; Sat, 13 Jan 2024 01:04:25 -0800 (PST) X-Google-Smtp-Source: AGHT+IHBNO/U2JP3su4JlupkG631kYsCeAghqs3OODqCpM0QnFBjkziYdioogPreYSuvZGL0GBUJ X-Received: by 2002:a05:622a:e:b0:429:97ec:2222 with SMTP id x14-20020a05622a000e00b0042997ec2222mr3381121qtw.103.1705136665457; Sat, 13 Jan 2024 01:04:25 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1705136665; cv=none; d=google.com; s=arc-20160816; b=uAprvdkQvz8mgptPdxtjvd9RLkgPLMAedurKJdfQUg0tjQQs6pAKXjoGnl0y2W+uDh +GSVT64Ujzf6MN9yh8UThfUrbxRfTEcC/OO16uyE4xaXEW3lJDBH3R3mMM5ofX720KZa sHtLNc8wNq/H+h2OQ7/YVTvwBugbbx4V0+4QLr/7eAd4xBo16zDtewnGTzVx2WgEQLw+ ivsJtJ3pFWfhX0he2WfUxJYuk5YQr2QZArVrI35o+ONYlUyZTSGSg+6DzRxUxh4fmY+e AMiib3HIqc3Cm9LLSeRmElbt4p1gh+Dlvxyq3wbutfLznO07SRpB0Wb7hbmK7Uk9s267 m1gA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=nMji0b4tSDZd5rU0x1eYwfnjfxa/o9TG0rwD4+EZRHM=; fh=KWvyQxL3Ff+3WPSMjlYu+P4255AmcMULAsFol6M1vNI=; b=aJBq82uhgaO+QaH2Zh/qfqgdDapyas9n1G85LP97CNheIdfVv6HXuDvZhFLqdeFIYa V8kySFw1ieOoo+7VKQbZiSuwYFq8eECS3R3GwArFYpnIF72poTQT+pBHkLKxfVgjKhTV vAxmL7YvaZL5oFSp6fxxSF70YmJsjAxpn0gmEJcTGCnFKLcN/lKLngQ9hHZPdg6tzBjr GzTedqdFN9Z7uzAlRd5RFwAsQbPaMld8g7D5IVjLNFmyxGi8G8cI8UoHyBfBU4cbd09j 4oiyt4UbEBsZXaKqdaUaAb0eSzvk4Nm4TVfJb4JjsxeR/GefycDjLKvTRKHubH4rzz3K 6dHQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-25222-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-25222-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=hisilicon.com Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id e18-20020ac85992000000b004239fe197a8si4456096qte.649.2024.01.13.01.04.25 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 13 Jan 2024 01:04:25 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-25222-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-25222-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-25222-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=hisilicon.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 35B361C22872 for ; Sat, 13 Jan 2024 09:04:25 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 9127517551; Sat, 13 Jan 2024 09:03:38 +0000 (UTC) Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [45.249.212.190]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6D6C3655; Sat, 13 Jan 2024 09:03:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com Received: from mail.maildlp.com (unknown [172.19.162.112]) by szxga04-in.huawei.com (SkyGuard) with ESMTP id 4TBspq5y84z29jyN; Sat, 13 Jan 2024 17:01:55 +0800 (CST) Received: from kwepemi500006.china.huawei.com (unknown [7.221.188.68]) by mail.maildlp.com (Postfix) with ESMTPS id 5C13C1404F7; Sat, 13 Jan 2024 17:03:30 +0800 (CST) Received: from localhost.localdomain (10.67.165.2) by kwepemi500006.china.huawei.com (7.221.188.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Sat, 13 Jan 2024 17:03:29 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH v2 for-next 1/6] RDMA/hns: Refactor mtr find Date: Sat, 13 Jan 2024 16:59:30 +0800 Message-ID: <20240113085935.2838701-2-huangjunxian6@hisilicon.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20240113085935.2838701-1-huangjunxian6@hisilicon.com> References: <20240113085935.2838701-1-huangjunxian6@hisilicon.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To kwepemi500006.china.huawei.com (7.221.188.68) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1787965384146501369 X-GMAIL-MSGID: 1787965384146501369 From: Chengchang Tang hns_roce_mtr_find() is a collection of multiple functions, and the return value is also difficult to understand, which is not conducive to modification and maintenance. Separate the function of obtaining MTR root BA from this function. And some adjustments has been made to improve readability. Signed-off-by: Chengchang Tang Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_cq.c | 11 +-- drivers/infiniband/hw/hns/hns_roce_device.h | 7 +- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 102 ++++++++++---------- drivers/infiniband/hw/hns/hns_roce_mr.c | 86 +++++++++++------ 4 files changed, 121 insertions(+), 85 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c b/drivers/infiniband/hw/hns/hns_roce_cq.c index 1b6d16af8c12..7250d0643b5c 100644 --- a/drivers/infiniband/hw/hns/hns_roce_cq.c +++ b/drivers/infiniband/hw/hns/hns_roce_cq.c @@ -133,14 +133,12 @@ static int alloc_cqc(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq) struct hns_roce_cq_table *cq_table = &hr_dev->cq_table; struct ib_device *ibdev = &hr_dev->ib_dev; u64 mtts[MTT_MIN_COUNT] = {}; - dma_addr_t dma_handle; int ret; - ret = hns_roce_mtr_find(hr_dev, &hr_cq->mtr, 0, mtts, ARRAY_SIZE(mtts), - &dma_handle); - if (!ret) { + ret = hns_roce_mtr_find(hr_dev, &hr_cq->mtr, 0, mtts, ARRAY_SIZE(mtts)); + if (ret) { ibdev_err(ibdev, "failed to find CQ mtr, ret = %d.\n", ret); - return -EINVAL; + return ret; } /* Get CQC memory HEM(Hardware Entry Memory) table */ @@ -157,7 +155,8 @@ static int alloc_cqc(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq) goto err_put; } - ret = hns_roce_create_cqc(hr_dev, hr_cq, mtts, dma_handle); + ret = hns_roce_create_cqc(hr_dev, hr_cq, mtts, + hns_roce_get_mtr_ba(&hr_cq->mtr)); if (ret) goto err_xa; diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index b1fce5ddf631..dd652dc090b0 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -1152,8 +1152,13 @@ void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev); /* hns roce hw need current block and next block addr from mtt */ #define MTT_MIN_COUNT 2 +static inline dma_addr_t hns_roce_get_mtr_ba(struct hns_roce_mtr *mtr) +{ + return mtr->hem_cfg.root_ba; +} + int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, - u32 offset, u64 *mtt_buf, int mtt_max, u64 *base_addr); + u32 offset, u64 *mtt_buf, int mtt_max); int hns_roce_mtr_create(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, struct hns_roce_buf_attr *buf_attr, unsigned int page_shift, struct ib_udata *udata, diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 8206daea6767..94e9e6a237cf 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -3195,21 +3195,22 @@ static int set_mtpt_pbl(struct hns_roce_dev *hr_dev, u64 pages[HNS_ROCE_V2_MAX_INNER_MTPT_NUM] = { 0 }; struct ib_device *ibdev = &hr_dev->ib_dev; dma_addr_t pbl_ba; - int i, count; + int ret; + int i; - count = hns_roce_mtr_find(hr_dev, &mr->pbl_mtr, 0, pages, - min_t(int, ARRAY_SIZE(pages), mr->npages), - &pbl_ba); - if (count < 1) { - ibdev_err(ibdev, "failed to find PBL mtr, count = %d.\n", - count); - return -ENOBUFS; + ret = hns_roce_mtr_find(hr_dev, &mr->pbl_mtr, 0, pages, + min_t(int, ARRAY_SIZE(pages), mr->npages)); + if (ret) { + ibdev_err(ibdev, "failed to find PBL mtr, ret = %d.\n", ret); + return ret; } /* Aligned to the hardware address access unit */ - for (i = 0; i < count; i++) + for (i = 0; i < ARRAY_SIZE(pages); i++) pages[i] >>= 6; + pbl_ba = hns_roce_get_mtr_ba(&mr->pbl_mtr); + mpt_entry->pbl_size = cpu_to_le32(mr->npages); mpt_entry->pbl_ba_l = cpu_to_le32(pbl_ba >> 3); hr_reg_write(mpt_entry, MPT_PBL_BA_H, upper_32_bits(pbl_ba >> 3)); @@ -3308,18 +3309,12 @@ static int hns_roce_v2_rereg_write_mtpt(struct hns_roce_dev *hr_dev, static int hns_roce_v2_frmr_write_mtpt(struct hns_roce_dev *hr_dev, void *mb_buf, struct hns_roce_mr *mr) { - struct ib_device *ibdev = &hr_dev->ib_dev; + dma_addr_t pbl_ba = hns_roce_get_mtr_ba(&mr->pbl_mtr); struct hns_roce_v2_mpt_entry *mpt_entry; - dma_addr_t pbl_ba = 0; mpt_entry = mb_buf; memset(mpt_entry, 0, sizeof(*mpt_entry)); - if (hns_roce_mtr_find(hr_dev, &mr->pbl_mtr, 0, NULL, 0, &pbl_ba) < 0) { - ibdev_err(ibdev, "failed to find frmr mtr.\n"); - return -ENOBUFS; - } - hr_reg_write(mpt_entry, MPT_ST, V2_MPT_ST_FREE); hr_reg_write(mpt_entry, MPT_PD, mr->pd); @@ -4346,17 +4341,20 @@ static int config_qp_rq_buf(struct hns_roce_dev *hr_dev, { u64 mtts[MTT_MIN_COUNT] = { 0 }; u64 wqe_sge_ba; - int count; + int ret; /* Search qp buf's mtts */ - count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, hr_qp->rq.offset, mtts, - MTT_MIN_COUNT, &wqe_sge_ba); - if (hr_qp->rq.wqe_cnt && count < 1) { + ret = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, hr_qp->rq.offset, mtts, + MTT_MIN_COUNT); + if (hr_qp->rq.wqe_cnt && ret) { ibdev_err(&hr_dev->ib_dev, - "failed to find RQ WQE, QPN = 0x%lx.\n", hr_qp->qpn); - return -EINVAL; + "failed to find QP(0x%lx) RQ WQE buf, ret = %d.\n", + hr_qp->qpn, ret); + return ret; } + wqe_sge_ba = hns_roce_get_mtr_ba(&hr_qp->mtr); + context->wqe_sge_ba = cpu_to_le32(wqe_sge_ba >> 3); qpc_mask->wqe_sge_ba = 0; @@ -4418,23 +4416,23 @@ static int config_qp_sq_buf(struct hns_roce_dev *hr_dev, struct ib_device *ibdev = &hr_dev->ib_dev; u64 sge_cur_blk = 0; u64 sq_cur_blk = 0; - int count; + int ret; /* search qp buf's mtts */ - count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, 0, &sq_cur_blk, 1, NULL); - if (count < 1) { - ibdev_err(ibdev, "failed to find QP(0x%lx) SQ buf.\n", - hr_qp->qpn); - return -EINVAL; + ret = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, hr_qp->sq.offset, + &sq_cur_blk, 1); + if (ret) { + ibdev_err(ibdev, "failed to find QP(0x%lx) SQ WQE buf, ret = %d.\n", + hr_qp->qpn, ret); + return ret; } if (hr_qp->sge.sge_cnt > 0) { - count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, - hr_qp->sge.offset, - &sge_cur_blk, 1, NULL); - if (count < 1) { - ibdev_err(ibdev, "failed to find QP(0x%lx) SGE buf.\n", - hr_qp->qpn); - return -EINVAL; + ret = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, + hr_qp->sge.offset, &sge_cur_blk, 1); + if (ret) { + ibdev_err(ibdev, "failed to find QP(0x%lx) SGE buf, ret = %d.\n", + hr_qp->qpn, ret); + return ret; } } @@ -5581,18 +5579,20 @@ static int hns_roce_v2_write_srqc_index_queue(struct hns_roce_srq *srq, struct ib_device *ibdev = srq->ibsrq.device; struct hns_roce_dev *hr_dev = to_hr_dev(ibdev); u64 mtts_idx[MTT_MIN_COUNT] = {}; - dma_addr_t dma_handle_idx = 0; + dma_addr_t dma_handle_idx; int ret; /* Get physical address of idx que buf */ ret = hns_roce_mtr_find(hr_dev, &idx_que->mtr, 0, mtts_idx, - ARRAY_SIZE(mtts_idx), &dma_handle_idx); - if (ret < 1) { + ARRAY_SIZE(mtts_idx)); + if (ret) { ibdev_err(ibdev, "failed to find mtr for SRQ idx, ret = %d.\n", ret); - return -ENOBUFS; + return ret; } + dma_handle_idx = hns_roce_get_mtr_ba(&idx_que->mtr); + hr_reg_write(ctx, SRQC_IDX_HOP_NUM, to_hr_hem_hopnum(hr_dev->caps.idx_hop_num, srq->wqe_cnt)); @@ -5624,20 +5624,22 @@ static int hns_roce_v2_write_srqc(struct hns_roce_srq *srq, void *mb_buf) struct hns_roce_dev *hr_dev = to_hr_dev(ibdev); struct hns_roce_srq_context *ctx = mb_buf; u64 mtts_wqe[MTT_MIN_COUNT] = {}; - dma_addr_t dma_handle_wqe = 0; + dma_addr_t dma_handle_wqe; int ret; memset(ctx, 0, sizeof(*ctx)); /* Get the physical address of srq buf */ ret = hns_roce_mtr_find(hr_dev, &srq->buf_mtr, 0, mtts_wqe, - ARRAY_SIZE(mtts_wqe), &dma_handle_wqe); - if (ret < 1) { + ARRAY_SIZE(mtts_wqe)); + if (ret) { ibdev_err(ibdev, "failed to find mtr for SRQ WQE, ret = %d.\n", ret); - return -ENOBUFS; + return ret; } + dma_handle_wqe = hns_roce_get_mtr_ba(&srq->buf_mtr); + hr_reg_write(ctx, SRQC_SRQ_ST, 1); hr_reg_write_bool(ctx, SRQC_SRQ_TYPE, srq->ibsrq.srq_type == IB_SRQT_XRC); @@ -6353,7 +6355,7 @@ static int config_eqc(struct hns_roce_dev *hr_dev, struct hns_roce_eq *eq, u64 eqe_ba[MTT_MIN_COUNT] = { 0 }; struct hns_roce_eq_context *eqc; u64 bt_ba = 0; - int count; + int ret; eqc = mb_buf; memset(eqc, 0, sizeof(struct hns_roce_eq_context)); @@ -6361,13 +6363,15 @@ static int config_eqc(struct hns_roce_dev *hr_dev, struct hns_roce_eq *eq, init_eq_config(hr_dev, eq); /* if not multi-hop, eqe buffer only use one trunk */ - count = hns_roce_mtr_find(hr_dev, &eq->mtr, 0, eqe_ba, MTT_MIN_COUNT, - &bt_ba); - if (count < 1) { - dev_err(hr_dev->dev, "failed to find EQE mtr\n"); - return -ENOBUFS; + ret = hns_roce_mtr_find(hr_dev, &eq->mtr, 0, eqe_ba, + ARRAY_SIZE(eqe_ba)); + if (ret) { + dev_err(hr_dev->dev, "failed to find EQE mtr, ret = %d\n", ret); + return ret; } + bt_ba = hns_roce_get_mtr_ba(&eq->mtr); + hr_reg_write(eqc, EQC_EQ_ST, HNS_ROCE_V2_EQ_STATE_VALID); hr_reg_write(eqc, EQC_EQE_HOP_NUM, eq->hop_num); hr_reg_write(eqc, EQC_OVER_IGNORE, eq->over_ignore); diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c index 91cd580480fe..9537a2c00bb6 100644 --- a/drivers/infiniband/hw/hns/hns_roce_mr.c +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c @@ -809,47 +809,53 @@ int hns_roce_mtr_map(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, return ret; } -int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, - u32 offset, u64 *mtt_buf, int mtt_max, u64 *base_addr) +static int hns_roce_get_direct_addr_mtt(struct hns_roce_hem_cfg *cfg, + u32 start_index, u64 *mtt_buf, + int mtt_cnt) { - struct hns_roce_hem_cfg *cfg = &mtr->hem_cfg; - int mtt_count, left; - u32 start_index; + int mtt_count; int total = 0; - __le64 *mtts; u32 npage; u64 addr; - if (!mtt_buf || mtt_max < 1) - goto done; - - /* no mtt memory in direct mode, so just return the buffer address */ - if (cfg->is_direct) { - start_index = offset >> HNS_HW_PAGE_SHIFT; - for (mtt_count = 0; mtt_count < cfg->region_count && - total < mtt_max; mtt_count++) { - npage = cfg->region[mtt_count].offset; - if (npage < start_index) - continue; + if (mtt_cnt > cfg->region_count) + return -EINVAL; - addr = cfg->root_ba + (npage << HNS_HW_PAGE_SHIFT); - mtt_buf[total] = addr; + for (mtt_count = 0; mtt_count < cfg->region_count && total < mtt_cnt; + mtt_count++) { + npage = cfg->region[mtt_count].offset; + if (npage < start_index) + continue; - total++; - } + addr = cfg->root_ba + (npage << HNS_HW_PAGE_SHIFT); + mtt_buf[total] = addr; - goto done; + total++; } - start_index = offset >> cfg->buf_pg_shift; - left = mtt_max; + if (!total) + return -ENOENT; + + return 0; +} + +static int hns_roce_get_mhop_mtt(struct hns_roce_dev *hr_dev, + struct hns_roce_mtr *mtr, u32 start_index, + u64 *mtt_buf, int mtt_cnt) +{ + int left = mtt_cnt; + int total = 0; + int mtt_count; + __le64 *mtts; + u32 npage; + while (left > 0) { mtt_count = 0; mtts = hns_roce_hem_list_find_mtt(hr_dev, &mtr->hem_list, start_index + total, &mtt_count); if (!mtts || !mtt_count) - goto done; + break; npage = min(mtt_count, left); left -= npage; @@ -857,11 +863,33 @@ int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, mtt_buf[total++] = le64_to_cpu(mtts[mtt_count]); } -done: - if (base_addr) - *base_addr = cfg->root_ba; + if (!total) + return -ENOENT; + + return 0; +} + +int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, + u32 offset, u64 *mtt_buf, int mtt_max) +{ + struct hns_roce_hem_cfg *cfg = &mtr->hem_cfg; + u32 start_index; + int ret; + + if (!mtt_buf || mtt_max < 1) + return -EINVAL; - return total; + /* no mtt memory in direct mode, so just return the buffer address */ + if (cfg->is_direct) { + start_index = offset >> HNS_HW_PAGE_SHIFT; + ret = hns_roce_get_direct_addr_mtt(cfg, start_index, + mtt_buf, mtt_max); + } else { + start_index = offset >> cfg->buf_pg_shift; + ret = hns_roce_get_mhop_mtt(hr_dev, mtr, start_index, + mtt_buf, mtt_max); + } + return ret; } static int mtr_init_buf_cfg(struct hns_roce_dev *hr_dev, From patchwork Sat Jan 13 08:59:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 187877 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:693c:2614:b0:101:6a76:bbe3 with SMTP id mm20csp668342dyc; Sat, 13 Jan 2024 01:05:06 -0800 (PST) X-Google-Smtp-Source: AGHT+IGKyBcgceMz1i/rYgG6N6c+EAvSUgp71dFpmvRHoma4/LCvHxrMxVRmum8jq/F1qTvz7D/w X-Received: by 2002:a05:600c:1d26:b0:40e:61f1:ba3 with SMTP id l38-20020a05600c1d2600b0040e61f10ba3mr1315596wms.36.1705136705850; Sat, 13 Jan 2024 01:05:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1705136705; cv=none; d=google.com; s=arc-20160816; b=dif0S9xQiFzsPv9PWypbATu1Z2FooBhOzbQwNtb1bFM+Blry0fYEOkr8qEB4P6IsNJ 7ptZmSFD0GZVEgHBukX4v/xbRiP5QxGYky/OroUxZ3NQz1BRqrd/OarS4x7lce71JqD1 Ci+oWSmk7rSaNQzCclf8rJT0zDZBmYwBLtUtW5W4XZFLkAXSqZUaBWEQnO6fukwV9Xa1 SPxGJF9DxbFvPes6hlNJ4D59gDCf8J1uOb7hcQ7grUvMefO/K3rHau0IneFN9DY+o8+/ fJt+yOGTQBpu8DXrMkT7Wsmi1FxDDoPRVj/h73XrzEq+P+qH6UB+P/AAsnwMT73KWQLa hudQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=4enNPqHj4NAmJuKN5eRT1RgZ9Y+4BwwLTSdSl7oxMYo=; fh=KWvyQxL3Ff+3WPSMjlYu+P4255AmcMULAsFol6M1vNI=; b=iwoa5nMvVgwMDbPqQEIZRywA9EJ7mF9O8ZvTqCUZNVqgrMZO9b4r84MinaMsg+iC1j Y0k7fiYytRRwRYVJV2bw6DHd4DQSuoOtdJt3HUfnED0PWQGnrt0ShaMZlR1yViWFrKo5 Pgj1KYaQt5KtmcUEv5eXF1Ziv4lHEQlZfr0d/xkNNQ6t5KIsrkxAuQmMSYRcv3muhPVR cEKOHX+Mm4qmRIp3u/QPBIh3JsqarQYwbrQ8wcguGAagMgI5Jp6C1D4KhdPRgZpFXrA/ hHMM3vmjVU/rHOjC70K60P0DVp/HDF45VWk3gMfBSw/Wsxw6CuAR8CJYz4p3BQHm4oHB 8ViQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-25226-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-25226-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=hisilicon.com Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id y7-20020a50eb07000000b00556ab6b5ab4si2124690edp.148.2024.01.13.01.05.05 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 13 Jan 2024 01:05:05 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-25226-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-25226-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-25226-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=hisilicon.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 4606F1F2362C for ; Sat, 13 Jan 2024 09:05:05 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 69FE517592; Sat, 13 Jan 2024 09:03:42 +0000 (UTC) Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 88A1917550; Sat, 13 Jan 2024 09:03:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com Received: from mail.maildlp.com (unknown [172.19.162.254]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4TBsq632W3zvTkk; Sat, 13 Jan 2024 17:02:10 +0800 (CST) Received: from kwepemi500006.china.huawei.com (unknown [7.221.188.68]) by mail.maildlp.com (Postfix) with ESMTPS id AA74D18001D; Sat, 13 Jan 2024 17:03:30 +0800 (CST) Received: from localhost.localdomain (10.67.165.2) by kwepemi500006.china.huawei.com (7.221.188.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Sat, 13 Jan 2024 17:03:30 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH v2 for-next 2/6] RDMA/hns: Refactor mtr_init_buf_cfg() Date: Sat, 13 Jan 2024 16:59:31 +0800 Message-ID: <20240113085935.2838701-3-huangjunxian6@hisilicon.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20240113085935.2838701-1-huangjunxian6@hisilicon.com> References: <20240113085935.2838701-1-huangjunxian6@hisilicon.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To kwepemi500006.china.huawei.com (7.221.188.68) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1787965426570749815 X-GMAIL-MSGID: 1787965426570749815 From: Chengchang Tang page_shift and page_cnt is only used in mtr_map_bufs(). And these parameter could be calculated indepedently. Strip the computation of page_shift and page_cnt from mtr_init_buf_cfg(), reducing the number of parameters of it. This helps reducing coupling between mtr_init_buf_cfg() and mtr_map_bufs(). Signed-off-by: Chengchang Tang Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_mr.c | 76 +++++++++++++++---------- 1 file changed, 45 insertions(+), 31 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c index 9537a2c00bb6..adc401aea8df 100644 --- a/drivers/infiniband/hw/hns/hns_roce_mr.c +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c @@ -707,14 +707,37 @@ static int mtr_alloc_bufs(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, return 0; } -static int mtr_map_bufs(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, - int page_count, unsigned int page_shift) +static int cal_mtr_pg_cnt(struct hns_roce_mtr *mtr) +{ + struct hns_roce_buf_region *region; + int page_cnt = 0; + int i; + + for (i = 0; i < mtr->hem_cfg.region_count; i++) { + region = &mtr->hem_cfg.region[i]; + page_cnt += region->count; + } + + return page_cnt; +} + +static int mtr_map_bufs(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr) { struct ib_device *ibdev = &hr_dev->ib_dev; + int page_count = cal_mtr_pg_cnt(mtr); + unsigned int page_shift; dma_addr_t *pages; int npage; int ret; + /* When HEM buffer uses 0-level addressing, the page size is + * equal to the whole buffer size, and we split the buffer into + * small pages which is used to check whether the adjacent + * units are in the continuous space and its size is fixed to + * 4K based on hns ROCEE's requirement. + */ + page_shift = mtr->hem_cfg.is_direct ? HNS_HW_PAGE_SHIFT : + mtr->hem_cfg.buf_pg_shift; /* alloc a tmp array to store buffer's dma address */ pages = kvcalloc(page_count, sizeof(dma_addr_t), GFP_KERNEL); if (!pages) @@ -894,37 +917,30 @@ int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, static int mtr_init_buf_cfg(struct hns_roce_dev *hr_dev, struct hns_roce_buf_attr *attr, - struct hns_roce_hem_cfg *cfg, - unsigned int *buf_page_shift, u64 unalinged_size) + struct hns_roce_hem_cfg *cfg, u64 unalinged_size) { + struct ib_device *ibdev = &hr_dev->ib_dev; struct hns_roce_buf_region *r; u64 first_region_padding; int page_cnt, region_cnt; - unsigned int page_shift; + size_t buf_pg_sz; size_t buf_size; /* If mtt is disabled, all pages must be within a continuous range */ cfg->is_direct = !mtr_has_mtt(attr); buf_size = mtr_bufs_size(attr); if (cfg->is_direct) { - /* When HEM buffer uses 0-level addressing, the page size is - * equal to the whole buffer size, and we split the buffer into - * small pages which is used to check whether the adjacent - * units are in the continuous space and its size is fixed to - * 4K based on hns ROCEE's requirement. - */ - page_shift = HNS_HW_PAGE_SHIFT; - - /* The ROCEE requires the page size to be 4K * 2 ^ N. */ + buf_pg_sz = HNS_HW_PAGE_SIZE; cfg->buf_pg_count = 1; + /* The ROCEE requires the page size to be 4K * 2 ^ N. */ cfg->buf_pg_shift = HNS_HW_PAGE_SHIFT + order_base_2(DIV_ROUND_UP(buf_size, HNS_HW_PAGE_SIZE)); first_region_padding = 0; } else { - page_shift = attr->page_shift; cfg->buf_pg_count = DIV_ROUND_UP(buf_size + unalinged_size, - 1 << page_shift); - cfg->buf_pg_shift = page_shift; + 1 << attr->page_shift); + cfg->buf_pg_shift = attr->page_shift; + buf_pg_sz = 1 << cfg->buf_pg_shift; first_region_padding = unalinged_size; } @@ -937,7 +953,7 @@ static int mtr_init_buf_cfg(struct hns_roce_dev *hr_dev, r->offset = page_cnt; buf_size = hr_hw_page_align(attr->region[region_cnt].size + first_region_padding); - r->count = DIV_ROUND_UP(buf_size, 1 << page_shift); + r->count = DIV_ROUND_UP(buf_size, buf_pg_sz); first_region_padding = 0; page_cnt += r->count; r->hopnum = to_hr_hem_hopnum(attr->region[region_cnt].hopnum, @@ -945,9 +961,13 @@ static int mtr_init_buf_cfg(struct hns_roce_dev *hr_dev, } cfg->region_count = region_cnt; - *buf_page_shift = page_shift; + if (cfg->region_count < 1 || cfg->buf_pg_shift < HNS_HW_PAGE_SHIFT) { + ibdev_err(ibdev, "failed to init mtr cfg, count %d shift %u.\n", + cfg->region_count, cfg->buf_pg_shift); + return -EINVAL; + } - return page_cnt; + return 0; } static u64 cal_pages_per_l1ba(unsigned int ba_per_bt, unsigned int hopnum) @@ -1035,18 +1055,12 @@ int hns_roce_mtr_create(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, unsigned long user_addr) { struct ib_device *ibdev = &hr_dev->ib_dev; - unsigned int buf_page_shift = 0; - int buf_page_cnt; int ret; - buf_page_cnt = mtr_init_buf_cfg(hr_dev, buf_attr, &mtr->hem_cfg, - &buf_page_shift, - udata ? user_addr & ~PAGE_MASK : 0); - if (buf_page_cnt < 1 || buf_page_shift < HNS_HW_PAGE_SHIFT) { - ibdev_err(ibdev, "failed to init mtr cfg, count %d shift %u.\n", - buf_page_cnt, buf_page_shift); - return -EINVAL; - } + ret = mtr_init_buf_cfg(hr_dev, buf_attr, &mtr->hem_cfg, + udata ? user_addr & ~PAGE_MASK : 0); + if (ret) + return ret; ret = mtr_alloc_mtt(hr_dev, mtr, ba_page_shift); if (ret) { @@ -1070,7 +1084,7 @@ int hns_roce_mtr_create(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, } /* Write buffer's dma address to MTT */ - ret = mtr_map_bufs(hr_dev, mtr, buf_page_cnt, buf_page_shift); + ret = mtr_map_bufs(hr_dev, mtr); if (ret) ibdev_err(ibdev, "failed to map mtr bufs, ret = %d.\n", ret); else From patchwork Sat Jan 13 08:59:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 187878 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:693c:2614:b0:101:6a76:bbe3 with SMTP id mm20csp668388dyc; Sat, 13 Jan 2024 01:05:14 -0800 (PST) X-Google-Smtp-Source: AGHT+IHlA9zsUsOedm+NAeTe938fSb6ztM9FQhRCpVz6ORT2CetAlYGM3VSygZi88nTX/dQ9ESM2 X-Received: by 2002:a05:620a:2b4b:b0:783:515d:e9c4 with SMTP id dp11-20020a05620a2b4b00b00783515de9c4mr515957qkb.141.1705136714664; Sat, 13 Jan 2024 01:05:14 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1705136714; cv=none; d=google.com; s=arc-20160816; b=iIqfHCp3CZsVgoTb5nqMoi5dy3xajqxjasyTwxwBRGYIdLOFarjlrzh3ec+fIRW4w6 i2aDO5OMdR3yQJ3tLwRd+eBdDDSlyw19Od9oQOd8p5qBRTO6iVWsDGw75W9BRe3Hry43 v2wY6Uxfb7AVdcaCOdOGPdXgbW9SyOLNAP1PAoSU0VuV4QQ+kwVOH8d3g3IagPyxe6av zJJbdJYuAT3SM8/HOjyioGqV1j3K1UJEzFOo8dyvYH1ETKh3EPGwyOrYgoimBoGAeuWW /Cg5XNcd2SKqRTVbOX8ImxirJjMSZITW7C+A/rtNvVCdTNIBWufTdcp+hoHRy/+MwO84 HLkg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=K0M9it2wlHB1GKEUWGtPWukvGFoiwICJpK6OhfUCL0Q=; fh=KWvyQxL3Ff+3WPSMjlYu+P4255AmcMULAsFol6M1vNI=; b=kY2VeO3qGPgXt2Nl6WARxIO7tX+aWEj2IaJrowYGfzqS+i9eCXndA38OOCo2L/6GZz CYriRBYn2iivOTpxAoIjWZm6pdhetyDLXJ2lpXK1tneUSOIH+oBlZlhUj5lS4ZNTx0iW 7GHH4Aq8jHVca1TdztpMRA2Nx5pIq1UcOhQHw3naRucJHUqiRoLQI6MByiGlNpttVTfE JgomortkGKhiT0njE+EKeF26GZCHhcdLwf0g7cXs8Kqwnf4VNuKFJV3CNZ+6PuHR7dPg MxsBwjDK5liZJEjzLgYEvhgCalqolVTulrtMA5mQ6dcBN+vUorajlA96nwPDiTwo1t26 ANFQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-25227-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-25227-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=hisilicon.com Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id g14-20020a05620a108e00b0078324a9ede1si4354126qkk.524.2024.01.13.01.05.14 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 13 Jan 2024 01:05:14 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-25227-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-25227-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-25227-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=hisilicon.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 672171C20B12 for ; Sat, 13 Jan 2024 09:05:14 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0D74B175A9; Sat, 13 Jan 2024 09:03:44 +0000 (UTC) Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9848A17554; Sat, 13 Jan 2024 09:03:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com Received: from mail.maildlp.com (unknown [172.19.88.194]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4TBsqk70X2zsVqY; Sat, 13 Jan 2024 17:02:42 +0800 (CST) Received: from kwepemi500006.china.huawei.com (unknown [7.221.188.68]) by mail.maildlp.com (Postfix) with ESMTPS id 549801402E2; Sat, 13 Jan 2024 17:03:31 +0800 (CST) Received: from localhost.localdomain (10.67.165.2) by kwepemi500006.china.huawei.com (7.221.188.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Sat, 13 Jan 2024 17:03:30 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH v2 for-next 4/6] RDMA/hns: Support flexible umem page size Date: Sat, 13 Jan 2024 16:59:33 +0800 Message-ID: <20240113085935.2838701-5-huangjunxian6@hisilicon.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20240113085935.2838701-1-huangjunxian6@hisilicon.com> References: <20240113085935.2838701-1-huangjunxian6@hisilicon.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To kwepemi500006.china.huawei.com (7.221.188.68) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1787965435734746696 X-GMAIL-MSGID: 1787965435734746696 From: Chengchang Tang In the current implementation, a fixed page size is used to configure the umem PBL, which is not flexible enough and is not conducive to the performance of the HW. Find a best page size to get better performance. Signed-off-by: Chengchang Tang Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_device.h | 9 ++ drivers/infiniband/hw/hns/hns_roce_mr.c | 121 ++++++++++++++------ 2 files changed, 92 insertions(+), 38 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index dd652dc090b0..1a8516019516 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -179,6 +179,7 @@ enum { #define HNS_ROCE_CMD_SUCCESS 1 +#define HNS_ROCE_MAX_HOP_NUM 3 /* The minimum page size is 4K for hardware */ #define HNS_HW_PAGE_SHIFT 12 #define HNS_HW_PAGE_SIZE (1 << HNS_HW_PAGE_SHIFT) @@ -269,6 +270,11 @@ struct hns_roce_hem_list { dma_addr_t root_ba; /* pointer to the root ba table */ }; +enum mtr_type { + MTR_DEFAULT = 0, + MTR_PBL, +}; + struct hns_roce_buf_attr { struct { size_t size; /* region size */ @@ -277,7 +283,10 @@ struct hns_roce_buf_attr { unsigned int region_count; /* valid region count */ unsigned int page_shift; /* buffer page shift */ unsigned int user_access; /* umem access flag */ + u64 iova; + enum mtr_type type; bool mtt_only; /* only alloc buffer-required MTT memory */ + bool adaptive; /* adaptive for page_shift and hopnum */ }; struct hns_roce_hem_cfg { diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c index 74ea9d8482b9..d00b4aa7214b 100644 --- a/drivers/infiniband/hw/hns/hns_roce_mr.c +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c @@ -32,6 +32,7 @@ */ #include +#include #include #include #include "hns_roce_device.h" @@ -103,6 +104,10 @@ static int alloc_mr_pbl(struct hns_roce_dev *hr_dev, struct hns_roce_mr *mr, buf_attr.user_access = mr->access; /* fast MR's buffer is alloced before mapping, not at creation */ buf_attr.mtt_only = is_fast; + buf_attr.iova = mr->iova; + /* pagesize and hopnum is fixed for fast MR */ + buf_attr.adaptive = !is_fast; + buf_attr.type = MTR_PBL; err = hns_roce_mtr_create(hr_dev, &mr->pbl_mtr, &buf_attr, hr_dev->caps.pbl_ba_pg_sz + PAGE_SHIFT, @@ -721,6 +726,16 @@ static int cal_mtr_pg_cnt(struct hns_roce_mtr *mtr) return page_cnt; } +static bool need_split_huge_page(struct hns_roce_mtr *mtr) +{ + /* When HEM buffer uses 0-level addressing, the page size is + * equal to the whole buffer size. If the current MTR has multiple + * regions, we split the buffer into small pages(4k, required by hns + * ROCEE). These pages will be used in multiple regions. + */ + return mtr->hem_cfg.is_direct && mtr->hem_cfg.region_count > 1; +} + static int mtr_map_bufs(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr) { struct ib_device *ibdev = &hr_dev->ib_dev; @@ -730,14 +745,8 @@ static int mtr_map_bufs(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr) int npage; int ret; - /* When HEM buffer uses 0-level addressing, the page size is - * equal to the whole buffer size, and we split the buffer into - * small pages which is used to check whether the adjacent - * units are in the continuous space and its size is fixed to - * 4K based on hns ROCEE's requirement. - */ - page_shift = mtr->hem_cfg.is_direct ? HNS_HW_PAGE_SHIFT : - mtr->hem_cfg.buf_pg_shift; + page_shift = need_split_huge_page(mtr) ? HNS_HW_PAGE_SHIFT : + mtr->hem_cfg.buf_pg_shift; /* alloc a tmp array to store buffer's dma address */ pages = kvcalloc(page_count, sizeof(dma_addr_t), GFP_KERNEL); if (!pages) @@ -757,7 +766,7 @@ static int mtr_map_bufs(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr) goto err_alloc_list; } - if (mtr->hem_cfg.is_direct && npage > 1) { + if (need_split_huge_page(mtr) && npage > 1) { ret = mtr_check_direct_pages(pages, npage, page_shift); if (ret) { ibdev_err(ibdev, "failed to check %s page: %d / %d.\n", @@ -915,56 +924,89 @@ int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, return ret; } -static int mtr_init_buf_cfg(struct hns_roce_dev *hr_dev, - struct hns_roce_buf_attr *attr, - struct hns_roce_hem_cfg *cfg, u64 unalinged_size) +static int get_best_page_shift(struct hns_roce_dev *hr_dev, + struct hns_roce_mtr *mtr, + struct hns_roce_buf_attr *buf_attr) +{ + unsigned int page_sz; + + if (!buf_attr->adaptive || buf_attr->type != MTR_PBL || !mtr->umem) + return 0; + + page_sz = ib_umem_find_best_pgsz(mtr->umem, + hr_dev->caps.page_size_cap, + buf_attr->iova); + if (!page_sz) + return -EINVAL; + + buf_attr->page_shift = order_base_2(page_sz); + return 0; +} + +static bool is_buf_attr_valid(struct hns_roce_dev *hr_dev, + struct hns_roce_buf_attr *attr) { struct ib_device *ibdev = &hr_dev->ib_dev; + + if (attr->region_count > ARRAY_SIZE(attr->region) || + attr->region_count < 1 || attr->page_shift < HNS_HW_PAGE_SHIFT) { + ibdev_err(ibdev, + "invalid buf attr, region count %d, page shift %u.\n", + attr->region_count, attr->page_shift); + return false; + } + + return true; +} + +static int mtr_init_buf_cfg(struct hns_roce_dev *hr_dev, + struct hns_roce_mtr *mtr, + struct hns_roce_buf_attr *attr) +{ + struct hns_roce_hem_cfg *cfg = &mtr->hem_cfg; struct hns_roce_buf_region *r; - u64 first_region_padding; - int page_cnt, region_cnt; size_t buf_pg_sz; size_t buf_size; + int page_cnt, i; + u64 pgoff = 0; + + if (!is_buf_attr_valid(hr_dev, attr)) + return -EINVAL; /* If mtt is disabled, all pages must be within a continuous range */ cfg->is_direct = !mtr_has_mtt(attr); + cfg->region_count = attr->region_count; buf_size = mtr_bufs_size(attr); - if (cfg->is_direct) { + if (need_split_huge_page(mtr)) { buf_pg_sz = HNS_HW_PAGE_SIZE; cfg->buf_pg_count = 1; /* The ROCEE requires the page size to be 4K * 2 ^ N. */ cfg->buf_pg_shift = HNS_HW_PAGE_SHIFT + order_base_2(DIV_ROUND_UP(buf_size, HNS_HW_PAGE_SIZE)); - first_region_padding = 0; } else { - cfg->buf_pg_count = DIV_ROUND_UP(buf_size + unalinged_size, - 1 << attr->page_shift); + buf_pg_sz = 1 << attr->page_shift; + cfg->buf_pg_count = mtr->umem ? + ib_umem_num_dma_blocks(mtr->umem, buf_pg_sz) : + DIV_ROUND_UP(buf_size, buf_pg_sz); cfg->buf_pg_shift = attr->page_shift; - buf_pg_sz = 1 << cfg->buf_pg_shift; - first_region_padding = unalinged_size; + pgoff = mtr->umem ? mtr->umem->address & ~PAGE_MASK : 0; } /* Convert buffer size to page index and page count for each region and * the buffer's offset needs to be appended to the first region. */ - for (page_cnt = 0, region_cnt = 0; region_cnt < attr->region_count && - region_cnt < ARRAY_SIZE(cfg->region); region_cnt++) { - r = &cfg->region[region_cnt]; + for (page_cnt = 0, i = 0; i < attr->region_count; i++) { + r = &cfg->region[i]; r->offset = page_cnt; - buf_size = hr_hw_page_align(attr->region[region_cnt].size + - first_region_padding); - r->count = DIV_ROUND_UP(buf_size, buf_pg_sz); - first_region_padding = 0; - page_cnt += r->count; - r->hopnum = to_hr_hem_hopnum(attr->region[region_cnt].hopnum, - r->count); - } + buf_size = hr_hw_page_align(attr->region[i].size + pgoff); + if (attr->type == MTR_PBL && mtr->umem) + r->count = ib_umem_num_dma_blocks(mtr->umem, buf_pg_sz); + else + r->count = DIV_ROUND_UP(buf_size, buf_pg_sz); - cfg->region_count = region_cnt; - if (cfg->region_count < 1 || cfg->buf_pg_shift < HNS_HW_PAGE_SHIFT) { - ibdev_err(ibdev, "failed to init mtr cfg, count %d shift %u.\n", - cfg->region_count, cfg->buf_pg_shift); - return -EINVAL; + pgoff = 0; + page_cnt += r->count; + r->hopnum = to_hr_hem_hopnum(attr->region[i].hopnum, r->count); } return 0; @@ -1054,7 +1096,6 @@ int hns_roce_mtr_create(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, unsigned int ba_page_shift, struct ib_udata *udata, unsigned long user_addr) { - u64 pgoff = udata ? user_addr & ~PAGE_MASK : 0; struct ib_device *ibdev = &hr_dev->ib_dev; int ret; @@ -1071,9 +1112,13 @@ int hns_roce_mtr_create(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr, "failed to alloc mtr bufs, ret = %d.\n", ret); return ret; } + + ret = get_best_page_shift(hr_dev, mtr, buf_attr); + if (ret) + goto err_init_buf; } - ret = mtr_init_buf_cfg(hr_dev, buf_attr, &mtr->hem_cfg, pgoff); + ret = mtr_init_buf_cfg(hr_dev, mtr, buf_attr); if (ret) goto err_init_buf;