From patchwork Thu Nov 9 05:44:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Daisuke Matsuda (Fujitsu)"
X-Patchwork-Id: 163234
From: Daisuke Matsuda
To: linux-rdma@vger.kernel.org, leon@kernel.org, jgg@ziepe.ca,
 zyjzyj2000@gmail.com
Cc: linux-kernel@vger.kernel.org, rpearsonhpe@gmail.com,
 yangx.jy@fujitsu.com, lizhijian@fujitsu.com, y-goto@fujitsu.com,
 Daisuke Matsuda
Subject: [PATCH for-next v7 7/7] RDMA/rxe: Add support for the traditional
 Atomic operations with ODP
Date: Thu, 9 Nov 2023 14:44:52 +0900
X-Mailer: git-send-email 2.39.1
X-Mailing-List: linux-kernel@vger.kernel.org

Enable 'fetch and add' and 'compare and swap' operations to be used with
ODP. This is comprised of the following steps:
 1. Verify that the page is present with write permission.
 2. If OK, execute the operation and exit.
 3. If not, then trigger page fault to map the page.
 4. Update the entry in the MR xarray.
 5. Execute the operation.

Signed-off-by: Daisuke Matsuda
---
 drivers/infiniband/sw/rxe/rxe.c      |  1 +
 drivers/infiniband/sw/rxe/rxe_loc.h  |  9 ++++++++
 drivers/infiniband/sw/rxe/rxe_mr.c   |  7 +++++-
 drivers/infiniband/sw/rxe/rxe_odp.c  | 33 ++++++++++++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_resp.c |  5 ++++-
 5 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 207a022156f0..abd3267c2873 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -88,6 +88,7 @@ static void rxe_init_device_param(struct rxe_dev *rxe)
 		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_RECV;
 		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_WRITE;
 		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_READ;
+		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_ATOMIC;
 		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;
 	}
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index eeaeff8a1398..0bae9044f362 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -194,6 +194,9 @@ int rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length,
 			 u64 iova, int access_flags, struct rxe_mr *mr);
 int rxe_odp_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 		    enum rxe_mr_copy_dir dir);
+int rxe_odp_mr_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
+			 u64 compare, u64 swap_add, u64 *orig_val);
+
 #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 static inline int
 rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
@@ -207,6 +210,12 @@ rxe_odp_mr_copy(struct rxe_mr *mr, u64 iova, void *addr,
 {
 	return -EOPNOTSUPP;
 }
+static inline int
+rxe_odp_mr_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
+		     u64 compare, u64 swap_add, u64 *orig_val)
+{
+	return RESPST_ERR_UNSUPPORTED_OPCODE;
+}

 #endif /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index f0ce87c0fc7d..0dc452ab772b 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -498,7 +498,12 @@ int rxe_mr_do_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
 		}
 		page_offset = rxe_mr_iova_to_page_offset(mr, iova);
 		index = rxe_mr_iova_to_index(mr, iova);
-		page = xa_load(&mr->page_list, index);
+
+		if (mr->umem->is_odp)
+			page = xa_untag_pointer(xa_load(&mr->page_list, index));
+		else
+			page = xa_load(&mr->page_list, index);
+
 		if (!page)
 			return RESPST_ERR_RKEY_VIOLATION;
 	}
diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
index 5aa09b9c1095..45b54ba15210 100644
--- a/drivers/infiniband/sw/rxe/rxe_odp.c
+++ b/drivers/infiniband/sw/rxe/rxe_odp.c
@@ -254,3 +254,36 @@ int rxe_odp_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,

 	return err;
 }
+
+int rxe_odp_mr_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
+			 u64 compare, u64 swap_add, u64 *orig_val)
+{
+	struct ib_umem_odp *umem_odp = to_ib_umem_odp(mr->umem);
+	int err;
+
+	spin_lock(&mr->page_list.xa_lock);
+
+	/* Atomic operations manipulate a single char. */
+	if (rxe_odp_check_pages(mr, iova, sizeof(char), 0)) {
+		spin_unlock(&mr->page_list.xa_lock);
+
+		/* umem_mutex is locked on success */
+		err = rxe_odp_do_pagefault_and_lock(mr, iova, sizeof(char), 0);
+		if (err < 0)
+			return err;
+
+		/*
+		 * The spinlock is always locked under mutex_lock except
+		 * for MR initialization. No worry about deadlock.
+		 */
+		spin_lock(&mr->page_list.xa_lock);
+		mutex_unlock(&umem_odp->umem_mutex);
+	}
+
+	err = rxe_mr_do_atomic_op(mr, iova, opcode, compare,
+				  swap_add, orig_val);
+
+	spin_unlock(&mr->page_list.xa_lock);
+
+	return err;
+}
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 9159f1bdfc6f..af3e669679a0 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -693,7 +693,10 @@ static enum resp_states atomic_reply(struct rxe_qp *qp,
 		u64 iova = qp->resp.va + qp->resp.offset;

 		if (mr->umem->is_odp)
-			err = RESPST_ERR_UNSUPPORTED_OPCODE;
+			err = rxe_odp_mr_atomic_op(mr, iova, pkt->opcode,
+						   atmeth_comp(pkt),
+						   atmeth_swap_add(pkt),
+						   &res->atomic.orig_val);
 		else
 			err = rxe_mr_do_atomic_op(mr, iova, pkt->opcode,
 						  atmeth_comp(pkt),
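
Illustrative note, not part of the patch: the five steps in the commit
message (check the page, run the operation if it is present, otherwise
fault it in, update the mapping, then run the operation) can be modelled
in user space roughly as below. This is only a sketch under stated
assumptions. Every sketch_* name is hypothetical and does not exist in
rxe, a NULL pointer stands in for a page that is not yet mapped, and the
xa_lock/umem_mutex handling done by rxe_odp_mr_atomic_op() is left out
entirely. The operation is assumed to act on one 64-bit word and to
report the value that was there before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; none of these names exist in rxe. */
enum sketch_op { SKETCH_CMP_SWAP, SKETCH_FETCH_ADD };

#define SKETCH_WORDS 8

struct sketch_mr {
	uint64_t *page;                 /* NULL models "page not mapped" */
	uint64_t backing[SKETCH_WORDS]; /* storage the fake fault maps in */
	int faults;                     /* times the fault path has run */
};

/* Step 1: is the target page present? (Write permission is not modelled.) */
static bool sketch_page_present(const struct sketch_mr *mr)
{
	return mr->page != NULL;
}

/* Steps 3 and 4: model the page fault and the mapping update. */
static int sketch_pagefault(struct sketch_mr *mr)
{
	mr->page = mr->backing;
	mr->faults++;
	return 0;
}

/*
 * Steps 2 and 5: run the 64-bit operation and hand back the value that
 * was stored before it. Returns 0 on success, negative on error.
 */
static int sketch_atomic_op(struct sketch_mr *mr, size_t idx,
			    enum sketch_op op, uint64_t compare,
			    uint64_t swap_add, uint64_t *orig_val)
{
	uint64_t *va;

	assert(orig_val != NULL);
	if (idx >= SKETCH_WORDS)
		return -1;

	if (!sketch_page_present(mr)) {
		int err = sketch_pagefault(mr);

		if (err < 0)
			return err;
	}

	va = &mr->page[idx];
	*orig_val = *va;

	if (op == SKETCH_CMP_SWAP) {
		/* Store swap_add only when the current value matches. */
		if (*va == compare)
			*va = swap_add;
	} else {
		/* Fetch and add: unconditional 64-bit addition. */
		*va += swap_add;
	}

	return 0;
}
```

With a zero-initialized struct sketch_mr, the first call takes the fault
path once and later calls go straight to the operation. A compare and
swap whose compare value does not match leaves the word unchanged but
still returns the original value through orig_val.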