From patchwork Tue Sep 19 14:42:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 141935 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp3447661vqi; Tue, 19 Sep 2023 07:52:59 -0700 (PDT) X-Google-Smtp-Source: AGHT+IG+C23/zWT2RTFrujbkZ8SmZT5dcUVzOOcu+cBHHceXLfjfz8kO4iDnB/QYDFhkIGVJD285 X-Received: by 2002:a17:90a:9f93:b0:274:7db1:f50f with SMTP id o19-20020a17090a9f9300b002747db1f50fmr3575180pjp.15.1695135179340; Tue, 19 Sep 2023 07:52:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695135179; cv=none; d=google.com; s=arc-20160816; b=i76ZW+620nxvcmInK+6ElGrDtLMn8u3YYGoGVHLL4gneLrqZu635z0rHUkumx3TPBZ jkJdWRW+GQETmr9ghBaYCAMIs3g6ub2Qd+cwTMgaciz/52EU3AhjogRmqRSFFZlS7cBo U705U1TLCi9MVCgqPOe/xA95jQwp0M6PUCW7waqM0KVmViOFibrsOHqoeMW7+2sgoLVE enxaM2M/+gNEobdjh9lIsCrpfa8pKpF/nesGaqJgJ7xzjNbKqoqv5N4ZMz/jEJ4s2CtV I6fZZP9+C66GYtle6aSFsHCHdYdvA2BqpmTSK/hMDGXezsPuFWX9SzrmIhj557v1t57N ZKFQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:references:in-reply-to:message-id:date:subject :cc:to:from; bh=FSf6NCZjvjXKvrH40pKOuJ/UrT6ORR+B+kehPXCpuRM=; fh=2Ks5OhTjwS8gZt4F+VGxBODwUEQ0u2wJBF3lQY5ZqAM=; b=wvotLxW6AFZC5XtXOV4UZVXyBA+phb1M3zUgp71pE1/ShT5gW42uze0TrdGyj8E5Vt SEywW1DbL0J154hc3nALdjKIWSnSuGt4LjjEH4oau60/8/19gMZ/PMN6y12MBwt+0Qlk 27yTSqS1RkYa49zXt9fg2KyWD5aik2CX1tn98GfFLNJjTiCHusJuYTNcdOkmqDWge0Wa Mqy30C9K4ecwOAFcVZk3tnTRSVml2edoWZOzHJzd+xoyKPxaSYaoFmV5FS0DbXlzG6KI IjPCvipVgOz4pOlsho34/Vj9ahbbX/an32nFVL7QJi3zZaByemP624GJpm0Y25ImVZiy V2qQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from howler.vger.email (howler.vger.email. [2620:137:e000::3:4]) by mx.google.com with ESMTPS id s4-20020a17090a948400b00269584b6a10si3243661pjo.15.2023.09.19.07.52.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 19 Sep 2023 07:52:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) client-ip=2620:137:e000::3:4; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by howler.vger.email (Postfix) with ESMTP id 5F43D80ECFBB; Tue, 19 Sep 2023 07:44:10 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233014AbjISOn5 (ORCPT + 26 others); Tue, 19 Sep 2023 10:43:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39872 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232912AbjISOnd (ORCPT ); Tue, 19 Sep 2023 10:43:33 -0400 Received: from out30-133.freemail.mail.aliyun.com (out30-133.freemail.mail.aliyun.com [115.124.30.133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4596E49; Tue, 19 Sep 2023 07:43:12 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R181e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046060;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=13;SR=0;TI=SMTPD_---0VsRxWFd_1695134587; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0VsRxWFd_1695134587) by smtp.aliyun-inc.com; Tue, 19 Sep 2023 22:43:08 +0800 From: Wen Gu To: kgraul@linux.ibm.com, wenjia@linux.ibm.com, jaka@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 16/18] net/smc: avoid data copy from sndbuf to peer RMB in SMC-D Date: Tue, 19 Sep 2023 22:42:00 +0800 Message-Id: <1695134522-126655-17-git-send-email-guwen@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1695134522-126655-1-git-send-email-guwen@linux.alibaba.com> References: <1695134522-126655-1-git-send-email-guwen@linux.alibaba.com> X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_BLOCKED,RCVD_IN_MSPIKE_H2, SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Tue, 19 Sep 2023 07:44:10 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777478066178981089 X-GMAIL-MSGID: 1777478066178981089 This patch aims to avoid data copy from local sndbuf to peer RMB by mapping local sndbuf to peer RMB when DMBs have ISM_ATTR_DMB_MAP attribute. After this, local sndbuf and peer RMB share the same physical memory. +----------+ +----------+ | socket A | | socket B | +----------+ +----------+ | ^ | +---------+ | regard as | | regard as local sndbuf | B's | local RMB | | RMB | | |-------> | |-----------| +---------+ 1. From the perspective of RMB: a. Created or reused when connection is created. b. Unused and recycled to lgr buffer pool when connection is freed. c. Freed when link group is freed. 2. From the perspective of sndbuf: a. Mapped to peer RMB by the rtoken exchanged through CLC message. Then accessing local sndbuf is equivalent to accessing peer RMB. c. Unmapped from peer RMB and freed when connection is freed. Won't be recycled to lgr buffer pool. Therefore, the data written to local sndbuf will directly reach peer RMB. Signed-off-by: Wen Gu --- net/smc/af_smc.c | 14 +++++++++++ net/smc/smc_core.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++- net/smc/smc_core.h | 1 + 3 files changed, 84 insertions(+), 1 deletion(-) diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index bc4300e..fd0b91f 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -1436,6 +1436,12 @@ static int smc_connect_ism(struct smc_sock *smc, } smc_conn_save_peer_info(smc, aclc); + + if (smc_ism_dmb_mappable(smc->conn.lgr->smcd)) { + rc = smcd_buf_attach(smc); + if (rc) + goto connect_abort; + } smc_close_init(smc); smc_rx_init(smc); smc_tx_init(smc); @@ -2537,6 +2543,14 @@ static void smc_listen_work(struct work_struct *work) mutex_unlock(&smc_server_lgr_pending); } smc_conn_save_peer_info(new_smc, cclc); + + if (ini->is_smcd && + smc_ism_dmb_mappable(new_smc->conn.lgr->smcd)) { + rc = smcd_buf_attach(new_smc); + if (rc) + goto out_decl; + } + smc_listen_out_connected(new_smc); SMC_STAT_SERV_SUCC_INC(sock_net(newclcsock->sk), ini); goto out_free; diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index c36500a..bae2116 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -1153,6 +1153,20 @@ static void smcr_buf_unuse(struct smc_buf_desc *buf_desc, bool is_rmb, } } +static void smcd_buf_detach(struct smc_connection *conn) +{ + struct smcd_dev *smcd = conn->lgr->smcd; + u64 peer_token = conn->peer_token; + + if (!conn->sndbuf_desc) + return; + + smc_ism_detach_dmb(smcd, peer_token); + + kfree(conn->sndbuf_desc); + conn->sndbuf_desc = NULL; +} + static void smc_buf_unuse(struct smc_connection *conn, struct smc_link_group *lgr) { @@ -1197,6 +1211,10 @@ void smc_conn_free(struct smc_connection *conn) if (!list_empty(&lgr->list)) smc_ism_unset_conn(conn); tasklet_kill(&conn->rx_tsklet); + + /* detach sndbuf from peer RMB */ + if (smc_ism_dmb_mappable(lgr->smcd)) + smcd_buf_detach(conn); } else { smc_cdc_wait_pend_tx_wr(conn); if (current_work() != &conn->abort_work) @@ -2458,15 +2476,23 @@ void smc_rmb_sync_sg_for_cpu(struct smc_connection *conn) */ int smc_buf_create(struct smc_sock *smc, bool is_smcd) { + bool sndbuf_created = false; int rc; + if (is_smcd && + smc_ism_dmb_mappable(smc->conn.lgr->smcd)) + goto create_rmb; + /* create send buffer */ rc = __smc_buf_create(smc, is_smcd, false); if (rc) return rc; + sndbuf_created = true; + +create_rmb: /* create rmb */ rc = __smc_buf_create(smc, is_smcd, true); - if (rc) { + if (rc && sndbuf_created) { down_write(&smc->conn.lgr->sndbufs_lock); list_del(&smc->conn.sndbuf_desc->list); up_write(&smc->conn.lgr->sndbufs_lock); @@ -2476,6 +2502,48 @@ int smc_buf_create(struct smc_sock *smc, bool is_smcd) return rc; } +int smcd_buf_attach(struct smc_sock *smc) +{ + struct smc_connection *conn = &smc->conn; + struct smcd_dev *smcd = conn->lgr->smcd; + u64 peer_token = conn->peer_token; + struct smc_buf_desc *buf_desc; + int rc; + + buf_desc = kzalloc(sizeof(*buf_desc), GFP_KERNEL); + if (!buf_desc) + return -ENOMEM; + + /* map local sndbuf desc to peer RMB, so operations on local + * sndbuf are equivalent to operations on peer RMB. + */ + rc = smc_ism_attach_dmb(smcd, peer_token, buf_desc); + if (rc) { + rc = SMC_CLC_DECL_MEM; + goto free; + } + + smc->sk.sk_sndbuf = buf_desc->len; + buf_desc->cpu_addr = (u8 *)buf_desc->cpu_addr + sizeof(struct smcd_cdc_msg); + buf_desc->len -= sizeof(struct smcd_cdc_msg); + conn->sndbuf_desc = buf_desc; + conn->sndbuf_desc->used = 1; + atomic_set(&conn->sndbuf_space, conn->sndbuf_desc->len); + return 0; + +free: + if (conn->rmb_desc) { + /* free local RMB as well */ + down_write(&conn->lgr->rmbs_lock); + list_del(&conn->rmb_desc->list); + up_write(&conn->lgr->rmbs_lock); + smc_buf_free(conn->lgr, true, conn->rmb_desc); + conn->rmb_desc = NULL; + } + kfree(buf_desc); + return rc; +} + static inline int smc_rmb_reserve_rtoken_idx(struct smc_link_group *lgr) { int i; diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h index d57eb9b..2cba119 100644 --- a/net/smc/smc_core.h +++ b/net/smc/smc_core.h @@ -551,6 +551,7 @@ void smc_smcd_terminate(struct smcd_dev *dev, struct smcd_gid *peer_gid, void smc_smcd_terminate_all(struct smcd_dev *dev); void smc_smcr_terminate_all(struct smc_ib_device *smcibdev); int smc_buf_create(struct smc_sock *smc, bool is_smcd); +int smcd_buf_attach(struct smc_sock *smc); int smc_uncompress_bufsize(u8 compressed); int smc_rmb_rtoken_handling(struct smc_connection *conn, struct smc_link *link, struct smc_clc_msg_accept_confirm *clc);