From patchwork Fri Mar 31 16:08:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77834 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp676557vqo; Fri, 31 Mar 2023 09:12:39 -0700 (PDT) X-Google-Smtp-Source: AKy350bskidqCC+uoUWE+ZDXBfc5SjL5kr3yQWmkG3nvFAzviqiAiWj8zQFweu3AtHIUU4EERDGZ X-Received: by 2002:aa7:c948:0:b0:4fe:94a2:81be with SMTP id h8-20020aa7c948000000b004fe94a281bemr24256330edt.7.1680279159735; Fri, 31 Mar 2023 09:12:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279159; cv=none; d=google.com; s=arc-20160816; b=rnxOmBsOgSudmybEdfcDq/OYGnhXGhCTlldi6mpTf4KieKMqyuvfovAdZt8VMtMb+x xDPos88Y74MEMPbD1IsqHPPc6FGOLIccQC3rj/dxinVeYDc+U6weCmsIcmKioyV83fvV q9bIgCxiFwO1myiPasID5dfvHHFRVZZ/83NshO8BaKBNTszGGxbYESrUfS5eCA3uDfCK mrUVJgccJ+hvV38B6CBjkLM5uZRouKA71lUHTEh9eEmFNFz0QoZThQdWSxI86ImEKplw 77f2+/GYtU/dLoItYPUxf0yvRizz57KAA6lMoXpKYF5R727fhIeEP41zVeDl/J9HISIS pdew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=fsOLHSRSLMInNcD449MTPXsGwp1kWYvONeYqTpf/Ez4=; b=pPzAWMD7fwm6LPVUknVuqNoMFCUgvWowurP4OsfSlZDFZRu29ty78oEd/aVDJ7sYi3 peqKD4lMV3SlVvscdtVvBGGWS6rl/wzRvMTx5c1nkerAv9s2CdoF/ZI98vT8VtnYCwyY +q17PgS3HR8mR5t18XBSE5Olp6pfa0hWKapZURbGdFtW+dThoDm0uJr7sUU6/FcJdYo/ sSTdYy1w5hHfTRB7HLY4vmkWNSvr+xaLt5oNh5jXWiWY1UEY7fNQS+J8Z5lNb2MbWywe cGojs7sti64i4U1vlA9gfVddyS+KNsSZVWxNzwGafAwvSpAt1hYaK4WQQRNNPwu544aU EmKA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=e+orkJzF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id w12-20020a50fa8c000000b0050245e7f729si2479353edr.615.2023.03.31.09.12.14; Fri, 31 Mar 2023 09:12:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=e+orkJzF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230183AbjCaQKe (ORCPT + 99 others); Fri, 31 Mar 2023 12:10:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60628 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231915AbjCaQKU (ORCPT ); Fri, 31 Mar 2023 12:10:20 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 409121FD0F for ; Fri, 31 Mar 2023 09:09:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278971; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fsOLHSRSLMInNcD449MTPXsGwp1kWYvONeYqTpf/Ez4=; b=e+orkJzFD76gmRpxn+WfSAxlz7D1UIfey4LCbc+UnzrKaZI7sAkekmGuhKPriGs/8aBWjn GNXiES0l004Z7wkuBP3/SwMcOTRzI7JipycWosaEmAocG1Jz+hFSK9494ZhhKc6sJ0zrf3 6nCJfl0IhdZud6zHT4MSW+BqXL9nAIg= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-613-JBunTOtONMCZ2eDMDHE0OA-1; Fri, 31 Mar 2023 12:09:25 -0400 X-MC-Unique: JBunTOtONMCZ2eDMDHE0OA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 56800855425; Fri, 31 Mar 2023 16:09:24 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id CF52D4042AC0; Fri, 31 Mar 2023 16:09:21 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Steve French , Shyam Prasad N , Rohith Surabattula , linux-cachefs@redhat.com, linux-cifs@vger.kernel.org Subject: [PATCH v3 01/55] netfs: Fix netfs_extract_iter_to_sg() for ITER_UBUF/IOVEC Date: Fri, 31 Mar 2023 17:08:20 +0100 Message-Id: <20230331160914.1608208-2-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761900399976586820?= X-GMAIL-MSGID: =?utf-8?q?1761900399976586820?= Fix netfs_extract_iter_to_sg() for ITER_UBUF and ITER_IOVEC to set the size of the page to the part of the page extracted, not the remaining amount of data in the extracted page array at that point. This doesn't yet affect anything as cifs, the only current user, only passes in non-user-backed iterators. Fixes: 018584697533 ("netfs: Add a function to extract an iterator into a scatterlist") Signed-off-by: David Howells cc: Jeff Layton cc: Steve French cc: Shyam Prasad N cc: Rohith Surabattula cc: linux-cachefs@redhat.com cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Reviewed-by: Jeff Layton --- fs/netfs/iterator.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c index e9a45dea748a..8a4c86687429 100644 --- a/fs/netfs/iterator.c +++ b/fs/netfs/iterator.c @@ -139,7 +139,7 @@ static ssize_t netfs_extract_user_to_sg(struct iov_iter *iter, size_t seg = min_t(size_t, PAGE_SIZE - off, len); *pages++ = NULL; - sg_set_page(sg, page, len, off); + sg_set_page(sg, page, seg, off); sgtable->nents++; sg++; len -= seg; From patchwork Fri Mar 31 16:08:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77866 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp691903vqo; Fri, 31 Mar 2023 09:35:45 -0700 (PDT) X-Google-Smtp-Source: AKy350bE5L5eim0K1pdU3LPy2DiDZtO44tz3BJ51JSsh2eGM5P5NR5OPt2Dk78gi3Q/bsO8KJzcH X-Received: by 2002:a17:906:3149:b0:8ab:b03d:a34f with SMTP id e9-20020a170906314900b008abb03da34fmr9696136eje.12.1680280545441; Fri, 31 Mar 2023 09:35:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280545; cv=none; d=google.com; s=arc-20160816; b=j2DjYM4WpBK3g29zEtyi4+aa7hJow3ScgAF1w7JwZFX+ZGUzQfjA+Y4bJvcC/v5ReZ Wru1CTLU7wH90MRJJDjT8x8hKqo/Et2WV3LjKXOLLBfznrCqTRV9zlF1zwE8OFwEJdQm DvEulu8wtSmOvzoWrMNHzALPvtKHqYC29YrmjvNAluK4kG45inYeMDJcUXXQjYcwhA8T FFywSOeI4DBEFigtTGlKACMnbmUyqFMeVX5gEQIUekjn8keFo82mX0sWHmTTSzbmO6bm rd0UGvLf89zq3bD8+3vcH1XkctIpnXS7q0mhBKCbTqKXYS24LBvFQlBqunkWMsVkLAaK JbEQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=s7FY6IkB7NRnaaunO8kQ3wM8dgoy+C5dVBJnz6ApgXk=; b=zrZzjb5VREyqWn02b0cL+mgShhsCi4r35S/utyixAID4nFjwHnnohRyBeNeBg+v5Na Wj0BLNb+ukaMjWH9Ot7fBJHxO6gvffaYt1ykn6WKf++c+NpS5FwwmVsGUnNnYakpHize 33rYN6r2Q3ts6KpGFki7kZvpardnjAeGEU0SmhjdgPRClJqsjSur/W/G/db8e35CeZJz qS3XI3moP7sSrXyYDPynILAwMyu7fCyai/mzVtKXMc2qQjuh9cgL6cfswZdT8r1gmDic q+jlPXA+cGHx/LMNTr7ke2KuldWO+h9Pi0kTgJDeG39jmmYWctds/DPC2+HbIlOIyrgs FZBQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BmHl6c3y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x20-20020aa7d394000000b004d49f845574si2385163edq.223.2023.03.31.09.35.21; Fri, 31 Mar 2023 09:35:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BmHl6c3y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232792AbjCaQKk (ORCPT + 99 others); Fri, 31 Mar 2023 12:10:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60554 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232267AbjCaQKW (ORCPT ); Fri, 31 Mar 2023 12:10:22 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 878C920C3E for ; Fri, 31 Mar 2023 09:09:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278971; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s7FY6IkB7NRnaaunO8kQ3wM8dgoy+C5dVBJnz6ApgXk=; b=BmHl6c3yHz2nqdMfBY4tlSLuwP9JAUQzzXaAVgfezYAWe7RG3UNG717rcDynLaltMcTi+o 9ad9mokSWfMeL3PUDtwW6ii4cAFpdQ2ziMV1QstPl7FjuEnAQZlA3+K3OsJZM8JR8Cm0RO UpY8y9+A8+bEFdBrKm8kjhgj5uuag3g= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-364-gA1T6ksvPKuG1wGWcmZ53Q-1; Fri, 31 Mar 2023 12:09:28 -0400 X-MC-Unique: gA1T6ksvPKuG1wGWcmZ53Q-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 12E9D1C05159; Fri, 31 Mar 2023 16:09:27 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 15A39140E94F; Fri, 31 Mar 2023 16:09:24 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-nfs@vger.kernel.org Subject: [PATCH v3 02/55] iov_iter: Remove last_offset member Date: Fri, 31 Mar 2023 17:08:21 +0100 Message-Id: <20230331160914.1608208-3-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901853300283447?= X-GMAIL-MSGID: =?utf-8?q?1761901853300283447?= With the removal of ITER_PIPE, the last_offset member of struct iov_iter is no longer used, so remove it and un-unionise the remaining member. Signed-off-by: David Howells cc: Jens Axboe cc: Matthew Wilcox cc: Alexander Viro cc: Jeff Layton cc: linux-nfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org cc: netdev@vger.kernel.org Reviewed-by: Jeff Layton --- include/linux/uio.h | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/include/linux/uio.h b/include/linux/uio.h index 74598426edb4..2d8a70cb9b26 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -43,10 +43,7 @@ struct iov_iter { bool nofault; bool data_source; bool user_backed; - union { - size_t iov_offset; - int last_offset; - }; + size_t iov_offset; size_t count; union { const struct iovec *iov; From patchwork Fri Mar 31 16:08:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77879 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp693040vqo; Fri, 31 Mar 2023 09:37:39 -0700 (PDT) X-Google-Smtp-Source: AKy350bg2ynWWC+f5wi38+QNoTvnxeDYkwTWhyObVcLNOowZP+lPIE0FS/pZXM7qeTuohKyW1LqK X-Received: by 2002:a05:6402:50a:b0:4ab:d1f4:4b88 with SMTP id m10-20020a056402050a00b004abd1f44b88mr28245935edv.41.1680280659549; Fri, 31 Mar 2023 09:37:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280659; cv=none; d=google.com; s=arc-20160816; b=bd3IYOO/TXX+5OpCABTS13ZvqEar42K6sH75O5EQjNmE+xre9i9mEyysc8gji34KVz VjudBgnnikJsIqtd4fxHAosli0+xTo5OhGgrVBKZA6xKVfG/5D96Tp6nslYVltrlLiDJ 69YiynGvx8stW+Wf/zBELKFl5ntRLi6ctVwT7UgOctSLD3PxABqeDETlfnuEOVIpn0ed BoNhFgTiqMRiRvrTsDxLmuxFpNcbfGIUu91jmsJ36OXvW/r2Sovs0Q1T3x2nd7E3x0qn 5/YbyjQwBPGsFb2euaF9i7usnf9ToVmB3kQqz3eU9XWhQKa7k8CVuT2Ps6OPLqVTC0A9 dNQw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=2Rqs8gKWnfX7IQPFFFiE2mTLNitfAQsbWxPFIoMuGvc=; b=Jy613X0Ivf2xmPBiFqErRlOvOXck4da5TB1kghXPYvjhwKb6WPAMUGPoWqbjRCNKBd B8xO0ebe79k/8Lq5EMVa6SxeKINHovMco3ujJonqcLiv8JC0mkf6TgFIGMbPe75FWVi4 NeAqGE34fbe+xMD0ZpdZPvDwzb7el3wxsblRYhzBojdjJ7ah7bASNmsMQtWCZJ4vYLfO I/s5Z5u4Yt22IaJj13RulRcZMi/gEv+84LneSiiHApU5Z9cA39S2Wzvv3KUNMhYYSueS WJJf1vDDfolC8Mft+j5F1D2nRcmjKQ/96K8p245YI2uwE/mKp0xXdMcCqtFRKZ+Tadk2 cyfQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=MxqfgNLx; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r16-20020aa7cb90000000b004fc66bd07dbsi2349963edt.369.2023.03.31.09.37.15; Fri, 31 Mar 2023 09:37:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=MxqfgNLx; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232802AbjCaQKx (ORCPT + 99 others); Fri, 31 Mar 2023 12:10:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60680 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232323AbjCaQKW (ORCPT ); Fri, 31 Mar 2023 12:10:22 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4F351FD39 for ; Fri, 31 Mar 2023 09:09:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278974; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2Rqs8gKWnfX7IQPFFFiE2mTLNitfAQsbWxPFIoMuGvc=; b=MxqfgNLxMm9gKK/+p6p1teoAPP1+kLymEw+J+4pGffaKoiUuFxtSudl5M9zGcbhoV+uKND XZwehZ76Q4NK9a3izW33ygjzXw8/V5ODcbYo+FTLUxaDb4cszBbJCNe2ElcvGjJyhkn1QF Du9lPDZ1mLBrm2LQWaFgSKsT2BplOmQ= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-543-0AgUVjoEOGGNrIUOLBBj-A-1; Fri, 31 Mar 2023 12:09:30 -0400 X-MC-Unique: 0AgUVjoEOGGNrIUOLBBj-A-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A4A9029ABA19; Fri, 31 Mar 2023 16:09:29 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id A92CCC15BB8; Fri, 31 Mar 2023 16:09:27 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Willem de Bruijn Subject: [PATCH v3 03/55] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag Date: Fri, 31 Mar 2023 17:08:22 +0100 Message-Id: <20230331160914.1608208-4-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901972916011865?= X-GMAIL-MSGID: =?utf-8?q?1761901972916011865?= Declare MSG_SPLICE_PAGES, an internal sendmsg() flag, that hints to a network protocol that it should splice pages from the source iterator rather than copying the data if it can. This flag is added to a list that is cleared by sendmsg and recvmsg syscalls on entry. This is intended as a replacement for the ->sendpage() op, allowing a way to splice in several multipage folios in one go. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org Reviewed-by: Willem de Bruijn --- include/linux/socket.h | 3 +++ net/socket.c | 2 ++ 2 files changed, 5 insertions(+) diff --git a/include/linux/socket.h b/include/linux/socket.h index 13c3a237b9c9..bd1cc3238851 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -327,6 +327,7 @@ struct ucred { */ #define MSG_ZEROCOPY 0x4000000 /* Use user data in kernel path */ +#define MSG_SPLICE_PAGES 0x8000000 /* Splice the pages from the iterator in sendmsg() */ #define MSG_FASTOPEN 0x20000000 /* Send data in TCP SYN */ #define MSG_CMSG_CLOEXEC 0x40000000 /* Set close_on_exec for file descriptor received through @@ -337,6 +338,8 @@ struct ucred { #define MSG_CMSG_COMPAT 0 /* We never have 32 bit fixups */ #endif +/* Flags to be cleared on entry by sendmsg and sendmmsg syscalls */ +#define MSG_INTERNAL_SENDMSG_FLAGS (MSG_SPLICE_PAGES) /* Setsockoptions(2) level. Thanks to BSD these must match IPPROTO_xxx */ #define SOL_IP 0 diff --git a/net/socket.c b/net/socket.c index 6bae8ce7059e..0c39ce57d603 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2139,6 +2139,7 @@ int __sys_sendto(int fd, void __user *buff, size_t len, unsigned int flags, msg.msg_name = (struct sockaddr *)&address; msg.msg_namelen = addr_len; } + flags &= ~MSG_INTERNAL_SENDMSG_FLAGS; if (sock->file->f_flags & O_NONBLOCK) flags |= MSG_DONTWAIT; msg.msg_flags = flags; @@ -2486,6 +2487,7 @@ static int ____sys_sendmsg(struct socket *sock, struct msghdr *msg_sys, } msg_sys->msg_flags = flags; + flags &= ~MSG_INTERNAL_SENDMSG_FLAGS; if (sock->file->f_flags & O_NONBLOCK) msg_sys->msg_flags |= MSG_DONTWAIT; /* From patchwork Fri Mar 31 16:08:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77835 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp676562vqo; Fri, 31 Mar 2023 09:12:40 -0700 (PDT) X-Google-Smtp-Source: AKy350azVbaRTZ1DQAxyqhSE7A0Y57MaUIY8wpW9KkCGhge2L8B/d+nBpYhAgXNwdI9cg8gwKTbI X-Received: by 2002:a17:902:7c11:b0:19d:181f:511 with SMTP id x17-20020a1709027c1100b0019d181f0511mr21184524pll.30.1680279159840; Fri, 31 Mar 2023 09:12:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279159; cv=none; d=google.com; s=arc-20160816; b=HGvu0skDbqsBEwTWhE/UMPyugFtBrL2V8Rj3ph1ShDiKJYxydDcKEAUvTSe2cwKKIW pF4nrI8+E+g76BnCJ+zIBcnYNgLaaw5f7U99DHmUi7EL1rjoReJxuWWwxC2jChaOXWQX U7JfSB63Ndr1wNEivx2Mk96aXfhbVh4DbZDPN93N2/hhrE4NCihrxOoiHi3jZgRR4HTQ KXf6tM+wykj4zirjbVkBmTxyZiy9A6y7btVP0DneMPcWiXpVpUDf4boZFhkY5vEi9iDM 89LT03SWzwxr1rqwuvxZWb0VvEdAiKdA83XPywacaPeQl3XI+zRKv23v3gnnBVGtQ0IT uDKg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=zHarMfhYqdBKP5ReoH3//xqNxFuro7P2+Q66shvU368=; b=bPpSZtiGcCWTqCTZabg33eHCXsXWEaBVlyj81tkLzuyA5Y1Yx9RlMQm5O3zhT9Qxzd UDWM6RaeJYbtUMv64ROOsPfJpKJigOo85ohESABP0Q/v8w7X5zOEdqrUL/nr8xNLgZ7k HISFiCDaK3yE8yuagPOeVCFGwcWg1OKbcjHcoVBVmakI35dRvdr6AOVRWnMo+KbQc1kp jpFmHhENAfSHeu+2FIJ0i8G5+q9vEwODESkY9XYZi9yzZPEPiDoCSuG3Ft2vCAEvmdCB +xLWiWjuQmNGhx8cgWlZ2Zuz8r9E5+BIXzSusGhI1byVTPgPCZ9AgYWBhvt1ADZo9vNv o2wA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=DxaEmxzy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id d17-20020a170903231100b00192721d6a97si2563886plh.499.2023.03.31.09.12.26; Fri, 31 Mar 2023 09:12:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=DxaEmxzy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232341AbjCaQK4 (ORCPT + 99 others); Fri, 31 Mar 2023 12:10:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60690 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232346AbjCaQK2 (ORCPT ); Fri, 31 Mar 2023 12:10:28 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CEAC71EFE4 for ; Fri, 31 Mar 2023 09:09:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278977; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zHarMfhYqdBKP5ReoH3//xqNxFuro7P2+Q66shvU368=; b=DxaEmxzytM4LGdHygZWXz6P01Q54vwfctTDLFbuxwK7GmvC4K5d0KP6p/2leF44/4kNOYo 471Coc9df16lSjnip+5mmWEMWWct+x+16pBxeTmoCayXv12DJ9WG7IJdIMjmLAKam5TPKs 6TQv5el1ZlWpE8YvXhx1HwxOGw3t6T4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-577-hhxNPGCHPQCBERrmjvntkg-1; Fri, 31 Mar 2023 12:09:33 -0400 X-MC-Unique: hhxNPGCHPQCBERrmjvntkg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 98114185A7A2; Fri, 31 Mar 2023 16:09:32 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4D7882166B33; Fri, 31 Mar 2023 16:09:30 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Bernard Metzler , Tom Talpey , linux-rdma@vger.kernel.org Subject: [PATCH v3 04/55] mm: Move the page fragment allocator from page_alloc.c into its own file Date: Fri, 31 Mar 2023 17:08:23 +0100 Message-Id: <20230331160914.1608208-5-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761900400549721744?= X-GMAIL-MSGID: =?utf-8?q?1761900400549721744?= Move the page fragment allocator from page_alloc.c into its own file preparatory to changing it. Signed-off-by: David Howells cc: Bernard Metzler cc: Tom Talpey cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-rdma@vger.kernel.org cc: netdev@vger.kernel.org --- mm/Makefile | 2 +- mm/page_alloc.c | 126 ----------------------------------------- mm/page_frag_alloc.c | 131 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 132 insertions(+), 127 deletions(-) create mode 100644 mm/page_frag_alloc.c diff --git a/mm/Makefile b/mm/Makefile index 8e105e5b3e29..4e6dc12b4cbd 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -52,7 +52,7 @@ obj-y := filemap.o mempool.o oom_kill.o fadvise.o \ readahead.o swap.o truncate.o vmscan.o shmem.o \ util.o mmzone.o vmstat.o backing-dev.o \ mm_init.o percpu.o slab_common.o \ - compaction.o \ + compaction.o page_frag_alloc.o \ interval_tree.o list_lru.o workingset.o \ debug.o gup.o mmap_lock.o $(mmu-y) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index ac1fc986af44..c08847308907 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5694,132 +5694,6 @@ void free_pages(unsigned long addr, unsigned int order) EXPORT_SYMBOL(free_pages); -/* - * Page Fragment: - * An arbitrary-length arbitrary-offset area of memory which resides - * within a 0 or higher order page. Multiple fragments within that page - * are individually refcounted, in the page's reference counter. - * - * The page_frag functions below provide a simple allocation framework for - * page fragments. This is used by the network stack and network device - * drivers to provide a backing region of memory for use as either an - * sk_buff->head, or to be used in the "frags" portion of skb_shared_info. - */ -static struct page *__page_frag_cache_refill(struct page_frag_cache *nc, - gfp_t gfp_mask) -{ - struct page *page = NULL; - gfp_t gfp = gfp_mask; - -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | - __GFP_NOMEMALLOC; - page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, - PAGE_FRAG_CACHE_MAX_ORDER); - nc->size = page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE; -#endif - if (unlikely(!page)) - page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); - - nc->va = page ? page_address(page) : NULL; - - return page; -} - -void __page_frag_cache_drain(struct page *page, unsigned int count) -{ - VM_BUG_ON_PAGE(page_ref_count(page) == 0, page); - - if (page_ref_sub_and_test(page, count)) - free_the_page(page, compound_order(page)); -} -EXPORT_SYMBOL(__page_frag_cache_drain); - -void *page_frag_alloc_align(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask, - unsigned int align_mask) -{ - unsigned int size = PAGE_SIZE; - struct page *page; - int offset; - - if (unlikely(!nc->va)) { -refill: - page = __page_frag_cache_refill(nc, gfp_mask); - if (!page) - return NULL; - -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif - /* Even if we own the page, we do not use atomic_set(). - * This would break get_page_unless_zero() users. - */ - page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE); - - /* reset page count bias and offset to start of new frag */ - nc->pfmemalloc = page_is_pfmemalloc(page); - nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - nc->offset = size; - } - - offset = nc->offset - fragsz; - if (unlikely(offset < 0)) { - page = virt_to_page(nc->va); - - if (!page_ref_sub_and_test(page, nc->pagecnt_bias)) - goto refill; - - if (unlikely(nc->pfmemalloc)) { - free_the_page(page, compound_order(page)); - goto refill; - } - -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif - /* OK, page count is 0, we can safely set it */ - set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1); - - /* reset page count bias and offset to start of new frag */ - nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - offset = size - fragsz; - if (unlikely(offset < 0)) { - /* - * The caller is trying to allocate a fragment - * with fragsz > PAGE_SIZE but the cache isn't big - * enough to satisfy the request, this may - * happen in low memory conditions. - * We don't release the cache page because - * it could make memory pressure worse - * so we simply return NULL here. - */ - return NULL; - } - } - - nc->pagecnt_bias--; - offset &= align_mask; - nc->offset = offset; - - return nc->va + offset; -} -EXPORT_SYMBOL(page_frag_alloc_align); - -/* - * Frees a page fragment allocated out of either a compound or order 0 page. - */ -void page_frag_free(void *addr) -{ - struct page *page = virt_to_head_page(addr); - - if (unlikely(put_page_testzero(page))) - free_the_page(page, compound_order(page)); -} -EXPORT_SYMBOL(page_frag_free); - static void *make_alloc_exact(unsigned long addr, unsigned int order, size_t size) { diff --git a/mm/page_frag_alloc.c b/mm/page_frag_alloc.c new file mode 100644 index 000000000000..bee95824ef8f --- /dev/null +++ b/mm/page_frag_alloc.c @@ -0,0 +1,131 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Page fragment allocator + * + * Page Fragment: + * An arbitrary-length arbitrary-offset area of memory which resides within a + * 0 or higher order page. Multiple fragments within that page are + * individually refcounted, in the page's reference counter. + * + * The page_frag functions provide a simple allocation framework for page + * fragments. This is used by the network stack and network device drivers to + * provide a backing region of memory for use as either an sk_buff->head, or to + * be used in the "frags" portion of skb_shared_info. + */ + +#include +#include +#include + +static struct page *__page_frag_cache_refill(struct page_frag_cache *nc, + gfp_t gfp_mask) +{ + struct page *page = NULL; + gfp_t gfp = gfp_mask; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | + __GFP_NOMEMALLOC; + page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, + PAGE_FRAG_CACHE_MAX_ORDER); + nc->size = page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE; +#endif + if (unlikely(!page)) + page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); + + nc->va = page ? page_address(page) : NULL; + + return page; +} + +void __page_frag_cache_drain(struct page *page, unsigned int count) +{ + VM_BUG_ON_PAGE(page_ref_count(page) == 0, page); + + if (page_ref_sub_and_test(page, count - 1)) + __free_pages(page, compound_order(page)); +} +EXPORT_SYMBOL(__page_frag_cache_drain); + +void *page_frag_alloc_align(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask, + unsigned int align_mask) +{ + unsigned int size = PAGE_SIZE; + struct page *page; + int offset; + + if (unlikely(!nc->va)) { +refill: + page = __page_frag_cache_refill(nc, gfp_mask); + if (!page) + return NULL; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif + /* Even if we own the page, we do not use atomic_set(). + * This would break get_page_unless_zero() users. + */ + page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE); + + /* reset page count bias and offset to start of new frag */ + nc->pfmemalloc = page_is_pfmemalloc(page); + nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; + nc->offset = size; + } + + offset = nc->offset - fragsz; + if (unlikely(offset < 0)) { + page = virt_to_page(nc->va); + + if (page_ref_count(page) != nc->pagecnt_bias) + goto refill; + if (unlikely(nc->pfmemalloc)) { + page_ref_sub(page, nc->pagecnt_bias - 1); + __free_pages(page, compound_order(page)); + goto refill; + } + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif + /* OK, page count is 0, we can safely set it */ + set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1); + + /* reset page count bias and offset to start of new frag */ + nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; + offset = size - fragsz; + if (unlikely(offset < 0)) { + /* + * The caller is trying to allocate a fragment + * with fragsz > PAGE_SIZE but the cache isn't big + * enough to satisfy the request, this may + * happen in low memory conditions. + * We don't release the cache page because + * it could make memory pressure worse + * so we simply return NULL here. + */ + return NULL; + } + } + + nc->pagecnt_bias--; + offset &= align_mask; + nc->offset = offset; + + return nc->va + offset; +} +EXPORT_SYMBOL(page_frag_alloc_align); + +/* + * Frees a page fragment allocated out of either a compound or order 0 page. + */ +void page_frag_free(void *addr) +{ + struct page *page = virt_to_head_page(addr); + + __free_pages(page, compound_order(page)); +} +EXPORT_SYMBOL(page_frag_free); From patchwork Fri Mar 31 16:08:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77843 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp684794vqo; Fri, 31 Mar 2023 09:25:07 -0700 (PDT) X-Google-Smtp-Source: AKy350ZTl499VuSwcmZUL9VTU6DgY+O+QJqm+AL2zUemBtCU75wusTtzCAiaPcQggv4tT5kY0V7n X-Received: by 2002:a17:906:cecf:b0:932:2412:8abf with SMTP id si15-20020a170906cecf00b0093224128abfmr29496594ejb.62.1680279907400; Fri, 31 Mar 2023 09:25:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279907; cv=none; d=google.com; s=arc-20160816; b=0skJfTB2pEvjI/x23NXVyWxWbHJGGBy9gupCT/Bz7ep6cStmjvJ32G07I40uCLQ+EC UxJ6OHrW3nv+EcRioLpwwd37PA/Z5+ddtMXrmUNtVBLq6L2h2kyEvrqSZnRx6nFJiCQL Tl3WT9HlzIfgP/f+f2VeLcyt+R8lzQQsfmjJ5XJvGdxxQN600cx/oqRl4B/EXxsfg40d 4R6XcUihQVP0NT4XXwHFmA3CvjtUnjjONI0hbklfl4jqXTyiNb//6zDSE10AB/vEyQQV MP45i8vSJ5g4tGeDmrmO3zB4Zc27wBogTT+mRKtsicMB5U5twkV/Tv62Ry7K5dcstOrM 60tQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=w9vYmidIh5WhninpbPh1DpHK4Q4l9z+4/ehUwmRVCvg=; b=IQCmbrwfIf4+LxUpaKNpB7ifxdC3nHHMOtp73RrgOmE2gXx129dXREvmdDXtxoL5Qj QBt5k/mCz2xLxIydSt+6uJya4eAONOHz1qE8Am/UEYEOf36JS5SFadAtXhkf9RY+mwGi uzAgYl31l6R5Zvz528+EzGLPSvkU+z6ZyncQ7K8NyVA0wtIGV+MbQr7L0wU5Mq8fVUk8 t1qRQmnjxoQksznmUdORbdGCXH75uM48PiHc1G3Xvq2sFq1j+2+igfIdIjjjAYB6+R9T /kaMAzf23QMiGD1O3g/g5n1M0s9K+5Ql1UUKqXPS+QVRSdXVK+8PyMSq/dUiKLgIGKKz DpTw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=eJhfSu9o; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id n11-20020a170906724b00b00930e86db8fdsi2376003ejk.938.2023.03.31.09.24.41; Fri, 31 Mar 2023 09:25:07 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=eJhfSu9o; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231253AbjCaQLU (ORCPT + 99 others); Fri, 31 Mar 2023 12:11:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60600 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230107AbjCaQK7 (ORCPT ); Fri, 31 Mar 2023 12:10:59 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6CDE720C38 for ; Fri, 31 Mar 2023 09:09:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278979; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w9vYmidIh5WhninpbPh1DpHK4Q4l9z+4/ehUwmRVCvg=; b=eJhfSu9ok8DNRp0umqUN/oVMHSAbH6+4V8T5oMiMAAxoRk2UhG1BdAiP9gDIuJ8haePSzs FYKNjbuAJ7UVbnCNH+vIarh23cgAo8X1NDGs39HeMlmspzNsB87Yqd6yvuNRHEuiNP9tsr HMf9VciNoHYe2oBR4edjjX7RSvrM9E4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-645-elJaY9qoPaOnZnoS3texTg-1; Fri, 31 Mar 2023 12:09:36 -0400 X-MC-Unique: elJaY9qoPaOnZnoS3texTg-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1E2D6185A7A2; Fri, 31 Mar 2023 16:09:35 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 39DD614171B6; Fri, 31 Mar 2023 16:09:33 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 05/55] mm: Make the page_frag_cache allocator use multipage folios Date: Fri, 31 Mar 2023 17:08:24 +0100 Message-Id: <20230331160914.1608208-6-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901184544795991?= X-GMAIL-MSGID: =?utf-8?q?1761901184544795991?= Change the page_frag_cache allocator to use multipage folios rather than groups of pages. This reduces page_frag_free to just a folio_put() or put_page(). Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/linux/mm_types.h | 13 ++---- mm/page_frag_alloc.c | 88 +++++++++++++++++++--------------------- 2 files changed, 45 insertions(+), 56 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 0722859c3647..49a70b3f44a9 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -420,18 +420,13 @@ static inline void *folio_get_private(struct folio *folio) } struct page_frag_cache { - void * va; -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - __u16 offset; - __u16 size; -#else - __u32 offset; -#endif + struct folio *folio; + unsigned int offset; /* we maintain a pagecount bias, so that we dont dirty cache line * containing page->_refcount every time we allocate a fragment. */ - unsigned int pagecnt_bias; - bool pfmemalloc; + unsigned int pagecnt_bias; + bool pfmemalloc; }; typedef unsigned long vm_flags_t; diff --git a/mm/page_frag_alloc.c b/mm/page_frag_alloc.c index bee95824ef8f..c3792b68ce32 100644 --- a/mm/page_frag_alloc.c +++ b/mm/page_frag_alloc.c @@ -16,33 +16,34 @@ #include #include -static struct page *__page_frag_cache_refill(struct page_frag_cache *nc, - gfp_t gfp_mask) +/* + * Allocate a new folio for the frag cache. + */ +static struct folio *page_frag_cache_refill(struct page_frag_cache *nc, + gfp_t gfp_mask) { - struct page *page = NULL; + struct folio *folio = NULL; gfp_t gfp = gfp_mask; #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | - __GFP_NOMEMALLOC; - page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, - PAGE_FRAG_CACHE_MAX_ORDER); - nc->size = page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE; + gfp_mask |= __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC; + folio = folio_alloc(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER); #endif - if (unlikely(!page)) - page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); - - nc->va = page ? page_address(page) : NULL; + if (unlikely(!folio)) + folio = folio_alloc(gfp, 0); - return page; + if (folio) + nc->folio = folio; + return folio; } void __page_frag_cache_drain(struct page *page, unsigned int count) { - VM_BUG_ON_PAGE(page_ref_count(page) == 0, page); + struct folio *folio = page_folio(page); - if (page_ref_sub_and_test(page, count - 1)) - __free_pages(page, compound_order(page)); + VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio); + + folio_put_refs(folio, count); } EXPORT_SYMBOL(__page_frag_cache_drain); @@ -50,54 +51,47 @@ void *page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz, gfp_t gfp_mask, unsigned int align_mask) { - unsigned int size = PAGE_SIZE; - struct page *page; - int offset; + struct folio *folio = nc->folio; + size_t offset; - if (unlikely(!nc->va)) { + if (unlikely(!folio)) { refill: - page = __page_frag_cache_refill(nc, gfp_mask); - if (!page) + folio = page_frag_cache_refill(nc, gfp_mask); + if (!folio) return NULL; -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif /* Even if we own the page, we do not use atomic_set(). * This would break get_page_unless_zero() users. */ - page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE); + folio_ref_add(folio, PAGE_FRAG_CACHE_MAX_SIZE); /* reset page count bias and offset to start of new frag */ - nc->pfmemalloc = page_is_pfmemalloc(page); + nc->pfmemalloc = folio_is_pfmemalloc(folio); nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - nc->offset = size; + nc->offset = folio_size(folio); } - offset = nc->offset - fragsz; - if (unlikely(offset < 0)) { - page = virt_to_page(nc->va); - - if (page_ref_count(page) != nc->pagecnt_bias) + offset = nc->offset; + if (unlikely(fragsz > offset)) { + /* Reuse the folio if everyone we gave it to has finished with it. */ + if (!folio_ref_sub_and_test(folio, nc->pagecnt_bias)) { + nc->folio = NULL; goto refill; + } + if (unlikely(nc->pfmemalloc)) { - page_ref_sub(page, nc->pagecnt_bias - 1); - __free_pages(page, compound_order(page)); + __folio_put(folio); + nc->folio = NULL; goto refill; } -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif /* OK, page count is 0, we can safely set it */ - set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1); + folio_set_count(folio, PAGE_FRAG_CACHE_MAX_SIZE + 1); /* reset page count bias and offset to start of new frag */ nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - offset = size - fragsz; - if (unlikely(offset < 0)) { + offset = folio_size(folio); + if (unlikely(fragsz > offset)) { /* * The caller is trying to allocate a fragment * with fragsz > PAGE_SIZE but the cache isn't big @@ -107,15 +101,17 @@ void *page_frag_alloc_align(struct page_frag_cache *nc, * it could make memory pressure worse * so we simply return NULL here. */ + nc->offset = offset; return NULL; } } nc->pagecnt_bias--; + offset -= fragsz; offset &= align_mask; nc->offset = offset; - return nc->va + offset; + return folio_address(folio) + offset; } EXPORT_SYMBOL(page_frag_alloc_align); @@ -124,8 +120,6 @@ EXPORT_SYMBOL(page_frag_alloc_align); */ void page_frag_free(void *addr) { - struct page *page = virt_to_head_page(addr); - - __free_pages(page, compound_order(page)); + folio_put(virt_to_folio(addr)); } EXPORT_SYMBOL(page_frag_free); From patchwork Fri Mar 31 16:08:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77852 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp689087vqo; Fri, 31 Mar 2023 09:31:30 -0700 (PDT) X-Google-Smtp-Source: AKy350ZE/4W4UizZWCkR51VTcC9nT3n4M1CjtnUfTb8RvelQ/TAOO/cLes6H6ZppjgIGKl3hhAVf X-Received: by 2002:a17:90b:4c10:b0:22c:816e:d67d with SMTP id na16-20020a17090b4c1000b0022c816ed67dmr31746447pjb.24.1680280290093; Fri, 31 Mar 2023 09:31:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280290; cv=none; d=google.com; s=arc-20160816; b=BuhXOLvGsBBEnseqDuO4GDkgiicXFc7FpSAorVqyk+n+vSy7CMSd59BPUAHDgYARRG oG9MBYBED1Nx4dbKH4g8cVG4bGviFNSDWdrXueHa7PalHohbWERrlq0TxrrU/BfNgq7q WS3LUt3PigHXLTgZGpRJwOwBMaF5ivbArboGdVFqFJ6UTvXLPrL6RiHKvbwVRtnAfrFO 9JfI3ktcbQSrfnmmAridRjxrsBpn17OzMXWYeqvJDCBEPAn8vVQobNwHRTSFqR9f9n2x vRNHybaFDNlh6Rpv63Gv3Hc8bFQdBR7yL9yCQ/TQK1ArHMo15hjdxQJCBPQt+B3AhfDe LSAQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=/nU0M+T1DCbw3wZvEUGAhwFOIfw7rmUPAJ5895bBp2k=; b=NBDx2FxW7CQK84xY12EocmKoRHpiSUvhjpTHSKIJ017WSw51wxPRnlFgdVZJYQ4elD IuLW16Ep2r7xVvn7ZLB1TOTbHqx5h2+uZstbhOlD4jzVWKp15V+csfxUpx7FH8DG+bvc UagGWzr0YsISluvu7CjMQC6tAt7SGKOYjr9zAAWtz6VpcvGKQciGKJM7oaJV9VCKfHr1 cTdNHc73pO1RHNVuvff+LiWl7aFoN5MEzh8vDBp9ynjfGEf4zSMaGTQ+rI1SlTHMTbJg 561w6GvRzGO3zF3Z+q0mBAnlAoTPFZD1czSsh/zBT2KyvLOJ1P0r3T71YPjtxr6Nce8F 3VSA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=UFIOp80y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 193-20020a6303ca000000b004fc25f024b7si2907979pgd.169.2023.03.31.09.31.15; Fri, 31 Mar 2023 09:31:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=UFIOp80y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231735AbjCaQLZ (ORCPT + 99 others); Fri, 31 Mar 2023 12:11:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32984 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232346AbjCaQLD (ORCPT ); Fri, 31 Mar 2023 12:11:03 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 858EA20D86 for ; Fri, 31 Mar 2023 09:09:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278984; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/nU0M+T1DCbw3wZvEUGAhwFOIfw7rmUPAJ5895bBp2k=; b=UFIOp80ySS6CV88hF2dfHXqvOuk7j1onldLHFCtJ7zpkxdT3YVEbSxCNRCSzJ6kFtt4U/R hroRUEt9D4QHweni1q/kZaLjlYq95u9tmWc6plkDK9jQY4GlULs7N/2Zk0Sj6XrrBYGFxo 0xpDs7hz487lCj5sAJO6Er4Az7v5AcU= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-22-9O5OK0CuMdG6Z0qaBSmbcg-1; Fri, 31 Mar 2023 12:09:41 -0400 X-MC-Unique: 9O5OK0CuMdG6Z0qaBSmbcg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 8590E29ABA18; Fri, 31 Mar 2023 16:09:39 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id D5EB1202701F; Fri, 31 Mar 2023 16:09:35 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Lorenzo Bianconi , Felix Fietkau , John Crispin , Sean Wang , Mark Lee , Keith Busch , Christoph Hellwig , Sagi Grimberg , Chaitanya Kulkarni , linux-nvme@lists.infradead.org, linux-mediatek@lists.infradead.org Subject: [PATCH v3 06/55] mm: Make the page_frag_cache allocator use per-cpu Date: Fri, 31 Mar 2023 17:08:25 +0100 Message-Id: <20230331160914.1608208-7-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901585521353619?= X-GMAIL-MSGID: =?utf-8?q?1761901585521353619?= Make the page_frag_cache allocator have a separate allocation bucket for each cpu to avoid racing. This means that no lock is required, other than preempt disablement, to allocate from it, though if a softirq wants to access it, then softirq disablement will need to be added. Make the NVMe and mediatek drivers pass in NULL to page_frag_cache() and use the default allocation buckets rather than defining their own. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Lorenzo Bianconi cc: Felix Fietkau cc: John Crispin cc: Sean Wang cc: Mark Lee cc: Keith Busch cc: Christoph Hellwig cc: Sagi Grimberg cc: Chaitanya Kulkarni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org cc: linux-nvme@lists.infradead.org cc: linux-mediatek@lists.infradead.org --- drivers/net/ethernet/mediatek/mtk_wed_wo.c | 19 +-- drivers/net/ethernet/mediatek/mtk_wed_wo.h | 2 - drivers/nvme/host/tcp.c | 19 +-- drivers/nvme/target/tcp.c | 22 +-- include/linux/gfp.h | 17 +- mm/page_frag_alloc.c | 182 +++++++++++++++------ net/core/skbuff.c | 32 ++-- 7 files changed, 164 insertions(+), 129 deletions(-) diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.c b/drivers/net/ethernet/mediatek/mtk_wed_wo.c index 69fba29055e9..859f34447f2f 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed_wo.c +++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.c @@ -143,7 +143,7 @@ mtk_wed_wo_queue_refill(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q, dma_addr_t addr; void *buf; - buf = page_frag_alloc(&q->cache, q->buf_size, GFP_ATOMIC); + buf = page_frag_alloc(NULL, q->buf_size, GFP_ATOMIC); if (!buf) break; @@ -286,7 +286,6 @@ mtk_wed_wo_queue_free(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) static void mtk_wed_wo_queue_tx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) { - struct page *page; int i; for (i = 0; i < q->n_desc; i++) { @@ -297,20 +296,11 @@ mtk_wed_wo_queue_tx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) skb_free_frag(entry->buf); entry->buf = NULL; } - - if (!q->cache.va) - return; - - page = virt_to_page(q->cache.va); - __page_frag_cache_drain(page, q->cache.pagecnt_bias); - memset(&q->cache, 0, sizeof(q->cache)); } static void mtk_wed_wo_queue_rx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) { - struct page *page; - for (;;) { void *buf = mtk_wed_wo_dequeue(wo, q, NULL, true); @@ -319,13 +309,6 @@ mtk_wed_wo_queue_rx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) skb_free_frag(buf); } - - if (!q->cache.va) - return; - - page = virt_to_page(q->cache.va); - __page_frag_cache_drain(page, q->cache.pagecnt_bias); - memset(&q->cache, 0, sizeof(q->cache)); } static void diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.h b/drivers/net/ethernet/mediatek/mtk_wed_wo.h index dbcf42ce9173..6f940db67fb8 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed_wo.h +++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.h @@ -210,8 +210,6 @@ struct mtk_wed_wo_queue_entry { struct mtk_wed_wo_queue { struct mtk_wed_wo_queue_regs regs; - struct page_frag_cache cache; - struct mtk_wed_wo_queue_desc *desc; dma_addr_t desc_dma; diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 7723a4989524..fa32969b532f 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -147,8 +147,6 @@ struct nvme_tcp_queue { __le32 exp_ddgst; __le32 recv_ddgst; - struct page_frag_cache pf_cache; - void (*state_change)(struct sock *); void (*data_ready)(struct sock *); void (*write_space)(struct sock *); @@ -470,9 +468,8 @@ static int nvme_tcp_init_request(struct blk_mq_tag_set *set, struct nvme_tcp_queue *queue = &ctrl->queues[queue_idx]; u8 hdgst = nvme_tcp_hdgst_len(queue); - req->pdu = page_frag_alloc(&queue->pf_cache, - sizeof(struct nvme_tcp_cmd_pdu) + hdgst, - GFP_KERNEL | __GFP_ZERO); + req->pdu = page_frag_alloc(NULL, sizeof(struct nvme_tcp_cmd_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!req->pdu) return -ENOMEM; @@ -1288,9 +1285,8 @@ static int nvme_tcp_alloc_async_req(struct nvme_tcp_ctrl *ctrl) struct nvme_tcp_request *async = &ctrl->async_req; u8 hdgst = nvme_tcp_hdgst_len(queue); - async->pdu = page_frag_alloc(&queue->pf_cache, - sizeof(struct nvme_tcp_cmd_pdu) + hdgst, - GFP_KERNEL | __GFP_ZERO); + async->pdu = page_frag_alloc(NULL, sizeof(struct nvme_tcp_cmd_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!async->pdu) return -ENOMEM; @@ -1300,7 +1296,6 @@ static int nvme_tcp_alloc_async_req(struct nvme_tcp_ctrl *ctrl) static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, int qid) { - struct page *page; struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl); struct nvme_tcp_queue *queue = &ctrl->queues[qid]; unsigned int noreclaim_flag; @@ -1311,12 +1306,6 @@ static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, int qid) if (queue->hdr_digest || queue->data_digest) nvme_tcp_free_crypto(queue); - if (queue->pf_cache.va) { - page = virt_to_head_page(queue->pf_cache.va); - __page_frag_cache_drain(page, queue->pf_cache.pagecnt_bias); - queue->pf_cache.va = NULL; - } - noreclaim_flag = memalloc_noreclaim_save(); sock_release(queue->sock); memalloc_noreclaim_restore(noreclaim_flag); diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 66e8f9fd0ca7..d6cc557cc539 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -143,8 +143,6 @@ struct nvmet_tcp_queue { struct nvmet_tcp_cmd connect; - struct page_frag_cache pf_cache; - void (*data_ready)(struct sock *); void (*state_change)(struct sock *); void (*write_space)(struct sock *); @@ -1312,25 +1310,25 @@ static int nvmet_tcp_alloc_cmd(struct nvmet_tcp_queue *queue, c->queue = queue; c->req.port = queue->port->nport; - c->cmd_pdu = page_frag_alloc(&queue->pf_cache, - sizeof(*c->cmd_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); + c->cmd_pdu = page_frag_alloc(NULL, sizeof(*c->cmd_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!c->cmd_pdu) return -ENOMEM; c->req.cmd = &c->cmd_pdu->cmd; - c->rsp_pdu = page_frag_alloc(&queue->pf_cache, - sizeof(*c->rsp_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); + c->rsp_pdu = page_frag_alloc(NULL, sizeof(*c->rsp_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!c->rsp_pdu) goto out_free_cmd; c->req.cqe = &c->rsp_pdu->cqe; - c->data_pdu = page_frag_alloc(&queue->pf_cache, - sizeof(*c->data_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); + c->data_pdu = page_frag_alloc(NULL, sizeof(*c->data_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!c->data_pdu) goto out_free_rsp; - c->r2t_pdu = page_frag_alloc(&queue->pf_cache, - sizeof(*c->r2t_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); + c->r2t_pdu = page_frag_alloc(NULL, sizeof(*c->r2t_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!c->r2t_pdu) goto out_free_data; @@ -1438,7 +1436,6 @@ static void nvmet_tcp_free_cmd_data_in_buffers(struct nvmet_tcp_queue *queue) static void nvmet_tcp_release_queue_work(struct work_struct *w) { - struct page *page; struct nvmet_tcp_queue *queue = container_of(w, struct nvmet_tcp_queue, release_work); @@ -1460,9 +1457,6 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w) if (queue->hdr_digest || queue->data_digest) nvmet_tcp_free_crypto(queue); ida_free(&nvmet_tcp_queue_ida, queue->idx); - - page = virt_to_head_page(queue->pf_cache.va); - __page_frag_cache_drain(page, queue->pf_cache.pagecnt_bias); kfree(queue); } diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 65a78773dcca..b208ca315882 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -304,14 +304,17 @@ extern void free_pages(unsigned long addr, unsigned int order); struct page_frag_cache; extern void __page_frag_cache_drain(struct page *page, unsigned int count); -extern void *page_frag_alloc_align(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask, - unsigned int align_mask); - -static inline void *page_frag_alloc(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask) +extern void *page_frag_alloc_align(struct page_frag_cache __percpu *frag_cache, + size_t fragsz, gfp_t gfp, + unsigned long align_mask); +extern void *page_frag_memdup(struct page_frag_cache __percpu *frag_cache, + const void *p, size_t fragsz, gfp_t gfp, + unsigned long align_mask); + +static inline void *page_frag_alloc(struct page_frag_cache __percpu *frag_cache, + size_t fragsz, gfp_t gfp) { - return page_frag_alloc_align(nc, fragsz, gfp_mask, ~0u); + return page_frag_alloc_align(frag_cache, fragsz, gfp, ULONG_MAX); } extern void page_frag_free(void *addr); diff --git a/mm/page_frag_alloc.c b/mm/page_frag_alloc.c index c3792b68ce32..7844398afe26 100644 --- a/mm/page_frag_alloc.c +++ b/mm/page_frag_alloc.c @@ -16,25 +16,23 @@ #include #include +static DEFINE_PER_CPU(struct page_frag_cache, page_frag_default_allocator); + /* * Allocate a new folio for the frag cache. */ -static struct folio *page_frag_cache_refill(struct page_frag_cache *nc, - gfp_t gfp_mask) +static struct folio *page_frag_cache_refill(gfp_t gfp) { - struct folio *folio = NULL; - gfp_t gfp = gfp_mask; + struct folio *folio; #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - gfp_mask |= __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC; - folio = folio_alloc(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER); + folio = folio_alloc(gfp | __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC, + PAGE_FRAG_CACHE_MAX_ORDER); + if (folio) + return folio; #endif - if (unlikely(!folio)) - folio = folio_alloc(gfp, 0); - if (folio) - nc->folio = folio; - return folio; + return folio_alloc(gfp, 0); } void __page_frag_cache_drain(struct page *page, unsigned int count) @@ -47,41 +45,68 @@ void __page_frag_cache_drain(struct page *page, unsigned int count) } EXPORT_SYMBOL(__page_frag_cache_drain); -void *page_frag_alloc_align(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask, - unsigned int align_mask) +/** + * page_frag_alloc_align - Allocate some memory for use in zerocopy + * @frag_cache: The frag cache to use (or NULL for the default) + * @fragsz: The size of the fragment desired + * @gfp: Allocation flags under which to make an allocation + * @align_mask: The required alignment + * + * Allocate some memory for use with zerocopy where protocol bits have to be + * mixed in with spliced/zerocopied data. Unlike memory allocated from the + * slab, this memory's lifetime is purely dependent on the folio's refcount. + * + * The way it works is that a folio is allocated and fragments are broken off + * sequentially and returned to the caller with a ref until the folio no longer + * has enough spare space - at which point the allocator's ref is dropped and a + * new folio is allocated. The folio remains in existence until the last ref + * held by, say, an sk_buff is discarded and then the page is returned to the + * page allocator. + * + * Returns a pointer to the memory on success and -ENOMEM on allocation + * failure. + * + * The allocated memory should be disposed of with folio_put(). + */ +void *page_frag_alloc_align(struct page_frag_cache __percpu *frag_cache, + size_t fragsz, gfp_t gfp, unsigned long align_mask) { - struct folio *folio = nc->folio; + struct page_frag_cache *nc; + struct folio *folio, *spare = NULL; size_t offset; + void *p; - if (unlikely(!folio)) { -refill: - folio = page_frag_cache_refill(nc, gfp_mask); - if (!folio) - return NULL; - - /* Even if we own the page, we do not use atomic_set(). - * This would break get_page_unless_zero() users. - */ - folio_ref_add(folio, PAGE_FRAG_CACHE_MAX_SIZE); + if (!frag_cache) + frag_cache = &page_frag_default_allocator; + if (WARN_ON_ONCE(fragsz == 0)) + fragsz = 1; + align_mask &= ~3UL; - /* reset page count bias and offset to start of new frag */ - nc->pfmemalloc = folio_is_pfmemalloc(folio); - nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - nc->offset = folio_size(folio); + nc = get_cpu_ptr(frag_cache); +reload: + folio = nc->folio; + offset = nc->offset; +try_again: + + /* Make the allocation if there's sufficient space. */ + if (fragsz <= offset) { + nc->pagecnt_bias--; + offset = (offset - fragsz) & align_mask; + nc->offset = offset; + p = folio_address(folio) + offset; + put_cpu_ptr(frag_cache); + if (spare) + folio_put(spare); + return p; } - offset = nc->offset; - if (unlikely(fragsz > offset)) { - /* Reuse the folio if everyone we gave it to has finished with it. */ - if (!folio_ref_sub_and_test(folio, nc->pagecnt_bias)) { - nc->folio = NULL; + /* Insufficient space - see if we can refurbish the current folio. */ + if (folio) { + if (!folio_ref_sub_and_test(folio, nc->pagecnt_bias)) goto refill; - } if (unlikely(nc->pfmemalloc)) { __folio_put(folio); - nc->folio = NULL; goto refill; } @@ -91,27 +116,56 @@ void *page_frag_alloc_align(struct page_frag_cache *nc, /* reset page count bias and offset to start of new frag */ nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; offset = folio_size(folio); - if (unlikely(fragsz > offset)) { - /* - * The caller is trying to allocate a fragment - * with fragsz > PAGE_SIZE but the cache isn't big - * enough to satisfy the request, this may - * happen in low memory conditions. - * We don't release the cache page because - * it could make memory pressure worse - * so we simply return NULL here. - */ - nc->offset = offset; + if (unlikely(fragsz > offset)) + goto frag_too_big; + goto try_again; + } + +refill: + if (!spare) { + nc->folio = NULL; + put_cpu_ptr(frag_cache); + + spare = page_frag_cache_refill(gfp); + if (!spare) return NULL; - } + + nc = get_cpu_ptr(frag_cache); + /* We may now be on a different cpu and/or someone else may + * have refilled it + */ + nc->pfmemalloc = folio_is_pfmemalloc(spare); + if (nc->folio) + goto reload; } - nc->pagecnt_bias--; - offset -= fragsz; - offset &= align_mask; + nc->folio = spare; + folio = spare; + spare = NULL; + + /* Even if we own the page, we do not use atomic_set(). This would + * break get_page_unless_zero() users. + */ + folio_ref_add(folio, PAGE_FRAG_CACHE_MAX_SIZE); + + /* Reset page count bias and offset to start of new frag */ + nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; + offset = folio_size(folio); + goto try_again; + +frag_too_big: + /* + * The caller is trying to allocate a fragment with fragsz > PAGE_SIZE + * but the cache isn't big enough to satisfy the request, this may + * happen in low memory conditions. We don't release the cache page + * because it could make memory pressure worse so we simply return NULL + * here. + */ nc->offset = offset; - - return folio_address(folio) + offset; + put_cpu_ptr(frag_cache); + if (spare) + folio_put(spare); + return NULL; } EXPORT_SYMBOL(page_frag_alloc_align); @@ -123,3 +177,25 @@ void page_frag_free(void *addr) folio_put(virt_to_folio(addr)); } EXPORT_SYMBOL(page_frag_free); + +/** + * page_frag_memdup - Allocate a page fragment and duplicate some data into it + * @frag_cache: The frag cache to use (or NULL for the default) + * @fragsz: The amount of memory to copy (maximum 1/2 page). + * @p: The source data to copy + * @gfp: Allocation flags under which to make an allocation + * @align_mask: The required alignment + */ +void *page_frag_memdup(struct page_frag_cache __percpu *frag_cache, + const void *p, size_t fragsz, gfp_t gfp, + unsigned long align_mask) +{ + void *q; + + q = page_frag_alloc_align(frag_cache, fragsz, gfp, align_mask); + if (!q) + return q; + + return memcpy(q, p, fragsz); +} +EXPORT_SYMBOL(page_frag_memdup); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index eb7d33b41e71..0506e4cf1ed9 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -222,13 +222,13 @@ static void *page_frag_alloc_1k(struct page_frag_1k *nc, gfp_t gfp_mask) #endif struct napi_alloc_cache { - struct page_frag_cache page; struct page_frag_1k page_small; unsigned int skb_count; void *skb_cache[NAPI_SKB_CACHE_SIZE]; }; static DEFINE_PER_CPU(struct page_frag_cache, netdev_alloc_cache); +static DEFINE_PER_CPU(struct page_frag_cache, napi_frag_cache); static DEFINE_PER_CPU(struct napi_alloc_cache, napi_alloc_cache); /* Double check that napi_get_frags() allocates skbs with @@ -250,11 +250,9 @@ void napi_get_frags_check(struct napi_struct *napi) void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) { - struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); - fragsz = SKB_DATA_ALIGN(fragsz); - return page_frag_alloc_align(&nc->page, fragsz, GFP_ATOMIC, align_mask); + return page_frag_alloc_align(&napi_frag_cache, fragsz, GFP_ATOMIC, align_mask); } EXPORT_SYMBOL(__napi_alloc_frag_align); @@ -264,15 +262,12 @@ void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) fragsz = SKB_DATA_ALIGN(fragsz); if (in_hardirq() || irqs_disabled()) { - struct page_frag_cache *nc = this_cpu_ptr(&netdev_alloc_cache); - - data = page_frag_alloc_align(nc, fragsz, GFP_ATOMIC, align_mask); + data = page_frag_alloc_align(&netdev_alloc_cache, + fragsz, GFP_ATOMIC, align_mask); } else { - struct napi_alloc_cache *nc; - local_bh_disable(); - nc = this_cpu_ptr(&napi_alloc_cache); - data = page_frag_alloc_align(&nc->page, fragsz, GFP_ATOMIC, align_mask); + data = page_frag_alloc_align(&napi_frag_cache, + fragsz, GFP_ATOMIC, align_mask); local_bh_enable(); } return data; @@ -656,7 +651,6 @@ EXPORT_SYMBOL(__alloc_skb); struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, gfp_t gfp_mask) { - struct page_frag_cache *nc; struct sk_buff *skb; bool pfmemalloc; void *data; @@ -681,14 +675,12 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, gfp_mask |= __GFP_MEMALLOC; if (in_hardirq() || irqs_disabled()) { - nc = this_cpu_ptr(&netdev_alloc_cache); - data = page_frag_alloc(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + data = page_frag_alloc(&netdev_alloc_cache, len, gfp_mask); + pfmemalloc = folio_is_pfmemalloc(virt_to_folio(data)); } else { local_bh_disable(); - nc = this_cpu_ptr(&napi_alloc_cache.page); - data = page_frag_alloc(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + data = page_frag_alloc(&napi_frag_cache, len, gfp_mask); + pfmemalloc = folio_is_pfmemalloc(virt_to_folio(data)); local_bh_enable(); } @@ -776,8 +768,8 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, } else { len = SKB_HEAD_ALIGN(len); - data = page_frag_alloc(&nc->page, len, gfp_mask); - pfmemalloc = nc->page.pfmemalloc; + data = page_frag_alloc(&napi_frag_cache, len, gfp_mask); + pfmemalloc = folio_is_pfmemalloc(virt_to_folio(data)); } if (unlikely(!data)) From patchwork Fri Mar 31 16:08:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77853 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp689749vqo; Fri, 31 Mar 2023 09:32:28 -0700 (PDT) X-Google-Smtp-Source: AKy350ba0cE+YXFhrP/lEZn47vDT7gJRcgrpwv+gdYT6CDBOJcmjyrIXDwoKp8EqjUn6O29co74l X-Received: by 2002:a17:903:189:b0:19c:f476:4793 with SMTP id z9-20020a170903018900b0019cf4764793mr31635825plg.51.1680280347780; Fri, 31 Mar 2023 09:32:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280347; cv=none; d=google.com; s=arc-20160816; b=Gb3TQfIcx+OB93gruJiHZOSxouVIfohiwNM4lsMScfJiUs2suWTFRoq7pixzWvn+Ol qhjZEg1vQ6eQoXBnh4Sr2ynwXOwkw26waoR7Cm+Sg7Wg6vijDYGAHgg8DMqLcFeSwW4k 5Ov5ZmGbBuJEOYYduaof+NxBuwFizK6i66SxoSgrsXvQzxaC/tCUc95nXsZKKABvBWLT zxWEmrrYzNp6uohFRTk1FfHjJF1ty+F1ZmYPQlwKxgg0Ol18VmiRXGMflCgMflXhgagm 1BfrQc+T33nKE5balJffeeyjfEPpYL1iamow0z+Zq0nkxkAbPhRsXBJWP1mVFFxAAr8z qTbw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=HlTSQeuXI3aut1lZSX+tmY2WXPBg0R+eepIbeSsvlZU=; b=nmfCbUSULS8f7+PpBJgOGTlG5Tu969nHqOyN1dMJ8wg4e68KIdGfpMDEQ8ZG+YiGNQ oZAQkMYSTUna95mSo+1CaPEp0nh7Skr+8zNUFCaMzZCyli9/7FXPoGkLDKV8RfCCc5r8 sTZtr9nxgfEUy58qFGfU5x+FqL7ztEZaaHcH13zGX3t5dR/ozGmccBw0NXCkbRqEhCCX gmPWKgJZ0JlAIc8Uh+zs42ZZ+L7/0a6A4XOCvFMvXIm29MSv2TOBZziYM4oL1ICGESXT ccMsMmbfLx8DnrjpY5zsF/UKvNriKf+FazPMe2Un9Z5SOJpwq51MtADXeKqKLLAe1SZW vw2A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=AvJ+jw9M; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 187-20020a6303c4000000b0051322a9ea4dsi2962866pgd.180.2023.03.31.09.32.14; Fri, 31 Mar 2023 09:32:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=AvJ+jw9M; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231588AbjCaQL3 (ORCPT + 99 others); Fri, 31 Mar 2023 12:11:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33044 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232391AbjCaQLE (ORCPT ); Fri, 31 Mar 2023 12:11:04 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 535C320D9E for ; Fri, 31 Mar 2023 09:09:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278987; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HlTSQeuXI3aut1lZSX+tmY2WXPBg0R+eepIbeSsvlZU=; b=AvJ+jw9MuXyLGKoQJ9nUXgKnQifgyreVAxGFZ5Qj/BUqBr8lqlfjP3cSzNCqh5XT8oJ00X RDWvpodkQhRVDhb+ubbnPxKwvqBU61TVo11aTURrqOlQW1muJxsuf9k6bngfbcO6G4MdEh 5xGKqgujlE7YIjYuU8F9dXJOJyoeqYI= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-407-4An1B-SyOQeBgjPKwSEzAg-1; Fri, 31 Mar 2023 12:09:43 -0400 X-MC-Unique: 4An1B-SyOQeBgjPKwSEzAg-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0B59D101A54F; Fri, 31 Mar 2023 16:09:42 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 27382C15BA0; Fri, 31 Mar 2023 16:09:40 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 07/55] tcp: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:26 +0100 Message-Id: <20230331160914.1608208-8-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901645988524176?= X-GMAIL-MSGID: =?utf-8?q?1761901645988524176?= Make TCP's sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Eric Dumazet cc: "David S. Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/tcp.c | 67 ++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 60 insertions(+), 7 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 288693981b00..910b327c236e 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1220,7 +1220,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) int flags, err, copied = 0; int mss_now = 0, size_goal, copied_syn = 0; int process_backlog = 0; - bool zc = false; + int zc = 0; long timeo; flags = msg->msg_flags; @@ -1231,17 +1231,22 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) if (msg->msg_ubuf) { uarg = msg->msg_ubuf; net_zcopy_get(uarg); - zc = sk->sk_route_caps & NETIF_F_SG; + if (sk->sk_route_caps & NETIF_F_SG) + zc = 1; } else if (sock_flag(sk, SOCK_ZEROCOPY)) { uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb)); if (!uarg) { err = -ENOBUFS; goto out_err; } - zc = sk->sk_route_caps & NETIF_F_SG; - if (!zc) + if (sk->sk_route_caps & NETIF_F_SG) + zc = 1; + else uarg_to_msgzc(uarg)->zerocopy = 0; } + } else if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES) && size) { + if (sk->sk_route_caps & NETIF_F_SG) + zc = 2; } if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) && @@ -1304,7 +1309,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) goto do_error; while (msg_data_left(msg)) { - int copy = 0; + ssize_t copy = 0; skb = tcp_write_queue_tail(sk); if (skb) @@ -1345,7 +1350,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) if (copy > msg_data_left(msg)) copy = msg_data_left(msg); - if (!zc) { + if (zc == 0) { bool merge = true; int i = skb_shinfo(skb)->nr_frags; struct page_frag *pfrag = sk_page_frag(sk); @@ -1390,7 +1395,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) page_ref_inc(pfrag->page); } pfrag->offset += copy; - } else { + } else if (zc == 1) { /* First append to a fragless skb builds initial * pure zerocopy skb */ @@ -1411,6 +1416,54 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) if (err < 0) goto do_error; copy = err; + } else if (zc == 2) { + /* Splice in data. */ + struct page *page = NULL, **pages = &page; + size_t off = 0, part; + bool can_coalesce; + int i = skb_shinfo(skb)->nr_frags; + + copy = iov_iter_extract_pages(&msg->msg_iter, &pages, + copy, 1, 0, &off); + if (copy <= 0) { + err = copy ?: -EIO; + goto do_error; + } + + can_coalesce = skb_can_coalesce(skb, i, page, off); + if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) { + tcp_mark_push(tp, skb); + iov_iter_revert(&msg->msg_iter, copy); + goto new_segment; + } + if (tcp_downgrade_zcopy_pure(sk, skb)) { + iov_iter_revert(&msg->msg_iter, copy); + goto wait_for_space; + } + + part = tcp_wmem_schedule(sk, copy); + iov_iter_revert(&msg->msg_iter, copy - part); + if (!part) + goto wait_for_space; + copy = part; + + if (can_coalesce) { + skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); + } else { + get_page(page); + skb_fill_page_desc_noacc(skb, i, page, off, copy); + } + page = NULL; + + if (!(flags & MSG_NO_SHARED_FRAGS)) + skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; + + skb->len += copy; + skb->data_len += copy; + skb->truesize += copy; + sk_wmem_queued_add(sk, copy); + sk_mem_charge(sk, copy); + } if (!copied) From patchwork Fri Mar 31 16:08:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77836 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp676998vqo; Fri, 31 Mar 2023 09:13:22 -0700 (PDT) X-Google-Smtp-Source: AKy350Ydg9zWh7fOMok2EyPBx0JG+71ze5qbqDBLI7WUTT3zqE6MxSXJT5VBBWzynTlKDejGgqJy X-Received: by 2002:a17:906:fb93:b0:92b:8fe7:344 with SMTP id lr19-20020a170906fb9300b0092b8fe70344mr25522367ejb.16.1680279202246; Fri, 31 Mar 2023 09:13:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279202; cv=none; d=google.com; s=arc-20160816; b=l/b1aBzZE4h69fFNEekiZ36vLXfPOGEvcJ8IZnEFKHCKztiKgPHijo6ir0UFKZ6huV WRh9nr+hmwYcnASHx2tZCG8Bo/ECC4R9Tdk4SfUzXBbW+0SZCTHrkv1WW5dxOfME9ero i+ayZEdi+oI4Rqh/+oPOdIhc7kqxnDFtfh4ddpU9coKMzQxPxg12mjvQgzmBYkW+CG9S /Hyk086DmaYFdpFQAu7AmAq+W/GpVuapjBbm1mwohTd94H8fukK1LZlOtjCGfLGL6zf/ r2rv1Yy2AjGSg0BG9E4pWQucYGHwD5HKIHyLnCegtBneX9iYh5+JmoLE8wp48E2IHz0+ gvjg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=4TWX2aSvGOZ9Qof5Bdvg67w1lje6zmFD7Oc0vC/yB3w=; b=vR25p9qwShH+chZLzIhAQm1aOMfRlAK+vvgpEyqGWC4Bw5foKIm0cFr3J8P9BLJtn9 /3KHB8NyX9mczRgjW+vY+Z1DRXmiB5wU9Jd2TfpFK/JvwZ28ra9uTQGPiI5/QIPr+qwH 5sRoWHXBIFlnQtGx37HWRqFEDwh/XfVk99b5DctKj5KfAm7Z2zfPuNZqDkrT+0lkVtcV d46URqxOQE8NSIt8O69afLl47My8UVw8gymmIfyUQt19DivibhF0UCVbKCtWirHteWCQ gSiADN9c9UScfOYx/XKkdGrdJgf1QOQi8qO3FSQtsMoUCTwhogqVmTCTx9IrXlGpoe4H HVGw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XXbiukOt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h7-20020a170906854700b009338f516161si1945254ejy.707.2023.03.31.09.12.56; Fri, 31 Mar 2023 09:13:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XXbiukOt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232746AbjCaQLa (ORCPT + 99 others); Fri, 31 Mar 2023 12:11:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60648 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231539AbjCaQLN (ORCPT ); Fri, 31 Mar 2023 12:11:13 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C7C221FD38 for ; Fri, 31 Mar 2023 09:09:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278989; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4TWX2aSvGOZ9Qof5Bdvg67w1lje6zmFD7Oc0vC/yB3w=; b=XXbiukOtvz1J1zkF48k7v0AqZ04Q+6zKkqtU/1udGreG8tVCWShgfuT7Uom4FBM6zU4Jqz RErn/aJnJ3UjkiQ8+c1wDBHg7tHBOtnLTOQWyUGP6Q+JYOtltUidDwrSbvu2wntwvxyfSy upB7+zITk3kmFu6XCidl6AI2kOBNf7k= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-496-oMyBzLXdOyeTBAtGXK8UWA-1; Fri, 31 Mar 2023 12:09:45 -0400 X-MC-Unique: oMyBzLXdOyeTBAtGXK8UWA-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A431F85A5A3; Fri, 31 Mar 2023 16:09:44 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id C01FC492C3E; Fri, 31 Mar 2023 16:09:42 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 08/55] tcp: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data Date: Fri, 31 Mar 2023 17:08:27 +0100 Message-Id: <20230331160914.1608208-9-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761900444740107905?= X-GMAIL-MSGID: =?utf-8?q?1761900444740107905?= If sendmsg() with MSG_SPLICE_PAGES encounters a page that shouldn't be spliced - a slab page, for instance, or one with a zero count - make tcp_sendmsg() copy it. Signed-off-by: David Howells cc: Eric Dumazet cc: "David S. Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/tcp.c | 28 +++++++++++++++++++++++++--- 1 file changed, 25 insertions(+), 3 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 910b327c236e..6ef0518eb706 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1417,10 +1417,10 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) goto do_error; copy = err; } else if (zc == 2) { - /* Splice in data. */ + /* Splice in data if we can; copy if we can't. */ struct page *page = NULL, **pages = &page; size_t off = 0, part; - bool can_coalesce; + bool can_coalesce, put = false; int i = skb_shinfo(skb)->nr_frags; copy = iov_iter_extract_pages(&msg->msg_iter, &pages, @@ -1447,12 +1447,34 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) goto wait_for_space; copy = part; + if (!sendpage_ok(page)) { + const void *p = kmap_local_page(page); + void *q; + + q = page_frag_memdup(NULL, p + off, copy, + sk->sk_allocation, ULONG_MAX); + kunmap_local(p); + if (!q) { + iov_iter_revert(&msg->msg_iter, copy); + err = copy ?: -ENOMEM; + goto do_error; + } + page = virt_to_page(q); + off = offset_in_page(q); + put = true; + can_coalesce = false; + } + if (can_coalesce) { skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); } else { - get_page(page); + if (!put) + get_page(page); + put = false; skb_fill_page_desc_noacc(skb, i, page, off, copy); } + if (put) + put_page(page); page = NULL; if (!(flags & MSG_NO_SHARED_FRAGS)) From patchwork Fri Mar 31 16:08:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77875 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692398vqo; Fri, 31 Mar 2023 09:36:29 -0700 (PDT) X-Google-Smtp-Source: AKy350YsB5PO2mlmdee1AT4EKlZUL4h+10pUD0uV1QOrE9jJudyvSGn2OzeBYZo1wxgylsJKpMvp X-Received: by 2002:a17:907:1628:b0:931:af6a:ad0f with SMTP id hb40-20020a170907162800b00931af6aad0fmr34973950ejc.76.1680280589533; Fri, 31 Mar 2023 09:36:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280589; cv=none; d=google.com; s=arc-20160816; b=WtGRs1lC8E7JxI6+Gjmf2zJPsoZERax4/f724wqKSFgRjLum3hvGk/Vb00OfEOcG3j v1o4OK5eakwLFC0gSemTQL867o11wgU90u3uXhMV9EojSJWxYJka7LETzitwFJ4riiqj 19w5FmXw8MSCqAoCCIEveZCYbjCHd4V3pZx4D4S2IfdJs7qgTKaACqU4V//gfhZFd+J+ itzo8oHwAyQmQ/3r4+g3+PQq8HTrCcvSTgn3PglQ9HT/UYdLvSoDw2CHDgjMeAhrA/6f nk1OWvXpVvMOOMWDQFx7dXJQDNcd6ZaiMO18JyLJf1MaHLekq6kQo7tYrhSyKetVfmvy PJFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=nhfSw6vyJXoYjtAsHCy4fHcMYpuQf7adDQC2jq05dOE=; b=kNTTt4gklwE5SQ4AqpMb5l81r+kGIroo/LJOyJylfKcHWlc5wZ/1EMYX25CUWehjpb fwqS/WTpAeJ9Itdj+IyS6D41SZUByV02FKEPic9310DW2BMGNh3hqlMde1u/jaox3NHl JprkVZ9IQQQDk+QadP5uM+VBY847KAkzxWmpBo1UrfcY/NTaR/NljNreqtBbTvAuC2P1 I3WsUmgFKZgewgQJdLcC3T67Md9BG8koh9wHUDZK3URMOdEUs47eRq66JH6l/4oU6vs0 0OP/idxibnj6qns1L5ks5Ciy9rdMbzQ/d3fpfh41Z8DdmV+M8OdBFQiG1AwCHHq3WbZb acFA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=YLTDCMtU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l9-20020aa7d949000000b005002af128dfsi2350234eds.100.2023.03.31.09.36.05; Fri, 31 Mar 2023 09:36:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=YLTDCMtU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232956AbjCaQLf (ORCPT + 99 others); Fri, 31 Mar 2023 12:11:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33046 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232193AbjCaQLS (ORCPT ); Fri, 31 Mar 2023 12:11:18 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 155FC21A86 for ; Fri, 31 Mar 2023 09:09:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278993; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nhfSw6vyJXoYjtAsHCy4fHcMYpuQf7adDQC2jq05dOE=; b=YLTDCMtUd2B2YL6zU2K8Qgu0i/vHHQdNdXqmgQwm+mlhrY9oywxWcNgexK2hxpwg7ptCY+ rj/qHe5HLbTCrRXjkoJWAJJblTTjyhKrL8aVAFbsZefNnAmlPXku0GoRpG9VnEMW/1SKur dXTdKe7R7Q2oywRLKeJcZYpGYLXkv+M= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-138-rSYPRKkjNEe0Qp7x83krYQ-1; Fri, 31 Mar 2023 12:09:48 -0400 X-MC-Unique: rSYPRKkjNEe0Qp7x83krYQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2AC8285A5A3; Fri, 31 Mar 2023 16:09:47 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 46B784020C82; Fri, 31 Mar 2023 16:09:45 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 09/55] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:28 +0100 Message-Id: <20230331160914.1608208-10-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901899350060778?= X-GMAIL-MSGID: =?utf-8?q?1761901899350060778?= Convert do_tcp_sendpages() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. do_tcp_sendpages() can then be inlined in subsequent patches into its callers. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Eric Dumazet cc: "David S. Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/tcp.c | 158 +++---------------------------------------------- 1 file changed, 7 insertions(+), 151 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 6ef0518eb706..edcf3a60c1b0 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -971,163 +971,19 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } -static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags, - struct page *page, int offset, size_t *size) -{ - struct sk_buff *skb = tcp_write_queue_tail(sk); - struct tcp_sock *tp = tcp_sk(sk); - bool can_coalesce; - int copy, i; - - if (!skb || (copy = size_goal - skb->len) <= 0 || - !tcp_skb_can_collapse_to(skb)) { -new_segment: - if (!sk_stream_memory_free(sk)) - return NULL; - - skb = tcp_stream_alloc_skb(sk, 0, sk->sk_allocation, - tcp_rtx_and_write_queues_empty(sk)); - if (!skb) - return NULL; - -#ifdef CONFIG_TLS_DEVICE - skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED); -#endif - tcp_skb_entail(sk, skb); - copy = size_goal; - } - - if (copy > *size) - copy = *size; - - i = skb_shinfo(skb)->nr_frags; - can_coalesce = skb_can_coalesce(skb, i, page, offset); - if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) { - tcp_mark_push(tp, skb); - goto new_segment; - } - if (tcp_downgrade_zcopy_pure(sk, skb)) - return NULL; - - copy = tcp_wmem_schedule(sk, copy); - if (!copy) - return NULL; - - if (can_coalesce) { - skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); - } else { - get_page(page); - skb_fill_page_desc_noacc(skb, i, page, offset, copy); - } - - if (!(flags & MSG_NO_SHARED_FRAGS)) - skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; - - skb->len += copy; - skb->data_len += copy; - skb->truesize += copy; - sk_wmem_queued_add(sk, copy); - sk_mem_charge(sk, copy); - WRITE_ONCE(tp->write_seq, tp->write_seq + copy); - TCP_SKB_CB(skb)->end_seq += copy; - tcp_skb_pcount_set(skb, 0); - - *size = copy; - return skb; -} - ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct tcp_sock *tp = tcp_sk(sk); - int mss_now, size_goal; - int err; - ssize_t copied; - long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); - - if (IS_ENABLED(CONFIG_DEBUG_VM) && - WARN_ONCE(!sendpage_ok(page), - "page must not be a Slab one and have page_count > 0")) - return -EINVAL; - - /* Wait for a connection to finish. One exception is TCP Fast Open - * (passive side) where data is allowed to be sent before a connection - * is fully established. - */ - if (((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) && - !tcp_passive_fastopen(sk)) { - err = sk_stream_wait_connect(sk, &timeo); - if (err != 0) - goto out_err; - } + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk); + bvec_set_page(&bvec, page, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - mss_now = tcp_send_mss(sk, &size_goal, flags); - copied = 0; + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; - err = -EPIPE; - if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) - goto out_err; - - while (size > 0) { - struct sk_buff *skb; - size_t copy = size; - - skb = tcp_build_frag(sk, size_goal, flags, page, offset, ©); - if (!skb) - goto wait_for_space; - - if (!copied) - TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH; - - copied += copy; - offset += copy; - size -= copy; - if (!size) - goto out; - - if (skb->len < size_goal || (flags & MSG_OOB)) - continue; - - if (forced_push(tp)) { - tcp_mark_push(tp, skb); - __tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_PUSH); - } else if (skb == tcp_send_head(sk)) - tcp_push_one(sk, mss_now); - continue; - -wait_for_space: - set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); - tcp_push(sk, flags & ~MSG_MORE, mss_now, - TCP_NAGLE_PUSH, size_goal); - - err = sk_stream_wait_memory(sk, &timeo); - if (err != 0) - goto do_error; - - mss_now = tcp_send_mss(sk, &size_goal, flags); - } - -out: - if (copied) { - tcp_tx_timestamp(sk, sk->sk_tsflags); - if (!(flags & MSG_SENDPAGE_NOTLAST)) - tcp_push(sk, flags, mss_now, tp->nonagle, size_goal); - } - return copied; - -do_error: - tcp_remove_empty_skb(sk); - if (copied) - goto out; -out_err: - /* make sure we wake any epoll edge trigger waiter */ - if (unlikely(tcp_rtx_and_write_queues_empty(sk) && err == -EAGAIN)) { - sk->sk_write_space(sk); - tcp_chrono_stop(sk, TCP_CHRONO_SNDBUF_LIMITED); - } - return sk_stream_error(sk, flags, err); + return tcp_sendmsg_locked(sk, &msg, size); } EXPORT_SYMBOL_GPL(do_tcp_sendpages); From patchwork Fri Mar 31 16:08:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77837 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp677793vqo; Fri, 31 Mar 2023 09:14:34 -0700 (PDT) X-Google-Smtp-Source: AKy350Z1mILYrhQgBwABmy6H/7GK8SLvZ4aG7GGpPWVKaYPrlE0K8pu3+v308l3fvZfFQ6QKCkmC X-Received: by 2002:a17:906:8a41:b0:92b:6f92:7705 with SMTP id gx1-20020a1709068a4100b0092b6f927705mr24627223ejc.40.1680279273864; Fri, 31 Mar 2023 09:14:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279273; cv=none; d=google.com; s=arc-20160816; b=b4K3NZZ7+OzNfbvMKrrtFAnKLVmMvmQoOZfcLTTdsWW0vE+lkt/Q26+CItuexuJOWi ZXJjUqFuPxGMCsEpH8djEuIf9EQEl2XvZKgcuZhLnLsZplQQSvxuS6mhBA0WI8l0BuR2 2VL/M1TjUhydcEd3hU37yvNPKaNjJ+86MJ/6uNzigXkr8iyca6TvLNb2Z25ychX+P+Pn CKFEGy1gMB0noqp2At1x8WoeH2fDtAibqnhiFOQW3ImddPF6mFzO0YAtaR8wZhM0CsjQ 8Mm/BqtPLBJ/xfZ3GV59ggjrDW1Sz1TbKD2yO8EP/90Q6T1BKFqe2RitVe/zlHwTSbYA RweQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=V4A9+ThbnxeOTamldoFRxBPwB7/WQbFmJXJUgDasL0c=; b=vXG8zlYqqHyZC17qtZil3UVeEQxi4JpUqEJ0coNVWTHm0lL7V2L+4rDR8bTCsMFuPg lTF51zNWs0WEIhHpbVihqaSxrfUE+1Rxgg9zpGzORGw/IoqnL9dDPeU7sHftELUmMUgy SqIOQtPpHY4fzsi3AQA9BbNqgg1YKTkE5wLGLDUcJN48M9qE2Qx5rRFkQ/NupEz8RnoP AzzcgIOaLfOOxxrOAL+kpj/Y1IXzXKj7Qfg98BJaxZSakQmgzMqlh8IOxGvGa0JCHvY3 M9WAt4myMuiLYxZFglX5brGzsX9ARKY5O7Kb+oZp9F/d8oQ0XTOVhYo54zPbnGnlZWYC DLcw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=II2y+WEu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l24-20020a1709066b9800b009230fbd1babsi2097500ejr.892.2023.03.31.09.14.08; Fri, 31 Mar 2023 09:14:33 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=II2y+WEu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233122AbjCaQMP (ORCPT + 99 others); Fri, 31 Mar 2023 12:12:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60554 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232897AbjCaQLV (ORCPT ); Fri, 31 Mar 2023 12:11:21 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3894421AA7 for ; Fri, 31 Mar 2023 09:09:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278995; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=V4A9+ThbnxeOTamldoFRxBPwB7/WQbFmJXJUgDasL0c=; b=II2y+WEuRpmyNF09XU0RvehiVZB4qlJYjAIux3AixgyEV4COdtMQ8gnLSp85dTgF+iQz6w tmGA+/b5gLVfl3cobtlaCMY2KEOSVI8GBqd3r8fG34vafdCXwlf9Kot0haOLzJ45jJuJs+ BPcAFCLs7VxII4oStIHQcMuYvNeCAic= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-316-ldaH4sxnN8OIiGvOqC_Jew-1; Fri, 31 Mar 2023 12:09:51 -0400 X-MC-Unique: ldaH4sxnN8OIiGvOqC_Jew-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1C5C6101A550; Fri, 31 Mar 2023 16:09:50 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id DFCB151FF; Fri, 31 Mar 2023 16:09:47 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Fastabend , Jakub Sitnicki , bpf@vger.kernel.org Subject: [PATCH v3 10/55] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg Date: Fri, 31 Mar 2023 17:08:29 +0100 Message-Id: <20230331160914.1608208-11-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761900519701696404?= X-GMAIL-MSGID: =?utf-8?q?1761900519701696404?= do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: John Fastabend cc: Jakub Sitnicki cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org cc: bpf@vger.kernel.org --- net/ipv4/tcp_bpf.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index cf26d65ca389..7f17134637eb 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -72,11 +72,13 @@ static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes, { bool apply = apply_bytes; struct scatterlist *sge; + struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, }; struct page *page; int size, ret = 0; u32 off; while (1) { + struct bio_vec bvec; bool has_tx_ulp; sge = sk_msg_elem(msg, msg->sg.start); @@ -88,16 +90,18 @@ static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes, tcp_rate_check_app_limited(sk); retry: has_tx_ulp = tls_sw_has_ctx_tx(sk); - if (has_tx_ulp) { - flags |= MSG_SENDPAGE_NOPOLICY; - ret = kernel_sendpage_locked(sk, - page, off, size, flags); - } else { - ret = do_tcp_sendpages(sk, page, off, size, flags); - } + if (has_tx_ulp) + msghdr.msg_flags |= MSG_SENDPAGE_NOPOLICY; + if (flags & MSG_SENDPAGE_NOTLAST) + msghdr.msg_flags |= MSG_MORE; + + bvec_set_page(&bvec, page, size, off); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size); + ret = tcp_sendmsg_locked(sk, &msghdr, size); if (ret <= 0) return ret; + if (apply) apply_bytes -= ret; msg->sg.size -= ret; @@ -398,7 +402,7 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) long timeo; int flags; - /* Don't let internal do_tcp_sendpages() flags through */ + /* Don't let internal sendpage flags through */ flags = (msg->msg_flags & ~MSG_SENDPAGE_DECRYPTED); flags |= MSG_NO_SHARED_FRAGS; From patchwork Fri Mar 31 16:08:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77877 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692964vqo; Fri, 31 Mar 2023 09:37:30 -0700 (PDT) X-Google-Smtp-Source: AKy350YVG196G7jVUX0u4zAfQAlRYh89Iqmdby0FHwRo+Yvd6s+P6EAyRuUs96IfWC4c1sLDEhF5 X-Received: by 2002:aa7:c846:0:b0:4fb:8d3c:3b86 with SMTP id g6-20020aa7c846000000b004fb8d3c3b86mr25569860edt.1.1680280649715; Fri, 31 Mar 2023 09:37:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280649; cv=none; d=google.com; s=arc-20160816; b=IwRjG54MCx4dsf8/SDOrgiaN1yZ/OkjJ89zMwG8emSWJY/9BVIDsi1AgPMtHJJF10E L/iLzvxND4wnp6RwOKoKwdcmVEEqqhduvIDABapILZDKSWaHzl++hEBdq5NOs2b+2mct oQZqIXggH45b+nmQ5POoLll8bqombeKXy4F7/Wuq57FPTe2RLC7Kxnll5HJENTe/4I2l ZPzl6k2JaFMValwJZKb7FQ+gDVmjJ81EWJe74jlI2O9AK0OOWESNazUaDUDJtY/zgfaH hoHikT5WU6zqSE8vK9XIku6SltWfNz9jo0nv1WFfQLkoMqKMLKFGDhGK+QIx1ovAD8Hw cJbg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=DjALLcCTAkJ9GFGzQm0mxszTSA2UBcCYteYf8nsJBJs=; b=zIuQjs5jvPmuCSbxsUkeao4fAAFUK29jRRzXpGrKpgi/AqU6bHJTAC9nsYbugpNt4a HROwJfRzlCoYxYXkUxwiAW1FzL0jm8aHvz9ZNUYQ7xaZiiaHSbejrAWdvtVZzwKnhmsJ 82TTAeOaH1zC3sDQnAakqlT47eVNu54zxBgE7KJQgxKYWlbl7WeWrp5xAA8hkT3qr/DS MewCOFeg9QBW1DJdBT+e+Qaj/JS4G+WcwSoHinZOShUCs3fOqpHQOHj/81nskd30rVCC KAaF8WIXxxE2CIIuSGEDtnlI5zWPS/QW5aDOrj4IiBl4rgSoKTqfw/fdnHKbuQhl5b5z yVhQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=ixaZ0zb7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id by27-20020a0564021b1b00b004ac746e3f9asi1540629edb.441.2023.03.31.09.37.06; Fri, 31 Mar 2023 09:37:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=ixaZ0zb7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232921AbjCaQMT (ORCPT + 99 others); Fri, 31 Mar 2023 12:12:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60678 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231215AbjCaQLY (ORCPT ); Fri, 31 Mar 2023 12:11:24 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4F33E21AAB for ; Fri, 31 Mar 2023 09:09:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278997; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DjALLcCTAkJ9GFGzQm0mxszTSA2UBcCYteYf8nsJBJs=; b=ixaZ0zb7R4b26jwcrQaPh7XQZGbmmtvh5YzR+bL5uIjudCutaWBVBrH3p6c2fZXOnu18mS 1nfzaNeiwSY8beuIk/TbfE1tVQtd9MthZvRwl3eJlehr9geZge2SW5UgPWa/pI8UXju1XJ cdbAchrEuhEwXFUg9oAgl5BtLYtsjZo= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-68-mHwARYpIOEOh1orxqULtjA-1; Fri, 31 Mar 2023 12:09:54 -0400 X-MC-Unique: mHwARYpIOEOh1orxqULtjA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3C9393823A13; Fri, 31 Mar 2023 16:09:53 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id B30A12166B33; Fri, 31 Mar 2023 16:09:50 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Steffen Klassert , Herbert Xu Subject: [PATCH v3 11/55] espintcp: Inline do_tcp_sendpages() Date: Fri, 31 Mar 2023 17:08:30 +0100 Message-Id: <20230331160914.1608208-12-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901962237161778?= X-GMAIL-MSGID: =?utf-8?q?1761901962237161778?= do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: Steffen Klassert cc: Herbert Xu cc: Eric Dumazet cc: "David S. Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/xfrm/espintcp.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c index 872b80188e83..3504925babdb 100644 --- a/net/xfrm/espintcp.c +++ b/net/xfrm/espintcp.c @@ -205,14 +205,16 @@ static int espintcp_sendskb_locked(struct sock *sk, struct espintcp_msg *emsg, static int espintcp_sendskmsg_locked(struct sock *sk, struct espintcp_msg *emsg, int flags) { + struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, }; struct sk_msg *skmsg = &emsg->skmsg; struct scatterlist *sg; int done = 0; int ret; - flags |= MSG_SENDPAGE_NOTLAST; + msghdr.msg_flags |= MSG_SENDPAGE_NOTLAST; sg = &skmsg->sg.data[skmsg->sg.start]; do { + struct bio_vec bvec; size_t size = sg->length - emsg->offset; int offset = sg->offset + emsg->offset; struct page *p; @@ -220,11 +222,13 @@ static int espintcp_sendskmsg_locked(struct sock *sk, emsg->offset = 0; if (sg_is_last(sg)) - flags &= ~MSG_SENDPAGE_NOTLAST; + msghdr.msg_flags &= ~MSG_SENDPAGE_NOTLAST; p = sg_page(sg); retry: - ret = do_tcp_sendpages(sk, p, offset, size, flags); + bvec_set_page(&bvec, p, size, offset); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size); + ret = tcp_sendmsg_locked(sk, &msghdr, size); if (ret < 0) { emsg->offset = offset - sg->offset; skmsg->sg.start += done; From patchwork Fri Mar 31 16:08:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77868 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692152vqo; Fri, 31 Mar 2023 09:36:06 -0700 (PDT) X-Google-Smtp-Source: AKy350Yd1MG6k8rvkgrFkRvbejEw9dwcX1irpDIhzH7GkR8i0hTJ3z/PPDfLl9QaMhkzeso2FAXN X-Received: by 2002:a17:906:31d0:b0:8f4:809e:faee with SMTP id f16-20020a17090631d000b008f4809efaeemr9676372ejf.19.1680280566567; Fri, 31 Mar 2023 09:36:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280566; cv=none; d=google.com; s=arc-20160816; b=luWckjN12mJT7FvDIuGChEQ52SR2j9MlwLnPmR+X2OlE75aRTp5H0Z+YnJZllLlxNX sIcHntMRarPjpyctrIZijQUiTuFmA9vtaU0DDtF+a9Iemz5tpLSKgwOEQwUNcnYXBlhO LbCA2WvTWlBumtq4FGeyY3YWaKnuBmGuLrH9KWMKtC2eCgAvZ4W/JzuMdWxezZ4V3KYz IeCY1EdFfovolrYQbj8gOErIzR5XmLnZlRkwGZ85en0pXkz0eG7eWMI8LLXPzmacbicI Q1WQ0GP7ngano2OlvpiF4IByTER/Zu60VOc/51K2DemWXRMguXu8KN8BV1pqj2eVn5mu eEoQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=l3swE5hQADS6B6hDKlip5qVV7zk3Jb5E005A1dmhwhE=; b=sB8i1oAARgMtODZE7YS1pc+fM9Uq2/oysWSUtvYhoYIMJc4e3e+5DkYxaoI8jmoqdx 23HiU7B8dW6mnLAVcYOI8jnANfr0oSFjcN7MmfFB18g6ASCQBqwfSgGOxhXC4UsSuizg ItCcKSuXy0kYRBZVG8ie5FNnp89CvQ7Umst8au9F9vCwazxMU36ie5PTZOESA8WiJFec trw01guh2D8af0oIcTtox8gi4HYj7Xkv/vrMgIR8nqrsuoYP/BpxHIUieA6Ap4VJjbPn zR3MQXJcwwnyyn4Z0rXzbJbKHcHgI1WkgS4FCEzKaDDfY+/iwURvMV6UK0xfXLwEmuyV sf5w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=A5W4CMMA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id a22-20020a170906469600b0093084f268d0si2001079ejr.52.2023.03.31.09.35.42; Fri, 31 Mar 2023 09:36:06 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=A5W4CMMA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232381AbjCaQLw (ORCPT + 99 others); Fri, 31 Mar 2023 12:11:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33600 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232267AbjCaQLS (ORCPT ); Fri, 31 Mar 2023 12:11:18 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1EC2521AAD for ; Fri, 31 Mar 2023 09:09:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680278998; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=l3swE5hQADS6B6hDKlip5qVV7zk3Jb5E005A1dmhwhE=; b=A5W4CMMA7AT2ECkIqP/Tc5MSiamkqNdFve2P14WOoBOy3R7Bzpk7tzjYmyjGQ3kgf9Uj4T MY+mAL7qaKlXXo+ZMlA1vZx0NYHmD0yayLK80aBoK2FzLWvTUZ1IJlZQAJB3gJmTbyN4n5 AKq2kEO/cBpVCDbpFIc/W7j86rR//tg= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-606-0osVrac6O6KfpaQgEswi9w-1; Fri, 31 Mar 2023 12:09:57 -0400 X-MC-Unique: 0osVrac6O6KfpaQgEswi9w-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1A0F71C0A58E; Fri, 31 Mar 2023 16:09:56 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 049E61121314; Fri, 31 Mar 2023 16:09:53 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Boris Pismenny , John Fastabend Subject: [PATCH v3 12/55] tls: Inline do_tcp_sendpages() Date: Fri, 31 Mar 2023 17:08:31 +0100 Message-Id: <20230331160914.1608208-13-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901875381772584?= X-GMAIL-MSGID: =?utf-8?q?1761901875381772584?= do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: Boris Pismenny cc: John Fastabend cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/net/tls.h | 2 +- net/tls/tls_main.c | 24 +++++++++++++++--------- 2 files changed, 16 insertions(+), 10 deletions(-) diff --git a/include/net/tls.h b/include/net/tls.h index 154949c7b0c8..d31521c36a84 100644 --- a/include/net/tls.h +++ b/include/net/tls.h @@ -256,7 +256,7 @@ struct tls_context { struct scatterlist *partially_sent_record; u16 partially_sent_offset; - bool in_tcp_sendpages; + bool splicing_pages; bool pending_open_record_frags; struct mutex tx_lock; /* protects partially_sent_* fields and diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index 3735cb00905d..35b2f7ee2fa3 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -124,7 +124,10 @@ int tls_push_sg(struct sock *sk, u16 first_offset, int flags) { - int sendpage_flags = flags | MSG_SENDPAGE_NOTLAST; + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_SENDPAGE_NOTLAST | MSG_SPLICE_PAGES | flags, + }; int ret = 0; struct page *p; size_t size; @@ -133,16 +136,19 @@ int tls_push_sg(struct sock *sk, size = sg->length - offset; offset += sg->offset; - ctx->in_tcp_sendpages = true; + ctx->splicing_pages = true; while (1) { if (sg_is_last(sg)) - sendpage_flags = flags; + msg.msg_flags = flags; /* is sending application-limited? */ tcp_rate_check_app_limited(sk); p = sg_page(sg); retry: - ret = do_tcp_sendpages(sk, p, offset, size, sendpage_flags); + bvec_set_page(&bvec, p, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + + ret = tcp_sendmsg_locked(sk, &msg, size); if (ret != size) { if (ret > 0) { @@ -154,7 +160,7 @@ int tls_push_sg(struct sock *sk, offset -= sg->offset; ctx->partially_sent_offset = offset; ctx->partially_sent_record = (void *)sg; - ctx->in_tcp_sendpages = false; + ctx->splicing_pages = false; return ret; } @@ -168,7 +174,7 @@ int tls_push_sg(struct sock *sk, size = sg->length; } - ctx->in_tcp_sendpages = false; + ctx->splicing_pages = false; return 0; } @@ -246,11 +252,11 @@ static void tls_write_space(struct sock *sk) { struct tls_context *ctx = tls_get_ctx(sk); - /* If in_tcp_sendpages call lower protocol write space handler + /* If splicing_pages call lower protocol write space handler * to ensure we wake up any waiting operations there. For example - * if do_tcp_sendpages where to call sk_wait_event. + * if splicing pages where to call sk_wait_event. */ - if (ctx->in_tcp_sendpages) { + if (ctx->splicing_pages) { ctx->sk_write_space(sk); return; } From patchwork Fri Mar 31 16:08:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77842 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp684165vqo; Fri, 31 Mar 2023 09:24:06 -0700 (PDT) X-Google-Smtp-Source: AKy350asUuVrQQw/UGy8gpKlbKiddovIVVpuXj8vlBx/ejLkGXYC+ZZU0GNnd5RkXZAPZ0DK/d3X X-Received: by 2002:a17:907:6d10:b0:947:eaed:1f87 with SMTP id sa16-20020a1709076d1000b00947eaed1f87mr61006ejc.19.1680279845679; Fri, 31 Mar 2023 09:24:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279845; cv=none; d=google.com; s=arc-20160816; b=K7NODTUsEIRmrctHYnd2CAysyLgaFeaZsd2arUI0mE8Tg+5oK6DXbe4EawxbMCRwuX uyecqdyFOWOsU6cfhwuC3tvzjalfObN98rtd0ck20NAgDfA3qpGe9Us2ayrpf+40G8q2 GVRSsJN22xthBaE+UfqGr6aEn6V/IFpKO1z2TmYi0gmbbCgnGW8y2Fbcu2DcbxjOGgvM 23S3b1btwlHZ1glv7jY9fwwgFGWKWWRWzANTT+Pqy4az5KiiD1jme+ZXmHPa04JoIh0y zqM3ge6tomxHDqee9yed/n7f6xZ04Q5M2x94x3kDMhmEOWiQ/GOaTzRLnMv0VIN+fLPn oMSA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=pP5V7G7GRqo/82ZWWApO+0CTJP1pdqvbg0H8r7V6l6c=; b=j/yWjvOKt2QscmBdJ6azCovUgkuk8tzK7zZJFACXWehgBX+YQAZFT78kfFtxUDq5B8 5apmT4ShUnoWS9q9v8O6fXEMcccaeM1yzqjOzAcAnHQVePuNVZNtFCj/dwD9cbfobipz b2Y0PTJis9DiGPQQUzN6ggbCIpW7Khm+eU5NKRgmIZKWn9VkRclOprJf/rgDVwqBJw4Z ap/Io2AvuUkk4tGOlLGO7YYemNHnQGVs8aMbvvVM3Rd8iQUzGmPi18c2//Bh1SV6hS27 beQXLC4euw/l4Zn/Bgk6UN3bQ6XltW/fW6jPwGHOM3aAzFk2f1xzk8VXk4Xl6um/36on ecxQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KHClju6t; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e18-20020a1709061e9200b0093161749da7si2087437ejj.369.2023.03.31.09.23.42; Fri, 31 Mar 2023 09:24:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KHClju6t; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232883AbjCaQMW (ORCPT + 99 others); Fri, 31 Mar 2023 12:12:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33624 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232939AbjCaQLW (ORCPT ); Fri, 31 Mar 2023 12:11:22 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 33E7821AB3 for ; Fri, 31 Mar 2023 09:10:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279001; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pP5V7G7GRqo/82ZWWApO+0CTJP1pdqvbg0H8r7V6l6c=; b=KHClju6tbTQysZlDzL46wRvs9iEHTGQ1/MNRh/JhI/Ubq+gVty+RXvTg4si/RiXzZ6Oxkj oocuj4G0tsG6H2wSnLxe5Dsy+3ecWLG8h2v1oFog5qndI/VuIII5vAbREMnxM+i8EMYUSX 4CAN/MVUVakjUIuWYe2Oe29DCXILx7w= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-642-8gYrt4CYNxeFfvIQEREI6A-1; Fri, 31 Mar 2023 12:09:59 -0400 X-MC-Unique: 8gYrt4CYNxeFfvIQEREI6A-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D9D123823A0F; Fri, 31 Mar 2023 16:09:58 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id AFCB8492B00; Fri, 31 Mar 2023 16:09:56 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Bernard Metzler , Tom Talpey , linux-rdma@vger.kernel.org Subject: [PATCH v3 13/55] siw: Inline do_tcp_sendpages() Date: Fri, 31 Mar 2023 17:08:32 +0100 Message-Id: <20230331160914.1608208-14-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901119174076155?= X-GMAIL-MSGID: =?utf-8?q?1761901119174076155?= do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: Bernard Metzler cc: Tom Talpey cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-rdma@vger.kernel.org cc: netdev@vger.kernel.org --- drivers/infiniband/sw/siw/siw_qp_tx.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c index 05052b49107f..fa5de40d85d5 100644 --- a/drivers/infiniband/sw/siw/siw_qp_tx.c +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c @@ -313,7 +313,7 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s, } /* - * 0copy TCP transmit interface: Use do_tcp_sendpages. + * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES. * * Using sendpage to push page by page appears to be less efficient * than using sendmsg, even if data are copied. @@ -324,20 +324,27 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s, static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset, size_t size) { + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = (MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST | + MSG_SPLICE_PAGES), + }; struct sock *sk = s->sk; - int i = 0, rv = 0, sent = 0, - flags = MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST; + int i = 0, rv = 0, sent = 0; while (size) { size_t bytes = min_t(size_t, PAGE_SIZE - offset, size); if (size + offset <= PAGE_SIZE) - flags = MSG_MORE | MSG_DONTWAIT; + msg.msg_flags = MSG_MORE | MSG_DONTWAIT; tcp_rate_check_app_limited(sk); + bvec_set_page(&bvec, page[i], bytes, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + try_page_again: lock_sock(sk); - rv = do_tcp_sendpages(sk, page[i], offset, bytes, flags); + rv = tcp_sendmsg_locked(sk, &msg, size); release_sock(sk); if (rv > 0) { From patchwork Fri Mar 31 16:08:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77847 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp685954vqo; Fri, 31 Mar 2023 09:27:04 -0700 (PDT) X-Google-Smtp-Source: AKy350bSeGO+64saeXNyODV2V4hOtxdDCq5agTzJbUuv9QoPQ3vALcPG1COBt309VRY70hGfV8nz X-Received: by 2002:a50:e604:0:b0:4fa:c3eb:d2c9 with SMTP id y4-20020a50e604000000b004fac3ebd2c9mr26386223edm.12.1680280024650; Fri, 31 Mar 2023 09:27:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280024; cv=none; d=google.com; s=arc-20160816; b=ueE2usGEVe0l34vfbWEi2/FvJKwIJR71JiUmI+E7Ye1vDrJEqmmSRzxPOpkaa6r+Qy ugZdc8d58Asjy/QKUGqVpvw+EocPxSRf2Msu2CL2Buc+u2zx/R0db41fsHhAPZkS48d5 0ATuWwi6TujBKghOWRS59zl93BSfAE24T0JclsK0IQRqDAMadK4vZlETbC3wifAN5ncf ANbTRONycsgpwy6rZ+w5BrmoJpH8sgCSFKnPSHy/7eBzfq934MRuh3nfNEze8cEA75ad 5bDtJe759krar1QCCNNDjk4FHbt/7Fij7lbD/a41WTBzKyIJPlT1xwsy+AviJkczYNMM uwIQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=s/WKKKfYkgZLf01K/tBKWgYukXTjbxBIh6Ytrwu4rIE=; b=ZL7n9FCfC5GDrYhtK+u6TbYm0iOeI9BxnzT+ecz2Y2UKVt8n8Aa9x+JU+PexlqDdwV J5FH1lkRsdwLjklHym9LFWbffVJt8o9uVteL6R3J9mf6yBoqQ6vHxX10p+URsyvPOMPs wIZKZ8PQ5YjLjz612cSBWKNDPlsHFzWaLIJv9w69saKZ/Z18hd9e8SWEzlJKCByhg+5v C18BiHc2OXhgnoJyOGuh46KK8gb3Dy6ah+cCpP+8Gh9cm2hGW54yJS4SpeIKu2W3euux 8I1UdbsZo8sjxQT3hPJbSVK9j3E8JO40KWj+us03evFsPBKXspeSHv8+PXZMN+B4BRDt +i+w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=GP3OQDLE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id rl27-20020a170907217b00b008e457e36ac3si233196ejb.989.2023.03.31.09.26.40; Fri, 31 Mar 2023 09:27:04 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=GP3OQDLE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233202AbjCaQM1 (ORCPT + 99 others); Fri, 31 Mar 2023 12:12:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34452 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232974AbjCaQLi (ORCPT ); Fri, 31 Mar 2023 12:11:38 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7760F22900 for ; Fri, 31 Mar 2023 09:10:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s/WKKKfYkgZLf01K/tBKWgYukXTjbxBIh6Ytrwu4rIE=; b=GP3OQDLEi6FmF/HKx7YW5VcVy40uUpFDQ7+C5HIuwtua4SJztpIJAXADLgB6IFuH1t03Yb NijQezqzBLy/e+M3op0j9ptAa27JFtJ2yQG7FuaOteo5ELOOpY42hmdBWuk/aDcZh05mBn BvZF2xgMWntqvoqJxFjD12K8OsGv0pQ= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-637-LFWHst1IPamgtZLqqI4Kfw-1; Fri, 31 Mar 2023 12:10:02 -0400 X-MC-Unique: LFWHst1IPamgtZLqqI4Kfw-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7D7C2858297; Fri, 31 Mar 2023 16:10:01 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9A3E1C15BA0; Fri, 31 Mar 2023 16:09:59 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 14/55] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() Date: Fri, 31 Mar 2023 17:08:33 +0100 Message-Id: <20230331160914.1608208-15-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901307319735730?= X-GMAIL-MSGID: =?utf-8?q?1761901307319735730?= Fold do_tcp_sendpages() into its last remaining caller, tcp_sendpage_locked(). Signed-off-by: David Howells cc: Eric Dumazet cc: "David S. Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/net/tcp.h | 2 -- net/ipv4/tcp.c | 21 +++++++-------------- 2 files changed, 7 insertions(+), 16 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index db9f828e9d1e..844bc8e6a714 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -333,8 +333,6 @@ int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags); int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, size_t size, int flags); -ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, - size_t size, int flags); int tcp_send_mss(struct sock *sk, int *size_goal, int flags); void tcp_push(struct sock *sk, int flags, int mss_now, int nonagle, int size_goal); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index edcf3a60c1b0..a8f8ccaed10e 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -971,12 +971,17 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } -ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, - size_t size, int flags) +int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, + size_t size, int flags) { struct bio_vec bvec; struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; + if (!(sk->sk_route_caps & NETIF_F_SG)) + return sock_no_sendpage_locked(sk, page, offset, size, flags); + + tcp_rate_check_app_limited(sk); /* is sending application-limited? */ + bvec_set_page(&bvec, page, size, offset); iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); @@ -985,18 +990,6 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, return tcp_sendmsg_locked(sk, &msg, size); } -EXPORT_SYMBOL_GPL(do_tcp_sendpages); - -int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - if (!(sk->sk_route_caps & NETIF_F_SG)) - return sock_no_sendpage_locked(sk, page, offset, size, flags); - - tcp_rate_check_app_limited(sk); /* is sending application-limited? */ - - return do_tcp_sendpages(sk, page, offset, size, flags); -} EXPORT_SYMBOL_GPL(tcp_sendpage_locked); int tcp_sendpage(struct sock *sk, struct page *page, int offset, From patchwork Fri Mar 31 16:08:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77869 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692148vqo; Fri, 31 Mar 2023 09:36:06 -0700 (PDT) X-Google-Smtp-Source: AKy350az4HeoBI1ZCc5rv4ZMKRK7TQl3rl3UCHy7e23pSqk/NIAr0OEpAyg5PdrNSTgApLLDhYUA X-Received: by 2002:a17:90a:b88e:b0:23e:cea5:d37e with SMTP id o14-20020a17090ab88e00b0023ecea5d37emr29137678pjr.46.1680280566089; Fri, 31 Mar 2023 09:36:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280566; cv=none; d=google.com; s=arc-20160816; b=ePVVhleTQAwyvwAmqOtUJuPd4sN7nVGoe8KipyEjq8uxhPlNtzyZj4dzOocwqW/Yc1 0RfVACFlDzOdFzGu9oymOMGnRGN59FovZv5Ice6H4xHkU6AI0HcwBu1l40g7xOgX2PV9 AUA2MTUbhn9HcMeRBflAk0zs67lnrP7EGgLyjmN3YjRh7YUOC9Vdkjz9QbghSKtVlZLi gzJu0rRV7RZpfDfsRQhUxsx63mbd4fKC6k8jQRzWGhpkd0aeKJh4ZlNX/ojaWENHZ/sO 6FlWmYlUSxVLOFD+ZGZiZvAo8UbvIo5ifLEsHFvGZT/OVsdmomm27HmHyn4F1pwyvyeR UN+A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=zS5R68pySMwmjEGC4becLd7BE+l3zHJhW5dAqXR21oE=; b=fYVMyHAl3yZDou5OQYXlCASz23fgJfa8uDkCOeliKeQBWQfvduiYa2VpmdfMv18+pC ocEUE0KjxwiUlWbcv6glu8LnRd0Rt19s9dyAvvuHDKAzzhfxc9CgnzdamwUFhUDEGoex glOZ4cbdhF7ZBlc1baAr3+o5JNDgXvHAv4vrssPieWbeEk6j/S0dSSow/qbOiMPXdghE fdcKI0n+m45XTV6MJKSdGfRWQxYAfgZyGF5BVi9TMwUjwGKQDUh7NEGfmWGSNbEKz1J1 pVjstYgWHm68F/rUg6taCYL0R60dA9rBiNcXiBsUEIXQs6hyfq4KulSTast9WyyqEW6D uvyg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TPSky1Nd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z19-20020a63e113000000b00513234112b3si2743380pgh.895.2023.03.31.09.35.53; Fri, 31 Mar 2023 09:36:06 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TPSky1Nd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233222AbjCaQMb (ORCPT + 99 others); Fri, 31 Mar 2023 12:12:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60780 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231476AbjCaQLl (ORCPT ); Fri, 31 Mar 2023 12:11:41 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 75F5122902 for ; Fri, 31 Mar 2023 09:10:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zS5R68pySMwmjEGC4becLd7BE+l3zHJhW5dAqXR21oE=; b=TPSky1NdyAiq7lS+vmlZvEFBH285b4EWlMouoU9hA+x32lJH62QP7uptDHrK2zHTS3ObKc Z7DvnaQkZyNSIe0jVjLducHq5fofun7eaTbVIvv/ejHgc1XaBqLWQRgpEHTTKn4SllZWNU 6ZoC3dvgdS4mam1kjlQHEAs9GRzGF1E= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-45-_6_3ZMiSNHmxuLq7kZsEhw-1; Fri, 31 Mar 2023 12:10:05 -0400 X-MC-Unique: _6_3ZMiSNHmxuLq7kZsEhw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1890F3C0D1AE; Fri, 31 Mar 2023 16:10:04 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1F5EC2166B33; Fri, 31 Mar 2023 16:10:02 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Willem de Bruijn Subject: [PATCH v3 15/55] ip, udp: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:34 +0100 Message-Id: <20230331160914.1608208-16-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901874891661005?= X-GMAIL-MSGID: =?utf-8?q?1761901874891661005?= Make IP/UDP sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/ip_output.c | 102 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 99 insertions(+), 3 deletions(-) diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 4e4e308c3230..e2eaba817c1f 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -956,6 +956,79 @@ csum_page(struct page *page, int offset, int copy) return csum; } +/* + * Allocate a packet for MSG_SPLICE_PAGES. + */ +static int __ip_splice_alloc(struct sock *sk, struct sk_buff **pskb, + unsigned int fragheaderlen, unsigned int maxfraglen, + unsigned int hh_len) +{ + struct sk_buff *skb_prev = *pskb, *skb; + unsigned int fraggap = skb_prev->len - maxfraglen; + unsigned int alloclen = fragheaderlen + hh_len + fraggap + 15; + + skb = sock_wmalloc(sk, alloclen, 1, sk->sk_allocation); + if (unlikely(!skb)) + return -ENOBUFS; + + /* Fill in the control structures */ + skb->ip_summed = CHECKSUM_NONE; + skb->csum = 0; + skb_reserve(skb, hh_len); + + /* Find where to start putting bytes. */ + skb_put(skb, fragheaderlen + fraggap); + skb_reset_network_header(skb); + skb->transport_header = skb->network_header + fragheaderlen; + if (fraggap) { + skb->csum = skb_copy_and_csum_bits(skb_prev, maxfraglen, + skb_transport_header(skb), + fraggap); + skb_prev->csum = csum_sub(skb_prev->csum, skb->csum); + pskb_trim_unique(skb_prev, maxfraglen); + } + + /* Put the packet on the pending queue. */ + __skb_queue_tail(&sk->sk_write_queue, skb); + *pskb = skb; + return 0; +} + +/* + * Add (or copy) data pages for MSG_SPLICE_PAGES. + */ +static int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, + void *from, int *pcopy) +{ + struct msghdr *msg = from; + struct page *page = NULL, **pages = &page; + ssize_t copy = *pcopy; + size_t off; + int err; + + copy = iov_iter_extract_pages(&msg->msg_iter, &pages, copy, 1, 0, &off); + if (copy <= 0) + return copy ?: -EIO; + + err = skb_append_pagefrags(skb, page, off, copy); + if (err < 0) { + iov_iter_revert(&msg->msg_iter, copy); + return err; + } + + if (skb->ip_summed == CHECKSUM_NONE) { + __wsum csum; + + csum = csum_page(page, off, copy); + skb->csum = csum_block_add(skb->csum, csum, skb->len); + } + + skb_len_add(skb, copy); + refcount_add(copy, &sk->sk_wmem_alloc); + *pcopy = copy; + return 0; +} + static int __ip_append_data(struct sock *sk, struct flowi4 *fl4, struct sk_buff_head *queue, @@ -977,7 +1050,7 @@ static int __ip_append_data(struct sock *sk, int err; int offset = 0; bool zc = false; - unsigned int maxfraglen, fragheaderlen, maxnonfragsize; + unsigned int maxfraglen, fragheaderlen, maxnonfragsize, initial_length; int csummode = CHECKSUM_NONE; struct rtable *rt = (struct rtable *)cork->dst; unsigned int wmem_alloc_delta = 0; @@ -1017,6 +1090,7 @@ static int __ip_append_data(struct sock *sk, (!exthdrlen || (rt->dst.dev->features & NETIF_F_HW_ESP_TX_CSUM))) csummode = CHECKSUM_PARTIAL; + initial_length = length; if ((flags & MSG_ZEROCOPY) && length) { struct msghdr *msg = from; @@ -1047,6 +1121,14 @@ static int __ip_append_data(struct sock *sk, skb_zcopy_set(skb, uarg, &extra_uref); } } + } else if ((flags & MSG_SPLICE_PAGES) && length) { + if (inet->hdrincl) + return -EPERM; + if (rt->dst.dev->features & NETIF_F_SG) + /* We need an empty buffer to attach stuff to */ + initial_length = transhdrlen; + else + flags &= ~MSG_SPLICE_PAGES; } cork->length += length; @@ -1074,6 +1156,16 @@ static int __ip_append_data(struct sock *sk, unsigned int alloclen, alloc_extra; unsigned int pagedlen; struct sk_buff *skb_prev; + + if (unlikely(flags & MSG_SPLICE_PAGES)) { + err = __ip_splice_alloc(sk, &skb, fragheaderlen, + maxfraglen, hh_len); + if (err < 0) + goto error; + continue; + } + initial_length = length; + alloc_new_skb: skb_prev = skb; if (skb_prev) @@ -1085,7 +1177,7 @@ static int __ip_append_data(struct sock *sk, * If remaining data exceeds the mtu, * we know we need more fragment(s). */ - datalen = length + fraggap; + datalen = initial_length + fraggap; if (datalen > mtu - fragheaderlen) datalen = maxfraglen - fragheaderlen; fraglen = datalen + fragheaderlen; @@ -1099,7 +1191,7 @@ static int __ip_append_data(struct sock *sk, * because we have no idea what fragment will be * the last. */ - if (datalen == length + fraggap) + if (datalen == initial_length + fraggap) alloc_extra += rt->dst.trailer_len; if ((flags & MSG_MORE) && @@ -1206,6 +1298,10 @@ static int __ip_append_data(struct sock *sk, err = -EFAULT; goto error; } + } else if (flags & MSG_SPLICE_PAGES) { + err = __ip_splice_pages(sk, skb, from, ©); + if (err < 0) + goto error; } else if (!zc) { int i = skb_shinfo(skb)->nr_frags; From patchwork Fri Mar 31 16:08:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77859 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp690869vqo; Fri, 31 Mar 2023 09:34:08 -0700 (PDT) X-Google-Smtp-Source: AKy350Y6v5JGnciassZeI1JSpgDCYUjBVrnu7f5aYcEG29J+urpKLl2MkmLYU4TFFsEC/R0vdICI X-Received: by 2002:a17:90b:4b12:b0:23f:635e:51e5 with SMTP id lx18-20020a17090b4b1200b0023f635e51e5mr29592405pjb.36.1680280448240; Fri, 31 Mar 2023 09:34:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280448; cv=none; d=google.com; s=arc-20160816; b=YG/bOmhDcRXiC7QjAuwqf8b8aPM1T1mE+I9jAPcnSpvvVnaHgh6BWckM6aR/QiEFZq Xi13M8WfGOxC6MfWvUGeYO/mU8u/8hLkLt9cD4/Q0X+1mPV2vbvMtkXlQgncko/IGHt6 DQDR7FG2dHaWaTGpm884+P1gkwBznCXkp0Ic2CPyocLZgvRrqIDMU+MNvGdVlZVqcvt5 DQV6qSZ+0VQJGbCKY9OIRr0YA9Ik4mLX4xwRiyGlRWAZXZmiPKS4gZ/FkjmCoil7+aM9 2BehGrzjn5KTlOxAzeUwc7FDgMT2XYuP0T2z+yJ3nkPB1T2zeaw8FKjfpnfSu/KNTWd+ Pe2Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=t1cUbUkat/Gf1dtGRXqhXt4i30LjR64ymk07qbS2yB8=; b=p5HQiqKS6q1QptQQP5ZUGAG/khYjOPouADv8Q+4jIHjjx7MJFtfV2A5Ax1ExWb5oQU G/voE6j1f9zTb0t0BMRxrGPvrXgihKIBXVaEZwMPyeSbJ6+TM62LY1AKtZmcmXALF9tP qyG20gEWEbNVf/f1mmMlnt9s0sZXc4iLq7Gf3YQ5NTZim//+jy2R6vEj/MyVIe6T98d4 JKRbalgrcEmv53BqK8hhLDc1WTHSdEKa4Sa0VBiAlEcVLlyZ4xQjSO4WZkX160gJWUhP or4KXLA7GMYpNHo4TWFjdFkYQapNgp27ZK6eVl5nA+mAc5m1L4xVvGD19ieqBIXg2sNx x4mQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=VT5gix0X; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id lt15-20020a17090b354f00b0023676bd55d1si6834361pjb.94.2023.03.31.09.33.56; Fri, 31 Mar 2023 09:34:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=VT5gix0X; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232519AbjCaQNH (ORCPT + 99 others); Fri, 31 Mar 2023 12:13:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33558 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233042AbjCaQLy (ORCPT ); Fri, 31 Mar 2023 12:11:54 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 47E232290F for ; Fri, 31 Mar 2023 09:10:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279015; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=t1cUbUkat/Gf1dtGRXqhXt4i30LjR64ymk07qbS2yB8=; b=VT5gix0XvxmWH561LSpfZhNexzGqOlu1dikfz30/zUf7bNMu8AjRIMaphmWKrJxmGsFui6 DOfvi+TqWAhdSn5kFBGwpFb/LnUDbX2ISyk8D6qmIakcrn+Lq86ETtMAR+9mEvPrZMokL4 gWgOKs7lssYgijG38w5JcI3kPhpYMHI= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-488-2g961z0wNy-7npZvx0P1sg-1; Fri, 31 Mar 2023 12:10:10 -0400 X-MC-Unique: 2g961z0wNy-7npZvx0P1sg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C8816185A78B; Fri, 31 Mar 2023 16:10:06 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id CD6FD1121314; Fri, 31 Mar 2023 16:10:04 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Willem de Bruijn Subject: [PATCH v3 16/55] ip, udp: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data Date: Fri, 31 Mar 2023 17:08:35 +0100 Message-Id: <20230331160914.1608208-17-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901751071521808?= X-GMAIL-MSGID: =?utf-8?q?1761901751071521808?= If sendmsg() with MSG_SPLICE_PAGES encounters a page that shouldn't be spliced - a slab page, for instance, or one with a zero count - make __ip_append_data() copy it. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/ip_output.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index e2eaba817c1f..41a954ac9e1a 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1004,13 +1004,32 @@ static int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, struct page *page = NULL, **pages = &page; ssize_t copy = *pcopy; size_t off; + bool put = false; int err; copy = iov_iter_extract_pages(&msg->msg_iter, &pages, copy, 1, 0, &off); if (copy <= 0) return copy ?: -EIO; + if (!sendpage_ok(page)) { + const void *p = kmap_local_page(page); + void *q; + + q = page_frag_memdup(NULL, p + off, copy, + sk->sk_allocation, ULONG_MAX); + kunmap_local(p); + if (!q) { + iov_iter_revert(&msg->msg_iter, copy); + return -ENOMEM; + } + page = virt_to_page(q); + off = offset_in_page(q); + put = true; + } + err = skb_append_pagefrags(skb, page, off, copy); + if (put) + put_page(page); if (err < 0) { iov_iter_revert(&msg->msg_iter, copy); return err; From patchwork Fri Mar 31 16:08:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77857 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp690753vqo; Fri, 31 Mar 2023 09:33:54 -0700 (PDT) X-Google-Smtp-Source: AKy350bylmrpbWYCEwZq2SHTwNP8vWxHCRTR5e3jqTEAlJHEbWpW/vG8EXBrcasNg2BDOLjIZ3O1 X-Received: by 2002:aa7:981c:0:b0:625:e3c0:8a58 with SMTP id e28-20020aa7981c000000b00625e3c08a58mr29536549pfl.4.1680280433808; Fri, 31 Mar 2023 09:33:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280433; cv=none; d=google.com; s=arc-20160816; b=oRv2ouDsm1ci6wK5STaCaU6xBYeV+R9aJsvLsxMYZXe6BEzp7+EIV/6q6BvSDKJqlP jcPqgD2yEFXYjW0/LBNX/yndcVn74DJEc712L/pberdbWi/y+5A232V+y2QerugaAvHS bZELmkvKKEVBsPZxXLAx7OIje/jocPOAn6rzsC+Di6yYNZFaBHtYe6Z8VFLz4hTtJpmD pYuuHL0gkf1r4bc4OQNqquiEKEl8ncLE8F5Za4hAr+PNuEaR1bbmFeLS+NtRGNNHSRMu otc7tsU65amnvcizoPsJTZ3NFdytLBuu6Ec6YAhb1+5lbQSEISJvKtPemvmylUateLgY 0kAQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=3uNZWqu+MQKIO3OTB8U4ZJVMqWS1JR9mPIUS8nXZLn4=; b=E+O+lXV+HNHwHB3nrCHQc7X2ZII8c31nfJVwySL63oWBLFn1z9NelihpYEpPwNJtgz 0beBI64RlouiM3J/1bGYhMppeH5oMw3/zfmSYkqMrINp6DRyU472nuR7RzPqL4jKWOdY cQpckVVBkWWCOksTIy0aAzkrCh/rjJ/ImazuLn8wWOsSW1jKFomdmrqBKzH5F3Trsgnr D6c8WSS+DHvIBlcO1Iu/S+lrEVFHO0yJcmtayPIfLYqyl7rRoDEScGFgibZ8EHsOwR1P Cz5My3T3OGqaZwVn/Y5szb5p4uZpUXid3ZeP7qawbSOIXJEFYzpdioHMKwSjEva4D5Jo pEcw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=U+7Jfq5Z; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id o6-20020a63fb06000000b0050aea0375afsi2631173pgh.765.2023.03.31.09.33.41; Fri, 31 Mar 2023 09:33:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=U+7Jfq5Z; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233262AbjCaQM5 (ORCPT + 99 others); Fri, 31 Mar 2023 12:12:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34940 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233033AbjCaQLy (ORCPT ); Fri, 31 Mar 2023 12:11:54 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B93452290C for ; Fri, 31 Mar 2023 09:10:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279013; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3uNZWqu+MQKIO3OTB8U4ZJVMqWS1JR9mPIUS8nXZLn4=; b=U+7Jfq5ZuKrCpPxiJXNUrOqO1DlQxtLQ/IPbl9ePSHNc98ifd7EU2W2w1PPkHHlr8vl5gy G0FsrfA/SE/kjcsTtSQ3YjKlfUhHlcI+kQta3L8S55GKqmej1E2BbJQ/h4i9sK7OmACWpq +d775q+uLQlUEYCXG/SL6f4bWsiQTeY= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-635-jQiJhrbpN_2494oFduAemA-1; Fri, 31 Mar 2023 12:10:10 -0400 X-MC-Unique: jQiJhrbpN_2494oFduAemA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 66A7829ABA23; Fri, 31 Mar 2023 16:10:09 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6B0F74020C82; Fri, 31 Mar 2023 16:10:07 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Willem de Bruijn Subject: [PATCH v3 17/55] ip6, udp6: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:36 +0100 Message-Id: <20230331160914.1608208-18-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901736295674653?= X-GMAIL-MSGID: =?utf-8?q?1761901736295674653?= Make IP6/UDP6 sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible, copying the data if not. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/net/ip.h | 4 ++++ net/ipv4/ip_output.c | 11 ++++++----- net/ipv6/ip6_output.c | 28 +++++++++++++++++++++++++--- 3 files changed, 35 insertions(+), 8 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index c3fffaa92d6e..e27d2ceffcfa 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -211,6 +211,10 @@ int ip_local_out(struct net *net, struct sock *sk, struct sk_buff *skb); int __ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl, __u8 tos); void ip_init(void); +int __ip_splice_alloc(struct sock *sk, struct sk_buff **pskb, + unsigned int fragheaderlen, unsigned int maxfraglen, + unsigned int hh_len); +int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, void *from, int *pcopy); int ip_append_data(struct sock *sk, struct flowi4 *fl4, int getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb), diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 41a954ac9e1a..fa2546d944bc 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -959,9 +959,9 @@ csum_page(struct page *page, int offset, int copy) /* * Allocate a packet for MSG_SPLICE_PAGES. */ -static int __ip_splice_alloc(struct sock *sk, struct sk_buff **pskb, - unsigned int fragheaderlen, unsigned int maxfraglen, - unsigned int hh_len) +int __ip_splice_alloc(struct sock *sk, struct sk_buff **pskb, + unsigned int fragheaderlen, unsigned int maxfraglen, + unsigned int hh_len) { struct sk_buff *skb_prev = *pskb, *skb; unsigned int fraggap = skb_prev->len - maxfraglen; @@ -993,12 +993,12 @@ static int __ip_splice_alloc(struct sock *sk, struct sk_buff **pskb, *pskb = skb; return 0; } +EXPORT_SYMBOL_GPL(__ip_splice_alloc); /* * Add (or copy) data pages for MSG_SPLICE_PAGES. */ -static int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, - void *from, int *pcopy) +int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, void *from, int *pcopy) { struct msghdr *msg = from; struct page *page = NULL, **pages = &page; @@ -1047,6 +1047,7 @@ static int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, *pcopy = copy; return 0; } +EXPORT_SYMBOL_GPL(__ip_splice_pages); static int __ip_append_data(struct sock *sk, struct flowi4 *fl4, diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index c314fdde0097..c95d034cb45a 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1486,7 +1486,7 @@ static int __ip6_append_data(struct sock *sk, struct rt6_info *rt = (struct rt6_info *)cork->dst; struct ipv6_txoptions *opt = v6_cork->opt; int csummode = CHECKSUM_NONE; - unsigned int maxnonfragsize, headersize; + unsigned int maxnonfragsize, headersize, initial_length; unsigned int wmem_alloc_delta = 0; bool paged, extra_uref = false; @@ -1559,6 +1559,7 @@ static int __ip6_append_data(struct sock *sk, rt->dst.dev->features & (NETIF_F_IPV6_CSUM | NETIF_F_HW_CSUM)) csummode = CHECKSUM_PARTIAL; + initial_length = length; if ((flags & MSG_ZEROCOPY) && length) { struct msghdr *msg = from; @@ -1589,6 +1590,14 @@ static int __ip6_append_data(struct sock *sk, skb_zcopy_set(skb, uarg, &extra_uref); } } + } else if ((flags & MSG_SPLICE_PAGES) && length) { + if (inet_sk(sk)->hdrincl) + return -EPERM; + if (rt->dst.dev->features & NETIF_F_SG) + /* We need an empty buffer to attach stuff to */ + initial_length = transhdrlen; + else + flags &= ~MSG_SPLICE_PAGES; } /* @@ -1624,6 +1633,15 @@ static int __ip6_append_data(struct sock *sk, unsigned int fraggap; unsigned int alloclen, alloc_extra; unsigned int pagedlen; + + if (unlikely(flags & MSG_SPLICE_PAGES)) { + err = __ip_splice_alloc(sk, &skb, fragheaderlen, + maxfraglen, hh_len); + if (err < 0) + goto error; + continue; + } + initial_length = length; alloc_new_skb: /* There's no room in the current skb */ if (skb) @@ -1642,7 +1660,7 @@ static int __ip6_append_data(struct sock *sk, * If remaining data exceeds the mtu, * we know we need more fragment(s). */ - datalen = length + fraggap; + datalen = initial_length + fraggap; if (datalen > (cork->length <= mtu && !(cork->flags & IPCORK_ALLFRAG) ? mtu : maxfraglen) - fragheaderlen) datalen = maxfraglen - fragheaderlen - rt->dst.trailer_len; @@ -1672,7 +1690,7 @@ static int __ip6_append_data(struct sock *sk, } alloclen += alloc_extra; - if (datalen != length + fraggap) { + if (datalen != initial_length + fraggap) { /* * this is not the last fragment, the trailer * space is regarded as data space. @@ -1778,6 +1796,10 @@ static int __ip6_append_data(struct sock *sk, err = -EFAULT; goto error; } + } else if (flags & MSG_SPLICE_PAGES) { + err = __ip_splice_pages(sk, skb, from, ©); + if (err < 0) + goto error; } else if (!zc) { int i = skb_shinfo(skb)->nr_frags; From patchwork Fri Mar 31 16:08:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77839 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp680229vqo; Fri, 31 Mar 2023 09:17:53 -0700 (PDT) X-Google-Smtp-Source: AKy350ZFrk0ykPpqFSnA9SNhlJDhvj8KnBjJQziaLOYa+l/254AdYKyknjoieROMNp6EZ55nA8V1 X-Received: by 2002:a17:907:a0cf:b0:947:792d:bf7e with SMTP id hw15-20020a170907a0cf00b00947792dbf7emr5354086ejc.58.1680279473578; Fri, 31 Mar 2023 09:17:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279473; cv=none; d=google.com; s=arc-20160816; b=dVwbzYBhxdc1Rw3MZE2/mj1DcGprYB+BNfz4aH/h06l+2SBXegwFcCoBNxFk4dtvCR +xwq5lHVxxC8YQ7k1XTfWNUhaeDajhenj2VqGgBBzRfCAKaTmn7ynArTr/Wtf74hOj/q R6Q2QVEF+gcU8iQx/gb2bF5rDh6jdOvUisaUB2U0u5NlSsSSTKWK9nKHNiIDf1KFORli PIbO3jl/w2EH1NLczZ9/G/hRGLzRX3lIYGL9Z/aMnSqjreOG7qNGBF1ql2HBhN9Az8Jn AGbVAbwmlMUHLZkiDeuiGxX299YX9KTX9eX3maWatcRHK8yZQAFHQ5YgKIAa8E7lqc1k jb9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=5bgiBObt8qBauZcUWYsV1T0v3n3OqeoexntQtIJIREA=; b=zXAWOz1CTL8AE63eRj9ezxAwoEVwMRyQb96//zvCFaTmhzOVhPD6foKKCiFYgqfE41 h8qVBf4WNhRElP7NUkrLd43x66XcrMeNiaT9a75LZ+nRpueZXndU4ddViTmZrzso8O41 FTLORNem+9bG+e/cKthWL1oab2s9GkwE4nwC7Ijb6cS937QlPwkLT7dcu3CRqoC6YTNE plguSe18qlqrfeppJOXqMGu3mTWu/DmE0Nh9xREm9mjKhHXc9ozvLcOX1cj6TrLkzD1g j+G5hW86OKTLb8c9FQqKBLjLv4s8GWdBiA7Cqop0YKAK7DYDWVLwHD3KHO+c/w8KPRyK LgNw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=K1cA3h3w; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l24-20020a1709066b9800b009230fbd1babsi2097500ejr.892.2023.03.31.09.17.29; Fri, 31 Mar 2023 09:17:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=K1cA3h3w; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233335AbjCaQNY (ORCPT + 99 others); Fri, 31 Mar 2023 12:13:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35324 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233079AbjCaQMH (ORCPT ); Fri, 31 Mar 2023 12:12:07 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6225F22910 for ; Fri, 31 Mar 2023 09:10:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279025; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5bgiBObt8qBauZcUWYsV1T0v3n3OqeoexntQtIJIREA=; b=K1cA3h3wwHV95gaH9veWPOQc9gl3PayLJnV4p7n6zGKQrbu7Lxo4Ywu45Ju9WXJELMntBY xsZVFqXQGnFoZ9W9xFHWUY84b01Be5BLXK4yg+AbZh/wleb67tEQlu0LW4RZ5ZbKgwzuRo IhlGT/UhUxcjA6vFZX7aEAW95SNO6rc= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-404-Y_P62SwVN4u6m6gIEWawdQ-1; Fri, 31 Mar 2023 12:10:15 -0400 X-MC-Unique: Y_P62SwVN4u6m6gIEWawdQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0ADB829ABA1E; Fri, 31 Mar 2023 16:10:12 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0FB4014171B6; Fri, 31 Mar 2023 16:10:09 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Willem de Bruijn Subject: [PATCH v3 18/55] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:37 +0100 Message-Id: <20230331160914.1608208-19-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761900729039889911?= X-GMAIL-MSGID: =?utf-8?q?1761900729039889911?= Convert udp_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/udp.c | 50 +++++++++----------------------------------------- 1 file changed, 9 insertions(+), 41 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index c605d171eb2d..097feb92e215 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1332,52 +1332,20 @@ EXPORT_SYMBOL(udp_sendmsg); int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct inet_sock *inet = inet_sk(sk); - struct udp_sock *up = udp_sk(sk); + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE + }; int ret; - if (flags & MSG_SENDPAGE_NOTLAST) - flags |= MSG_MORE; + bvec_set_page(&bvec, page, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - if (!up->pending) { - struct msghdr msg = { .msg_flags = flags|MSG_MORE }; - - /* Call udp_sendmsg to specify destination address which - * sendpage interface can't pass. - * This will succeed only when the socket is connected. - */ - ret = udp_sendmsg(sk, &msg, 0); - if (ret < 0) - return ret; - } + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; lock_sock(sk); - - if (unlikely(!up->pending)) { - release_sock(sk); - - net_dbg_ratelimited("cork failed\n"); - return -EINVAL; - } - - ret = ip_append_page(sk, &inet->cork.fl.u.ip4, - page, offset, size, flags); - if (ret == -EOPNOTSUPP) { - release_sock(sk); - return sock_no_sendpage(sk->sk_socket, page, offset, - size, flags); - } - if (ret < 0) { - udp_flush_pending_frames(sk); - goto out; - } - - up->len += size; - if (!(READ_ONCE(up->corkflag) || (flags&MSG_MORE))) - ret = udp_push_pending_frames(sk); - if (!ret) - ret = size; -out: + ret = udp_sendmsg(sk, &msg, size); release_sock(sk); return ret; } From patchwork Fri Mar 31 16:08:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77886 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp694467vqo; Fri, 31 Mar 2023 09:40:08 -0700 (PDT) X-Google-Smtp-Source: AKy350bVv5TqJqHYDroKaC7dBiL37sYcrbzf36pRASbP3Dy0VprG0G7L/Xr7HEWGxevcAzJVDAqq X-Received: by 2002:a17:906:edc7:b0:921:54da:831c with SMTP id sb7-20020a170906edc700b0092154da831cmr10875184ejb.31.1680280808695; Fri, 31 Mar 2023 09:40:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280808; cv=none; d=google.com; s=arc-20160816; b=fZlvNoFG68bCY2mbljG26mOS9kKnV5haIDYYQkaqWm3FBLochseoNBBeKfCvVp7mKP kyVxK/8JEQvm03t9/QeiPNDs6OkM9NonptjDtEXwqJGsg5zF1lcJGUlPktSObHBqNjZM JVtw2y0N+ejARvjXtRybhN1wG8h5jfjOndRnCNfx5b2a97Y1+uXb4uivCR46Gy9OWE6F VbPsRd6Q2KNUy/td/re296oauhk8KZ1OoIOtgkb63Xo/+8a7GwGgdQRHG31q9kcT3wij oe7U9hTV88/eAPh1i+hNGwwu24XFVDAhDiM51LGnkeop2hN5Hrv4J/YUOuQ1t0hQIfKF d7yw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=4OGPYhZN0j7iZw/KVrz34Q1/SXztHasPFDnntHCFNPo=; b=D7Y1j/OXuzCxJMNNsa6b23Bh1dilIk1ZGfRDJ8fwLrTnReu3ROuxwmyGcG9bdwBnCa RKUDjsitTG4UL8/jWKm9NTc0I3clzkbz4YkAkloLhnJNcpon7+r1rFaeb5TVwDjbS+k/ pq9QAxy2Jrh9kK2JXa5gQDbpvkWF3QsaEbu1ADfUvEPL8afSw4vhBvpWr/jx79kc+YcB eStfxP6SNNILPnhLPWFl1WSMm2zQB+RnlSmzBU80yY4+ulYD3g/HijVsY8DGYJbr9VdH d4L04jbPl5ICskUAl7mdpAjWFNUUVGs7BdZd2rS8GmEkq1GKj4+Zsg0xfxf+6tOeS90s /kMw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=AOzUNC31; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id jx12-20020a170906ca4c00b0093dd2150283si693505ejb.715.2023.03.31.09.39.44; Fri, 31 Mar 2023 09:40:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=AOzUNC31; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233308AbjCaQNO (ORCPT + 99 others); Fri, 31 Mar 2023 12:13:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35226 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232611AbjCaQMD (ORCPT ); Fri, 31 Mar 2023 12:12:03 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B25522930 for ; Fri, 31 Mar 2023 09:10:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279023; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4OGPYhZN0j7iZw/KVrz34Q1/SXztHasPFDnntHCFNPo=; b=AOzUNC31Ku18ZgDpcN+6TtMgNDflZ2Y+MfH4wjlk9JR5uqnsvApPVJIqUmDoYD1pFTUB4A C96joJPs7IqyfeIqPyEFp+A75sLlRSUoitxPBMzQW6mF7d/d1c8ZQR4X9UmJSWGt5U8302 nhce1flDRkxfMqT03cbcJk8BArHFZJY= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-79-meTwAxIdNDKMLFAudYCPiQ-1; Fri, 31 Mar 2023 12:10:18 -0400 X-MC-Unique: meTwAxIdNDKMLFAudYCPiQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 85825886067; Fri, 31 Mar 2023 16:10:14 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id A0AA7202701F; Fri, 31 Mar 2023 16:10:12 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 19/55] af_unix: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:38 +0100 Message-Id: <20230331160914.1608208-20-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761902129165950770?= X-GMAIL-MSGID: =?utf-8?q?1761902129165950770?= Make AF_UNIX sendmsg() support MSG_SPLICE_PAGES, splicing in pages from the source iterator if possible and copying the data in otherwise. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/unix/af_unix.c | 93 ++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 77 insertions(+), 16 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 347122c3575e..a9ad97f3c57f 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2151,6 +2151,53 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other } #endif +/* + * Extract pages from an iterator and add them to the socket buffer. + */ +static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb, + struct iov_iter *iter, ssize_t maxsize) +{ + struct page *pages[8], **ppages = pages; + unsigned int i, nr; + ssize_t ret = 0; + + while (iter->count > 0) { + size_t off, len; + + nr = min_t(size_t, MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags, + ARRAY_SIZE(pages)); + if (nr == 0) + break; + + len = iov_iter_extract_pages(iter, &ppages, maxsize, nr, 0, &off); + if (len <= 0) { + if (!ret) + ret = len ?: -EIO; + break; + } + + i = 0; + do { + size_t part = min_t(size_t, PAGE_SIZE - off, len); + + if (skb_append_pagefrags(skb, pages[i++], off, part) < 0) { + if (!ret) + ret = -EMSGSIZE; + goto out; + } + off = 0; + ret += part; + maxsize -= part; + len -= part; + } while (len > 0); + if (maxsize <= 0) + break; + } + +out: + return ret; +} + static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { @@ -2194,19 +2241,25 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, while (sent < len) { size = len - sent; - /* Keep two messages in the pipe so it schedules better */ - size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64); + if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) { + skb = sock_alloc_send_pskb(sk, 0, 0, + msg->msg_flags & MSG_DONTWAIT, + &err, 0); + } else { + /* Keep two messages in the pipe so it schedules better */ + size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64); - /* allow fallback to order-0 allocations */ - size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ); + /* allow fallback to order-0 allocations */ + size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ); - data_len = max_t(int, 0, size - SKB_MAX_HEAD(0)); + data_len = max_t(int, 0, size - SKB_MAX_HEAD(0)); - data_len = min_t(size_t, size, PAGE_ALIGN(data_len)); + data_len = min_t(size_t, size, PAGE_ALIGN(data_len)); - skb = sock_alloc_send_pskb(sk, size - data_len, data_len, - msg->msg_flags & MSG_DONTWAIT, &err, - get_order(UNIX_SKB_FRAGS_SZ)); + skb = sock_alloc_send_pskb(sk, size - data_len, data_len, + msg->msg_flags & MSG_DONTWAIT, &err, + get_order(UNIX_SKB_FRAGS_SZ)); + } if (!skb) goto out_err; @@ -2218,13 +2271,21 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, } fds_sent = true; - skb_put(skb, size - data_len); - skb->data_len = data_len; - skb->len = size; - err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size); - if (err) { - kfree_skb(skb); - goto out_err; + if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) { + size = unix_extract_bvec_to_skb(skb, &msg->msg_iter, size); + skb->data_len += size; + skb->len += size; + skb->truesize += size; + refcount_add(size, &sk->sk_wmem_alloc); + } else { + skb_put(skb, size - data_len); + skb->data_len = data_len; + skb->len = size; + err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size); + if (err) { + kfree_skb(skb); + goto out_err; + } } unix_state_lock(other); From patchwork Fri Mar 31 16:08:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77838 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp678079vqo; Fri, 31 Mar 2023 09:14:59 -0700 (PDT) X-Google-Smtp-Source: AKy350abZSA0tie/09KIUtj7uZAU+hCU8x09vDEKrLbKS9f5qSYuJm1rUsoXN/g47lcp77pPh2D0 X-Received: by 2002:a17:906:738a:b0:933:4ca3:9678 with SMTP id f10-20020a170906738a00b009334ca39678mr25937982ejl.24.1680279299219; Fri, 31 Mar 2023 09:14:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279299; cv=none; d=google.com; s=arc-20160816; b=Cq+rUHI8dvH6JZVUj2relLJg/3vAArNbljTQlnBZWceQKrzCMCN7q/gzAazuigXALe CWQzZoB5lJXquQRkaLIeFZh5rE9LERkJc102gbSE1i70gIZ4Gf8X/ZBH9sMO/9SSVrVE R1yGp6/wSyanVKH6ZVxY2j2oKRRvPCp4qBV+XNI47F3v8LSXV4SsxXHEWiquYc4k0LdS /EvcW89M3LgEIi1wlB3Uud9QIUPvMHWpJBD6VWPWkqNzdckWAsu3HSFelZ/OCVI+IyYu 0NA9WWQMTVx6XA6uuCSa2NKbl7Z4pj0Z8l5CyF4S2kFTsFRsonTZCZxhUfB4YiHsZaR7 FuVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=b8cNPLuNlp8TF1aeLVa/nfCczmmxDIoFIua9rMiLDvM=; b=AZdsbM4yTX5VOdCx9NjPXOfl4mhJcGenxyZIk/HgYWvqwAjVNn4FjidVktWB4ytzNg 2R9yQMe4Lt/a1EHY+o4FgP3rN6PWouY1hmnuwn55YmFOPPRV9pcrpPjxFBIwPjmvqo2R OibSh8m21PZrTgPZzeZzGdBed1xGwJXQoX/M0q1leUejUYP5enfiHw2LsVuKcw7sI8xc aSwbu801Q8NXRvvAFpZKBA8WT56SEj25yjtvf+R4Z/ZrRPB6UQqfEOim1H7JQsPuUTIZ E2Qg0s7Gu+1Zybx0VaruQ6JMZsyHGygZa2lJ0aO9fnT37zccexvFNpuzTM3WiPDiJ53b fa8g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=i+6j7xCt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y8-20020a1709060a8800b00931d797789fsi1525291ejf.1003.2023.03.31.09.14.35; Fri, 31 Mar 2023 09:14:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=i+6j7xCt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233301AbjCaQNL (ORCPT + 99 others); Fri, 31 Mar 2023 12:13:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34968 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232273AbjCaQLy (ORCPT ); Fri, 31 Mar 2023 12:11:54 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5782222915 for ; Fri, 31 Mar 2023 09:10:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279021; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=b8cNPLuNlp8TF1aeLVa/nfCczmmxDIoFIua9rMiLDvM=; b=i+6j7xCtdJqiSz2FYPSO7EGtxfP3JiYaZO775ZB7En4lsrX2yX+oO8VFgwvKw/MWR/JuXH kI59WKf9IOQZ1Lsel6XpR3fd1nr1SE/XHVSUvnV1k3cAeLnbL/QIVvfR+FVsB5JdOqs/9r 4FeweNgmro/HjiadQykrpzoiR949kfo= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-203-cb5mzzjVOOCi4dFlfSrt2Q-1; Fri, 31 Mar 2023 12:10:18 -0400 X-MC-Unique: cb5mzzjVOOCi4dFlfSrt2Q-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 27F62185A78B; Fri, 31 Mar 2023 16:10:17 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 44F731121314; Fri, 31 Mar 2023 16:10:15 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 20/55] af_unix: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data Date: Fri, 31 Mar 2023 17:08:39 +0100 Message-Id: <20230331160914.1608208-21-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761900546695804277?= X-GMAIL-MSGID: =?utf-8?q?1761900546695804277?= If sendmsg() with MSG_SPLICE_PAGES encounters a page that shouldn't be spliced - a slab page, for instance, or one with a zero count - make unix_extract_bvec_to_skb() copy it. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/unix/af_unix.c | 44 +++++++++++++++++++++++++++++++++----------- 1 file changed, 33 insertions(+), 11 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index a9ad97f3c57f..88b91005567e 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2154,12 +2154,12 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other /* * Extract pages from an iterator and add them to the socket buffer. */ -static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb, - struct iov_iter *iter, ssize_t maxsize) +static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb, struct iov_iter *iter, + ssize_t maxsize, gfp_t gfp) { struct page *pages[8], **ppages = pages; unsigned int i, nr; - ssize_t ret = 0; + ssize_t spliced = 0, ret = 0; while (iter->count > 0) { size_t off, len; @@ -2171,31 +2171,52 @@ static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb, len = iov_iter_extract_pages(iter, &ppages, maxsize, nr, 0, &off); if (len <= 0) { - if (!ret) - ret = len ?: -EIO; + ret = len ?: -EIO; break; } i = 0; do { + struct page *page = pages[i++]; size_t part = min_t(size_t, PAGE_SIZE - off, len); + bool put = false; + + if (!sendpage_ok(page)) { + const void *p = kmap_local_page(page); + void *q; + + q = page_frag_memdup(NULL, p + off, part, gfp, + ULONG_MAX); + kunmap_local(p); + if (!q) { + iov_iter_revert(iter, len); + ret = -ENOMEM; + goto out; + } + page = virt_to_page(q); + off = offset_in_page(q); + put = true; + } - if (skb_append_pagefrags(skb, pages[i++], off, part) < 0) { - if (!ret) - ret = -EMSGSIZE; + ret = skb_append_pagefrags(skb, page, off, part); + if (put) + put_page(page); + if (ret < 0) { + iov_iter_revert(iter, len); goto out; } off = 0; - ret += part; + spliced += part; maxsize -= part; len -= part; } while (len > 0); + if (maxsize <= 0) break; } out: - return ret; + return spliced ?: ret; } static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, @@ -2272,7 +2293,8 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, fds_sent = true; if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) { - size = unix_extract_bvec_to_skb(skb, &msg->msg_iter, size); + size = unix_extract_bvec_to_skb(skb, &msg->msg_iter, size, + sk->sk_allocation); skb->data_len += size; skb->len += size; skb->truesize += size; From patchwork Fri Mar 31 16:08:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77840 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp680336vqo; Fri, 31 Mar 2023 09:18:04 -0700 (PDT) X-Google-Smtp-Source: AKy350Zj4GCH9+QPO3urVtXbvkbG1DlbWgFc9DZ4RwwwVl98ON/j0MoXxAyv9EJCGnYyydSP/daz X-Received: by 2002:a17:906:2f96:b0:932:8cd:1021 with SMTP id w22-20020a1709062f9600b0093208cd1021mr27099291eji.33.1680279484216; Fri, 31 Mar 2023 09:18:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279484; cv=none; d=google.com; s=arc-20160816; b=LWuYKmfHm0CrNbBFWYaM70qOWFDwHrTb/NydHOpvPhZIC2G6EvghQS7YhHnEFS3Bks ID0YgkyVO2re3CVAsslGUMjFIWmNYCxFskNN8U0UhJ38jioQs3xMWVOhUfv9XZpNYJT6 kE3HijKuVq570+A3NjcGFLPDXcqwaG9KeYTNPcvQyNxiycPRxHdtP8MsJeedcnOqsPnm gLNJ8WykAME/9/NFoS3bVyOuSHVReVbYfe8Tu0zcmri1XNT21UeT1mxHMEQ6QbG54QDT T9VsF1Ywur8Pqar4Do9VdXDPch9IC2vMOAIb9XdzOhLYZG9Sb9VUDSQvS5WG51FEz2R9 z3vg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=jPGukvQXP4Tai7PUqcbnIswr++Y11hB7rT1QDpn/afo=; b=einQ+QHsdHeOOPybjY+mtMGStsioDyFISOejtVi0eGwAN47KG1HNsqpfELXNzXul6j 9raRrONVTGjgKgWiOdtFON0oAqqgdXUx9t6Yxx91XpsNIkHnIIDz+xlpqWazw5UodGd7 RrMwgrAw9pZa7L90tX3aYHukatH84HMO6W95QOZLjM5urgABxoRWfSlEw51Zx+1oClQI Dw4k3hukRaY5/K7YI+IgZnoif+GVCrBWA5uXYNrIlxScNa49jZVZfD+aZR9vFIUGSsqm W1s95jfHFb/fXvzOzAnEDRmFlw1LNi5xaREAwn8lI+6nJYEr8R4dk2xQ7xFpj9B4v1km 8niA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="I6rvE/IC"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h7-20020a170906854700b009338f516161si1945254ejy.707.2023.03.31.09.17.40; Fri, 31 Mar 2023 09:18:04 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="I6rvE/IC"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233318AbjCaQNV (ORCPT + 99 others); Fri, 31 Mar 2023 12:13:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60600 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232787AbjCaQME (ORCPT ); Fri, 31 Mar 2023 12:12:04 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 37CD720C26 for ; Fri, 31 Mar 2023 09:10:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279027; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jPGukvQXP4Tai7PUqcbnIswr++Y11hB7rT1QDpn/afo=; b=I6rvE/IC/LL4TiFd/qKKqmU+E21eucxMouEDeyiKjj4hIIA8oDOpUMnMUymcbkWNhgMma+ xz5ADZnP82wD80OFbMx5qPsW0+JRL+x5zSIderbBxs9qV0a4XZr7JvCB8SYhw00tM4prrU Ai+YidZ9QtZRzakAljlqPcZwk67SibI= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-570-w91nJgUEPQ-xufIcE0rgFg-1; Fri, 31 Mar 2023 12:10:23 -0400 X-MC-Unique: w91nJgUEPQ-xufIcE0rgFg-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CC107886063; Fri, 31 Mar 2023 16:10:19 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id B7DEB14171B6; Fri, 31 Mar 2023 16:10:17 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu , linux-crypto@vger.kernel.org Subject: [PATCH v3 21/55] crypto: af_alg: Pin pages rather than ref'ing if appropriate Date: Fri, 31 Mar 2023 17:08:40 +0100 Message-Id: <20230331160914.1608208-22-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761900740142987187?= X-GMAIL-MSGID: =?utf-8?q?1761900740142987187?= Convert AF_ALG to use iov_iter_extract_pages() instead of iov_iter_get_pages(). This will pin pages or leave them unaltered rather than getting a ref on them as appropriate to the iterator. The pages need to be pinned for DIO-read rather than having refs taken on them to prevent VM copy-on-write from malfunctioning during a concurrent fork() (the result of the I/O would otherwise end up only visible to the child process and not the parent). Signed-off-by: David Howells cc: Herbert Xu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-crypto@vger.kernel.org cc: netdev@vger.kernel.org --- crypto/af_alg.c | 10 +++++++--- include/crypto/if_alg.h | 1 + 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 5f7252a5b7b4..7caff10df643 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -533,14 +533,17 @@ static const struct net_proto_family alg_family = { int af_alg_make_sg(struct af_alg_sgl *sgl, struct iov_iter *iter, int len) { + struct page **pages = sgl->pages; size_t off; ssize_t n; int npages, i; - n = iov_iter_get_pages2(iter, sgl->pages, len, ALG_MAX_PAGES, &off); + n = iov_iter_extract_pages(iter, &pages, len, ALG_MAX_PAGES, 0, &off); if (n < 0) return n; + sgl->need_unpin = iov_iter_extract_will_pin(iter); + npages = DIV_ROUND_UP(off + n, PAGE_SIZE); if (WARN_ON(npages == 0)) return -EINVAL; @@ -573,8 +576,9 @@ void af_alg_free_sg(struct af_alg_sgl *sgl) { int i; - for (i = 0; i < sgl->npages; i++) - put_page(sgl->pages[i]); + if (sgl->need_unpin) + for (i = 0; i < sgl->npages; i++) + unpin_user_page(sgl->pages[i]); } EXPORT_SYMBOL_GPL(af_alg_free_sg); diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h index 7e76623f9ec3..46494b33f5bc 100644 --- a/include/crypto/if_alg.h +++ b/include/crypto/if_alg.h @@ -59,6 +59,7 @@ struct af_alg_sgl { struct scatterlist sg[ALG_MAX_PAGES + 1]; struct page *pages[ALG_MAX_PAGES]; unsigned int npages; + bool need_unpin; }; /* TX SGL entry */ From patchwork Fri Mar 31 16:08:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77865 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp691607vqo; Fri, 31 Mar 2023 09:35:21 -0700 (PDT) X-Google-Smtp-Source: AKy350bIK7kTDlRJyQTKss2jH7BhAKJoRUcOPtGMjXtK8+CKREYa3MLPxf8/2TeDDB+Q18Weugft X-Received: by 2002:a17:906:9397:b0:93f:fbe:ed17 with SMTP id l23-20020a170906939700b0093f0fbeed17mr23858558ejx.62.1680280521640; Fri, 31 Mar 2023 09:35:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280521; cv=none; d=google.com; s=arc-20160816; b=aZcDoWDoNLioqcsI+wjmQ9QMVcFaIjB+AMo0Ps6LZ6+NpoHyB0Co2vfE3U0z1nRIje 0r+rs08Gml9+Qehg6JKWuG/FvM+RBqb2s7cnaqGl5MUQ9qFgKLAbDzFZolH2bEqSU3Y3 yaGtl8/xcOe7Dz1VS+Tq0XdAiMeW6Gn2WDzRd86Moed9mI7H8v4CRd9VBexyXyHzlqzW FTDupNWnjU3E1GjRK/uXfltoE7AjvSxPgx6PzdEKRLebsaauk2Le8al1e2lf4ymY/skw 4ASw1malTUvkXkg658UhvSxBB3HFJthevEuAKfszAIWtGd7xSs8ZRglkeBuB01bJHLmJ XsDA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=xCO1FTHFb6ixqT+m8PCC52zI8ZJyvPRkEVPay5Q0l+o=; b=UqnN6gOqOyKzEwS+FPy8tIbPK9cpg29tZ7fr92onaHZp6Ar13b0IfhqErUxCWVkwes +n746b1DMNDJaqMO0MI8oHn0ktGOz9BThpZsWPwdH14K2wL6mbKtGmLQ1ZHuRnv4+e6L 5GuS9G7qdRsqvzMeNkd+jsFqfQeY7IZG9lKDd84gGP89PwDBu3u7R9VV5DBAbAXCrtUe US0Ztp/+w4zLf9pSwc5JqhnDddK7Dz3mrsk1kxSaiTOJiPmtifoD2HeipDUb0CKRWGxz tFqJ0gqvg+/OQJ5skN2wcf6K3W7amScJVEwg8MBVjwZIB9qf8GVFWE0N0zslj3KLcMKf jLRg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Kf7LLzDt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x16-20020a1709064bd000b009476ffdc5cfsi2146079ejv.862.2023.03.31.09.34.57; Fri, 31 Mar 2023 09:35:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Kf7LLzDt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233312AbjCaQNS (ORCPT + 99 others); Fri, 31 Mar 2023 12:13:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35336 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233065AbjCaQME (ORCPT ); Fri, 31 Mar 2023 12:12:04 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 095892291C for ; Fri, 31 Mar 2023 09:10:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279027; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xCO1FTHFb6ixqT+m8PCC52zI8ZJyvPRkEVPay5Q0l+o=; b=Kf7LLzDtF4JSz6zyFMZ2RV+b/6mNIfJIoTC9aRtGldXbKv4PnXQyHNh0cVru8cakys46ZU pDa42/RU3r163iuLQTVHEqaMUOjMrQFIAaLBQp+Gm+BsfvxPWX5+Nd870njIcygOgJhIBW FK+ua1aitqJgmyZrqmUa+qifqlM6JbU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-601-E9J8PItkP7uGZG6CiLxyVg-1; Fri, 31 Mar 2023 12:10:25 -0400 X-MC-Unique: E9J8PItkP7uGZG6CiLxyVg-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9CDB988606A; Fri, 31 Mar 2023 16:10:22 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8BDFC492B00; Fri, 31 Mar 2023 16:10:20 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu , linux-crypto@vger.kernel.org Subject: [PATCH v3 22/55] crypto: af_alg: Use netfs_extract_iter_to_sg() to create scatterlists Date: Fri, 31 Mar 2023 17:08:41 +0100 Message-Id: <20230331160914.1608208-23-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901828050269477?= X-GMAIL-MSGID: =?utf-8?q?1761901828050269477?= Use netfs_extract_iter_to_sg() to decant the destination iterator into a scatterlist in af_alg_get_rsgl(). af_alg_make_sg() can then be removed. [!] Note that if this fits, netfs_extract_iter_to_sg() should move to core code. Signed-off-by: David Howells cc: Herbert Xu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-crypto@vger.kernel.org cc: netdev@vger.kernel.org --- crypto/af_alg.c | 55 ++++++++++------------------------------- crypto/algif_aead.c | 12 ++++----- crypto/algif_hash.c | 18 ++++++++++---- crypto/algif_skcipher.c | 2 +- include/crypto/if_alg.h | 6 ++--- 5 files changed, 35 insertions(+), 58 deletions(-) diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 7caff10df643..1dafd088ad45 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -531,45 +532,11 @@ static const struct net_proto_family alg_family = { .owner = THIS_MODULE, }; -int af_alg_make_sg(struct af_alg_sgl *sgl, struct iov_iter *iter, int len) -{ - struct page **pages = sgl->pages; - size_t off; - ssize_t n; - int npages, i; - - n = iov_iter_extract_pages(iter, &pages, len, ALG_MAX_PAGES, 0, &off); - if (n < 0) - return n; - - sgl->need_unpin = iov_iter_extract_will_pin(iter); - - npages = DIV_ROUND_UP(off + n, PAGE_SIZE); - if (WARN_ON(npages == 0)) - return -EINVAL; - /* Add one extra for linking */ - sg_init_table(sgl->sg, npages + 1); - - for (i = 0, len = n; i < npages; i++) { - int plen = min_t(int, len, PAGE_SIZE - off); - - sg_set_page(sgl->sg + i, sgl->pages[i], plen, off); - - off = 0; - len -= plen; - } - sg_mark_end(sgl->sg + npages - 1); - sgl->npages = npages; - - return n; -} -EXPORT_SYMBOL_GPL(af_alg_make_sg); - static void af_alg_link_sg(struct af_alg_sgl *sgl_prev, struct af_alg_sgl *sgl_new) { - sg_unmark_end(sgl_prev->sg + sgl_prev->npages - 1); - sg_chain(sgl_prev->sg, sgl_prev->npages + 1, sgl_new->sg); + sg_unmark_end(sgl_prev->sgt.sgl + sgl_prev->sgt.nents - 1); + sg_chain(sgl_prev->sgt.sgl, sgl_prev->sgt.nents + 1, sgl_new->sgt.sgl); } void af_alg_free_sg(struct af_alg_sgl *sgl) @@ -577,8 +544,8 @@ void af_alg_free_sg(struct af_alg_sgl *sgl) int i; if (sgl->need_unpin) - for (i = 0; i < sgl->npages; i++) - unpin_user_page(sgl->pages[i]); + for (i = 0; i < sgl->sgt.nents; i++) + unpin_user_page(sg_page(&sgl->sgt.sgl[i])); } EXPORT_SYMBOL_GPL(af_alg_free_sg); @@ -1292,8 +1259,8 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags, while (maxsize > len && msg_data_left(msg)) { struct af_alg_rsgl *rsgl; + ssize_t err; size_t seglen; - int err; /* limit the amount of readable buffers */ if (!af_alg_readable(sk)) @@ -1310,16 +1277,20 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags, return -ENOMEM; } - rsgl->sgl.npages = 0; + rsgl->sgl.sgt.sgl = rsgl->sgl.sgl; + rsgl->sgl.sgt.nents = 0; + rsgl->sgl.sgt.orig_nents = 0; list_add_tail(&rsgl->list, &areq->rsgl_list); - /* make one iovec available as scatterlist */ - err = af_alg_make_sg(&rsgl->sgl, &msg->msg_iter, seglen); + err = netfs_extract_iter_to_sg(&msg->msg_iter, seglen, + &rsgl->sgl.sgt, ALG_MAX_PAGES, 0); if (err < 0) { rsgl->sg_num_bytes = 0; return err; } + rsgl->sgl.need_unpin = iov_iter_extract_will_pin(&msg->msg_iter); + /* chain the new scatterlist with previous one */ if (areq->last_rsgl) af_alg_link_sg(&areq->last_rsgl->sgl, &rsgl->sgl); diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c index 42493b4d8ce4..f6aa3856d8d5 100644 --- a/crypto/algif_aead.c +++ b/crypto/algif_aead.c @@ -210,7 +210,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, */ /* Use the RX SGL as source (and destination) for crypto op. */ - rsgl_src = areq->first_rsgl.sgl.sg; + rsgl_src = areq->first_rsgl.sgl.sgt.sgl; if (ctx->enc) { /* @@ -224,7 +224,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, * RX SGL: AAD || PT || Tag */ err = crypto_aead_copy_sgl(null_tfm, tsgl_src, - areq->first_rsgl.sgl.sg, processed); + areq->first_rsgl.sgl.sgt.sgl, processed); if (err) goto free; af_alg_pull_tsgl(sk, processed, NULL, 0); @@ -242,7 +242,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, /* Copy AAD || CT to RX SGL buffer for in-place operation. */ err = crypto_aead_copy_sgl(null_tfm, tsgl_src, - areq->first_rsgl.sgl.sg, outlen); + areq->first_rsgl.sgl.sgt.sgl, outlen); if (err) goto free; @@ -268,8 +268,8 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, /* RX SGL present */ struct af_alg_sgl *sgl_prev = &areq->last_rsgl->sgl; - sg_unmark_end(sgl_prev->sg + sgl_prev->npages - 1); - sg_chain(sgl_prev->sg, sgl_prev->npages + 1, + sg_unmark_end(sgl_prev->sgt.sgl + sgl_prev->sgt.nents - 1); + sg_chain(sgl_prev->sgt.sgl, sgl_prev->sgt.nents + 1, areq->tsgl); } else /* no RX SGL present (e.g. authentication only) */ @@ -278,7 +278,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, /* Initialize the crypto operation */ aead_request_set_crypt(&areq->cra_u.aead_req, rsgl_src, - areq->first_rsgl.sgl.sg, used, ctx->iv); + areq->first_rsgl.sgl.sgt.sgl, used, ctx->iv); aead_request_set_ad(&areq->cra_u.aead_req, ctx->aead_assoclen); aead_request_set_tfm(&areq->cra_u.aead_req, tfm); diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c index 1d017ec5c63c..f051fa624bd7 100644 --- a/crypto/algif_hash.c +++ b/crypto/algif_hash.c @@ -14,6 +14,7 @@ #include #include #include +#include #include struct hash_ctx { @@ -91,13 +92,20 @@ static int hash_sendmsg(struct socket *sock, struct msghdr *msg, if (len > limit) len = limit; - len = af_alg_make_sg(&ctx->sgl, &msg->msg_iter, len); + ctx->sgl.sgt.sgl = ctx->sgl.sgl; + ctx->sgl.sgt.nents = 0; + ctx->sgl.sgt.orig_nents = 0; + + len = netfs_extract_iter_to_sg(&msg->msg_iter, len, + &ctx->sgl.sgt, ALG_MAX_PAGES, 0); if (len < 0) { err = copied ? 0 : len; goto unlock; } - ahash_request_set_crypt(&ctx->req, ctx->sgl.sg, NULL, len); + ctx->sgl.need_unpin = iov_iter_extract_will_pin(&msg->msg_iter); + + ahash_request_set_crypt(&ctx->req, ctx->sgl.sgt.sgl, NULL, len); err = crypto_wait_req(crypto_ahash_update(&ctx->req), &ctx->wait); @@ -141,8 +149,8 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page, flags |= MSG_MORE; lock_sock(sk); - sg_init_table(ctx->sgl.sg, 1); - sg_set_page(ctx->sgl.sg, page, size, offset); + sg_init_table(ctx->sgl.sgl, 1); + sg_set_page(ctx->sgl.sgl, page, size, offset); if (!(flags & MSG_MORE)) { err = hash_alloc_result(sk, ctx); @@ -151,7 +159,7 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page, } else if (!ctx->more) hash_free_result(sk, ctx); - ahash_request_set_crypt(&ctx->req, ctx->sgl.sg, ctx->result, size); + ahash_request_set_crypt(&ctx->req, ctx->sgl.sgl, ctx->result, size); if (!(flags & MSG_MORE)) { if (ctx->more) diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c index ee8890ee8f33..a251cd6bd5b9 100644 --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -105,7 +105,7 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg, /* Initialize the crypto operation */ skcipher_request_set_tfm(&areq->cra_u.skcipher_req, tfm); skcipher_request_set_crypt(&areq->cra_u.skcipher_req, areq->tsgl, - areq->first_rsgl.sgl.sg, len, ctx->iv); + areq->first_rsgl.sgl.sgt.sgl, len, ctx->iv); if (msg->msg_iocb && !is_sync_kiocb(msg->msg_iocb)) { /* AIO operation */ diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h index 46494b33f5bc..34224e77f5a2 100644 --- a/include/crypto/if_alg.h +++ b/include/crypto/if_alg.h @@ -56,9 +56,8 @@ struct af_alg_type { }; struct af_alg_sgl { - struct scatterlist sg[ALG_MAX_PAGES + 1]; - struct page *pages[ALG_MAX_PAGES]; - unsigned int npages; + struct sg_table sgt; + struct scatterlist sgl[ALG_MAX_PAGES + 1]; bool need_unpin; }; @@ -164,7 +163,6 @@ int af_alg_release(struct socket *sock); void af_alg_release_parent(struct sock *sk); int af_alg_accept(struct sock *sk, struct socket *newsock, bool kern); -int af_alg_make_sg(struct af_alg_sgl *sgl, struct iov_iter *iter, int len); void af_alg_free_sg(struct af_alg_sgl *sgl); static inline struct alg_sock *alg_sk(struct sock *sk) From patchwork Fri Mar 31 16:08:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77851 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp688170vqo; Fri, 31 Mar 2023 09:30:23 -0700 (PDT) X-Google-Smtp-Source: AKy350a1pd49fav7n366XvpczIN4Nrb1rxSLMyv3Ai9LFt1Qy5Z6y6crDBo/yzO7nTnuS2gKnDov X-Received: by 2002:a17:90b:1c08:b0:23f:2d2c:abcf with SMTP id oc8-20020a17090b1c0800b0023f2d2cabcfmr30228286pjb.7.1680280222479; Fri, 31 Mar 2023 09:30:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280222; cv=none; d=google.com; s=arc-20160816; b=CEosGSXZIObWytIzDCA1w/1KpNINObYv9zWlRxEXb665DyRF/yliigve3a8hHM1eMY UtxZsdgD7l+zriroTVPVcmRySq5JA19mS19ybbwcMwITt354HF9/RQeMAcWEJzEjibSx eLbg14eJaExwKNwXlpNkRE9ygGP0eGMUmPkdHdEWnDZg6m16T6MxnnnyLI9yYm3P1nWM UdG6QaOfUoZFLjDM4DjmBz9afk5ahOWNU+7udoy8FNul5Fk79Xqd1KcxwUSme85YgkSh HWlq1xWLCv0ak9TLSJftLyfoQ1tfR/4kWcfP3Rlz1LSOXCy+pSTJ2mgIu8tGJpKnWFKh cX9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Cnsl2PBKcn/WeJcDUlvQg+kDRV/xI85q/f39rakA9vk=; b=YtCG97OrL8odrxUaUW3CV5lWJqveypbbdpMj/2uo7h+nVtwVolou60wLknBdrNKR+i SRAhH8MGPmkHfZX2YRuyTh+1NEYOXdHBlqsMfLf60ZOYvLydP82DLpw62M8pTfXudrS5 JPC8PTiaN/T7rJ4t4C1HEgUsHwdG7hU4pn6pwUBofpefACODtpu+GgWwCtnEJFbwu7Fs jMHgXWCq45wjMpspMkJa2nMAOHzSmtAqJ1NpfyA1rPLQYjxaGIY5amQAKotFb9a3zJSt tyD+6P5OzHJ0YIhKpSWeDbUUAEjtxdiBro23UD+h+CQ2XuCxqHH+2Ub0bC0UzPY84w1K RKtw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="WVsHY7/e"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t14-20020a17090a6a0e00b00240ac49e589si2420877pjj.3.2023.03.31.09.30.09; Fri, 31 Mar 2023 09:30:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="WVsHY7/e"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232643AbjCaQNk (ORCPT + 99 others); Fri, 31 Mar 2023 12:13:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34958 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233092AbjCaQMJ (ORCPT ); Fri, 31 Mar 2023 12:12:09 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B942722E8A for ; Fri, 31 Mar 2023 09:10:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279029; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Cnsl2PBKcn/WeJcDUlvQg+kDRV/xI85q/f39rakA9vk=; b=WVsHY7/esFeDgA3a5ZynTJD2M2WnYztv6HEMRfbYdakD+9gYE9ZIDYgOYmrJeHv9sJYDgT 2+kxGYmK6MjC7vmw5fnVLJadI7nDOP7ViOT41r4fiEa3mopHcQG0ElSSkoBO/NGuC7vHfP SVB5Jc8q/U5RyBntkIMAwddJbYCmoWA= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-670-w26OBUVGPfurU15AUpPLuQ-1; Fri, 31 Mar 2023 12:10:26 -0400 X-MC-Unique: w26OBUVGPfurU15AUpPLuQ-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4B9C3380452A; Fri, 31 Mar 2023 16:10:25 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3A525492C3E; Fri, 31 Mar 2023 16:10:23 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu , linux-crypto@vger.kernel.org Subject: [PATCH v3 23/55] crypto: af_alg: Indent the loop in af_alg_sendmsg() Date: Fri, 31 Mar 2023 17:08:42 +0100 Message-Id: <20230331160914.1608208-24-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901514656106610?= X-GMAIL-MSGID: =?utf-8?q?1761901514656106610?= Put the loop in af_alg_sendmsg() into an if-statement to indent it to make the next patch easier to review as that will add another branch to handle MSG_SPLICE_PAGES to the if-statement. Signed-off-by: David Howells cc: Herbert Xu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-crypto@vger.kernel.org cc: netdev@vger.kernel.org --- crypto/af_alg.c | 50 +++++++++++++++++++++++++------------------------ 1 file changed, 26 insertions(+), 24 deletions(-) diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 1dafd088ad45..483821e310e9 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -1031,35 +1031,37 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, if (sgl->cur) sg_unmark_end(sg + sgl->cur - 1); - do { - struct page *pg; - unsigned int i = sgl->cur; + if (1 /* TODO check MSG_SPLICE_PAGES */) { + do { + struct page *pg; + unsigned int i = sgl->cur; - plen = min_t(size_t, len, PAGE_SIZE); + plen = min_t(size_t, len, PAGE_SIZE); - pg = alloc_page(GFP_KERNEL); - if (!pg) { - err = -ENOMEM; - goto unlock; - } + pg = alloc_page(GFP_KERNEL); + if (!pg) { + err = -ENOMEM; + goto unlock; + } - sg_assign_page(sg + i, pg); + sg_assign_page(sg + i, pg); - err = memcpy_from_msg(page_address(sg_page(sg + i)), - msg, plen); - if (err) { - __free_page(sg_page(sg + i)); - sg_assign_page(sg + i, NULL); - goto unlock; - } + err = memcpy_from_msg(page_address(sg_page(sg + i)), + msg, plen); + if (err) { + __free_page(sg_page(sg + i)); + sg_assign_page(sg + i, NULL); + goto unlock; + } - sg[i].length = plen; - len -= plen; - ctx->used += plen; - copied += plen; - size -= plen; - sgl->cur++; - } while (len && sgl->cur < MAX_SGL_ENTS); + sg[i].length = plen; + len -= plen; + ctx->used += plen; + copied += plen; + size -= plen; + sgl->cur++; + } while (len && sgl->cur < MAX_SGL_ENTS); + } if (!size) sg_mark_end(sg + sgl->cur - 1); From patchwork Fri Mar 31 16:08:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77872 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692369vqo; Fri, 31 Mar 2023 09:36:27 -0700 (PDT) X-Google-Smtp-Source: AKy350ZeoIYZLTgU2xaHnQOnrgQTl6CvaBkqXi/gpc9ziiE1Zl5zdl6BxdfzQmY3p/+8YtebPxPL X-Received: by 2002:a17:906:54cc:b0:932:c315:b0d with SMTP id c12-20020a17090654cc00b00932c3150b0dmr27098149ejp.34.1680280587574; Fri, 31 Mar 2023 09:36:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280587; cv=none; d=google.com; s=arc-20160816; b=qBa5l/UNnvtiYfAZhrM2PWSX13c9g7ySxrfUommviTm4KguTWMzL2+OjmgppWzcuJD AwdghWmjDj9LwqGTQCH95HNnYOCBluEHJdNf+OBUvcFWFym6P4Xv9hgY3j2D4ontq37j QHQKrfU3O4gAZML9G5+OWXwff8j4sZf7xjUOP7WB+ra1VXuN5RxL+W/c7tktSNmv0MRT AIR58HmM2F17Nd0AxXYglxjyNjRk3LKPiG8eJkPn6rhjUMmrNt+ML0zKMdWfpal/3Op/ izbXRt+r+ppsP7GeVSGLKhRHZHACufYjskBKP5SzG6tp60bnwWUPurmdH3BJpU70/3US GDEg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=bQ5QnEnteMQfZ4j7G0XrCpLzGl+D3sZ9ge7q0vobANw=; b=g8Y+CV2t6J/SChCJMvjE5S8L/AFAXmzNUfYkO8mTQXl70uIiBNmKxc4ttx3T5MkcrI FTP59nBWzCqnkvb5quhhyUvbJqIyhMugKDd6gpUZ/nS8ekbVgG5c6uKrbT50pSJ7FUc9 w6eteiMol3Um9aQlybCfjhtaiWBGN6gHdhShY7HmvGTVLJf58//nWpXhgNqI1z01UWRR t4doODloh21kIUb/cKVKBtNXt9Uw+7JDmjIL6JzxCDuSHPdMPNzXZSaDgGtmyb4yKn1/ I0VOZQlETQG+kSjlcyQfZJRH4tArKmNniBPkN1Mk/gXDJfPDdj1szg/Z71BobD+HE7aw /3PA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KFD11+19; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id bh13-20020a170906a0cd00b00947bc373744si1331728ejb.359.2023.03.31.09.36.03; Fri, 31 Mar 2023 09:36:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KFD11+19; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233087AbjCaQN1 (ORCPT + 99 others); Fri, 31 Mar 2023 12:13:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35230 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233081AbjCaQMI (ORCPT ); Fri, 31 Mar 2023 12:12:08 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9F9AA1EFF3 for ; Fri, 31 Mar 2023 09:10:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279032; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bQ5QnEnteMQfZ4j7G0XrCpLzGl+D3sZ9ge7q0vobANw=; b=KFD11+19/rtgjRN53suD2tLpatNQeswN17GfhuPYoQfldzUtNGUHn742LO/iB43qgnfdY1 4hkg9ALV/GdE5HdvXqGdeRlpTGPJizTRy8BiVG4bUulyFghafRvUQzVg/RLzcez9pR+jEQ eGt9HPcA/6EoAkektRVzvXqxgJ9oxE0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-451-zyE999oJOe6JnagrFD2R3w-1; Fri, 31 Mar 2023 12:10:29 -0400 X-MC-Unique: zyE999oJOe6JnagrFD2R3w-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1847B85A5B1; Fri, 31 Mar 2023 16:10:28 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0691D1121315; Fri, 31 Mar 2023 16:10:25 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu , linux-crypto@vger.kernel.org Subject: [PATCH v3 24/55] crypto: af_alg: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:43 +0100 Message-Id: <20230331160914.1608208-25-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901897228638195?= X-GMAIL-MSGID: =?utf-8?q?1761901897228638195?= Make AF_ALG sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. [!] Note that this makes use of netfs_extract_iter_to_sg() from netfslib. This probably needs moving to core code somewhere. Signed-off-by: David Howells cc: Herbert Xu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-crypto@vger.kernel.org cc: netdev@vger.kernel.org --- crypto/Kconfig | 1 + crypto/af_alg.c | 28 ++++++++++++++++++++++++++-- crypto/algif_aead.c | 22 +++++++++++----------- crypto/algif_skcipher.c | 8 ++++---- 4 files changed, 42 insertions(+), 17 deletions(-) diff --git a/crypto/Kconfig b/crypto/Kconfig index 9c86f7045157..8c04ecbb4395 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -1297,6 +1297,7 @@ menu "Userspace interface" config CRYPTO_USER_API tristate + select NETFS_SUPPORT # for netfs_extract_iter_to_sg() config CRYPTO_USER_API_HASH tristate "Hash algorithms" diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 483821e310e9..3088ab298632 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -941,6 +941,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, bool init = false; int err = 0; + if ((msg->msg_flags & MSG_SPLICE_PAGES) && + !iov_iter_is_bvec(&msg->msg_iter)) + return -EINVAL; + if (msg->msg_controllen) { err = af_alg_cmsg_send(msg, &con); if (err) @@ -986,7 +990,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, while (size) { struct scatterlist *sg; size_t len = size; - size_t plen; + ssize_t plen; /* use the existing memory in an allocated page */ if (ctx->merge) { @@ -1031,7 +1035,27 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, if (sgl->cur) sg_unmark_end(sg + sgl->cur - 1); - if (1 /* TODO check MSG_SPLICE_PAGES */) { + if (msg->msg_flags & MSG_SPLICE_PAGES) { + struct sg_table sgtable = { + .sgl = sg, + .nents = sgl->cur, + .orig_nents = sgl->cur, + }; + + plen = netfs_extract_iter_to_sg(&msg->msg_iter, len, + &sgtable, MAX_SGL_ENTS, 0); + if (plen < 0) { + err = plen; + goto unlock; + } + + for (; sgl->cur < sgtable.nents; sgl->cur++) + get_page(sg_page(&sg[sgl->cur])); + len -= plen; + ctx->used += plen; + copied += plen; + size -= plen; + } else { do { struct page *pg; unsigned int i = sgl->cur; diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c index f6aa3856d8d5..b16111a3025a 100644 --- a/crypto/algif_aead.c +++ b/crypto/algif_aead.c @@ -9,8 +9,8 @@ * The following concept of the memory management is used: * * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is - * filled by user space with the data submitted via sendpage/sendmsg. Filling - * up the TX SGL does not cause a crypto operation -- the data will only be + * filled by user space with the data submitted via sendpage. Filling up + * the TX SGL does not cause a crypto operation -- the data will only be * tracked by the kernel. Upon receipt of one recvmsg call, the caller must * provide a buffer which is tracked with the RX SGL. * @@ -113,19 +113,19 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, } /* - * Data length provided by caller via sendmsg/sendpage that has not - * yet been processed. + * Data length provided by caller via sendmsg that has not yet been + * processed. */ used = ctx->used; /* - * Make sure sufficient data is present -- note, the same check is - * also present in sendmsg/sendpage. The checks in sendpage/sendmsg - * shall provide an information to the data sender that something is - * wrong, but they are irrelevant to maintain the kernel integrity. - * We need this check here too in case user space decides to not honor - * the error message in sendmsg/sendpage and still call recvmsg. This - * check here protects the kernel integrity. + * Make sure sufficient data is present -- note, the same check is also + * present in sendmsg. The checks in sendmsg shall provide an + * information to the data sender that something is wrong, but they are + * irrelevant to maintain the kernel integrity. We need this check + * here too in case user space decides to not honor the error message + * in sendmsg and still call recvmsg. This check here protects the + * kernel integrity. */ if (!aead_sufficient_data(sk)) return -EINVAL; diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c index a251cd6bd5b9..b1f321b9f846 100644 --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -9,10 +9,10 @@ * The following concept of the memory management is used: * * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is - * filled by user space with the data submitted via sendpage/sendmsg. Filling - * up the TX SGL does not cause a crypto operation -- the data will only be - * tracked by the kernel. Upon receipt of one recvmsg call, the caller must - * provide a buffer which is tracked with the RX SGL. + * filled by user space with the data submitted via sendmsg. Filling up the TX + * SGL does not cause a crypto operation -- the data will only be tracked by + * the kernel. Upon receipt of one recvmsg call, the caller must provide a + * buffer which is tracked with the RX SGL. * * During the processing of the recvmsg operation, the cipher request is * allocated and prepared. As part of the recvmsg operation, the processed From patchwork Fri Mar 31 16:08:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77861 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp690943vqo; Fri, 31 Mar 2023 09:34:14 -0700 (PDT) X-Google-Smtp-Source: AKy350ZTmY1BzdCwUUtDERqCYBGWWZnvQ3Pi7W7HZWpsot3urEkhJlvK/cVw5VzSkN7pY88r/aKh X-Received: by 2002:a17:906:90c9:b0:933:6ae6:374d with SMTP id v9-20020a17090690c900b009336ae6374dmr23496997ejw.73.1680280454194; Fri, 31 Mar 2023 09:34:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280454; cv=none; d=google.com; s=arc-20160816; b=bZfxG7X0gNLtYNKRdsJ6cjI5O1XdFhByo/xFfLBkqfiwTGyVODjWVfk2BiivLWtioZ FZitiob8172A3hCWeWf8dssWmJZ3bFTo3EPlbI/WiF23eS3EnewsXkmMR6fILyVTKeQ4 lLjPgaHB2k0GvwXz69W7nObN0WFBqcbSvTLuO2E0pmsAISpyMbeONvgrWuh8S+2wI40F Vg9V/2/kyWDqnAIynfA2YHZp3T1d/Aq8CCbUfMkY5HSLsa87S0mUOko1b8+fb2erdO3U XnE6O45XHTztwuaidG/3WOrp3wej4lgjuWaNrmJMQU0Ce7z7IaIJy4P9Rjq+X18FojQV dQZA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=mERLnHIZ7Ufhp5QAaSwmDcCxGjZ2CWUtcWDMeYUTupY=; b=Ne7ubynBZsl/2svCaM9Tl4imZvu7PclJlRXQ8fV2LiSSBAVmO5C5mdz+ygJwKaT9W4 md6e4FgBDArRAlyGcyd8NhVpiHrsTdEEw38rgXMteq+PxEJe+oUoXPntrB5tuRvsfidQ z//hTD3iZQILuxx7X0XK6W+lhHbe0qZKFx7Ty+Xwk6Socs4j7R8THxyqjYLZBldiIZy5 3QtTSJOGyqoD0qIYXVRgb/mbjuCEMcL7m+R7DdaI4BLB7Tlqnh3uFlejFyQ6QyzHmLXW uZ9RDkgAdLS/txIuKMOpJqLdSVE1YZo5mcRazK2qardvmO6b7QvmrXJgFqhLPpS+RCUe fCFA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=gwG5FE3d; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id g11-20020a170906348b00b00930f5ecbd94si2063715ejb.346.2023.03.31.09.33.50; Fri, 31 Mar 2023 09:34:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=gwG5FE3d; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233358AbjCaQOv (ORCPT + 99 others); Fri, 31 Mar 2023 12:14:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33056 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232999AbjCaQMy (ORCPT ); Fri, 31 Mar 2023 12:12:54 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D965522EBE for ; Fri, 31 Mar 2023 09:10:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279035; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mERLnHIZ7Ufhp5QAaSwmDcCxGjZ2CWUtcWDMeYUTupY=; b=gwG5FE3dD6oDSnG8oyYnunUWjfkBJ1xaS3YoQU3idWN+rmI453zIieNYCOjl6HRLSDIkhc xDmvDwdHos2bFrkSMMnnByyek530O2xJjdiXvm4V2ZMVNcGQbBmfEmj3FaIGcgoMqF5Bo8 KJwFZT0X3RbmS6RfP9rjAX9cl2fvVAw= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-396-nbKIJurtO2appj6WS6snYg-1; Fri, 31 Mar 2023 12:10:32 -0400 X-MC-Unique: nbKIJurtO2appj6WS6snYg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C13C3886068; Fri, 31 Mar 2023 16:10:30 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id ADCA81121314; Fri, 31 Mar 2023 16:10:28 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu , linux-crypto@vger.kernel.org Subject: [PATCH v3 25/55] crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:44 +0100 Message-Id: <20230331160914.1608208-26-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901757865132558?= X-GMAIL-MSGID: =?utf-8?q?1761901757865132558?= Convert af_alg_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. [!] Note that this makes use of netfs_extract_iter_to_sg() from netfslib. This probably needs moving to core code somewhere. Signed-off-by: David Howells cc: Herbert Xu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-crypto@vger.kernel.org cc: netdev@vger.kernel.org --- crypto/af_alg.c | 53 +++++++++---------------------------------------- 1 file changed, 9 insertions(+), 44 deletions(-) diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 3088ab298632..7fe8c8db6bb5 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -1118,53 +1118,18 @@ EXPORT_SYMBOL_GPL(af_alg_sendmsg); ssize_t af_alg_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) { - struct sock *sk = sock->sk; - struct alg_sock *ask = alg_sk(sk); - struct af_alg_ctx *ctx = ask->private; - struct af_alg_tsgl *sgl; - int err = -EINVAL; - - if (flags & MSG_SENDPAGE_NOTLAST) - flags |= MSG_MORE; - - lock_sock(sk); - if (!ctx->more && ctx->used) - goto unlock; - - if (!size) - goto done; - - if (!af_alg_writable(sk)) { - err = af_alg_wait_for_wmem(sk, flags); - if (err) - goto unlock; - } - - err = af_alg_alloc_tsgl(sk); - if (err) - goto unlock; - - ctx->merge = 0; - sgl = list_entry(ctx->tsgl_list.prev, struct af_alg_tsgl, list); - - if (sgl->cur) - sg_unmark_end(sgl->sg + sgl->cur - 1); + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = flags | MSG_SPLICE_PAGES, + }; - sg_mark_end(sgl->sg + sgl->cur); + bvec_set_page(&bvec, page, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - get_page(page); - sg_set_page(sgl->sg + sgl->cur, page, size, offset); - sgl->cur++; - ctx->used += size; - -done: - ctx->more = flags & MSG_MORE; - -unlock: - af_alg_data_wakeup(sk); - release_sock(sk); + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; - return err ?: size; + return sock_sendmsg(sock, &msg); } EXPORT_SYMBOL_GPL(af_alg_sendpage); From patchwork Fri Mar 31 16:08:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77867 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692116vqo; Fri, 31 Mar 2023 09:36:04 -0700 (PDT) X-Google-Smtp-Source: AKy350aVVQpMUczPO5dtY2guskR585qMfijzhwJKFjnW/vWsslZKkac3+vvu7h/q0nCBZdjtM5Rh X-Received: by 2002:a05:6402:483:b0:502:4c82:9b4d with SMTP id k3-20020a056402048300b005024c829b4dmr16652186edv.15.1680280564025; Fri, 31 Mar 2023 09:36:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280564; cv=none; d=google.com; s=arc-20160816; b=J3FEKg7jyR2zqybD3ne+ngo4RGDXw1oGqrqRPtIAZsJpe1wdV3elgG4dydNfRDD29j pZurj+SzYNy0oy0DEaxKhipGPGlP0oBNAmjH0+QQU4GyxFO1S4PtYVJum4/fOkpxYCbS 9L0HWHl94jg+k1rFzgPVgQ+LvCE5UKIWj323bly5VFq2qaFGFmFqnPJNLVhJsx5VqnGy niChajPXN8AB3Y3xNVLJ1x8QyU0EDYPgMSBiuYwBMiqwz/S1zVmj48oy56OcNmdWF6hX 1l6oXfsc/O1ixfIjWhJnl01wZMvngBmm4z43Pv97WXH0EbaVQoUnFkCH+GMTt0Ke+afX LfEg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=M5DaLi1tovZrH11oTI95mtlHRv/XZcPSFvwL/aUxUh4=; b=Bfbgf1X51Ti+D8TR2zx2nFXz0wa/5BXkDWRIwQLCwqnjrNE/pmstlYNz2floRpGETD 4OULQf+AEwov5iVJnK3S/QLARU2hKf3Bc1/VuMyE2NIXqcyzidiMFaX6SDMivf5dviQS 1081XIRCoB7XcQ0sK08f+W0wlJ4ryOCtaY6KXTRAY3D6jfm1Lu+GQChE+Q9w3i5fOChA dtymKdycO75o3lprtLVjF2dEZLX3eBYBwEHQPEbi7UTo0ReZJ5mzyi/MgXRENtVolxXH sRBvZu0ulI0rAwNJUqpre9DfBq4SSpmMaJzG1OlUZpjdanqSVTkXe55BZh9zxMJfHkWt 6s4Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=hvuJVlhU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ia3-20020a170907a06300b0094762e9cf4dsi2050527ejc.732.2023.03.31.09.35.40; Fri, 31 Mar 2023 09:36:04 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=hvuJVlhU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233272AbjCaQO6 (ORCPT + 99 others); Fri, 31 Mar 2023 12:14:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34480 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233298AbjCaQNL (ORCPT ); Fri, 31 Mar 2023 12:13:11 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80E9D24436 for ; Fri, 31 Mar 2023 09:10:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279040; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=M5DaLi1tovZrH11oTI95mtlHRv/XZcPSFvwL/aUxUh4=; b=hvuJVlhUbWk56eDcODE5dwNiIYjJlheVSuisaPadN0mshUCuer2jQJLJatzv0cy+pDtDYM 2Vh2o5aVnqXZNZCp4s0y2MtNUA4o/HGR4uOjzikqgsqbQtISkJKDNIxX+Xgb4RRS2iuas9 Qm6FpzNUjMX3LjOgqWzvYmS67yzI0dE= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-185-bYCsTSCqNby-9SBEzU5ngA-1; Fri, 31 Mar 2023 12:10:36 -0400 X-MC-Unique: bYCsTSCqNby-9SBEzU5ngA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 986783803502; Fri, 31 Mar 2023 16:10:33 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 816BF4020C82; Fri, 31 Mar 2023 16:10:31 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu , linux-crypto@vger.kernel.org Subject: [PATCH v3 26/55] crypto: af_alg/hash: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:45 +0100 Message-Id: <20230331160914.1608208-27-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901872674731231?= X-GMAIL-MSGID: =?utf-8?q?1761901872674731231?= Make AF_ALG sendmsg() support MSG_SPLICE_PAGES in the hashing code. This causes pages to be spliced from the source iterator if possible. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. [!] Note that this makes use of netfs_extract_iter_to_sg() from netfslib. This probably needs moving to core code somewhere. Signed-off-by: David Howells cc: Herbert Xu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-crypto@vger.kernel.org cc: netdev@vger.kernel.org --- crypto/af_alg.c | 11 +++-- crypto/algif_hash.c | 99 ++++++++++++++++++++++++++++----------------- 2 files changed, 70 insertions(+), 40 deletions(-) diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 7fe8c8db6bb5..686610a4986f 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -543,9 +543,14 @@ void af_alg_free_sg(struct af_alg_sgl *sgl) { int i; - if (sgl->need_unpin) - for (i = 0; i < sgl->sgt.nents; i++) - unpin_user_page(sg_page(&sgl->sgt.sgl[i])); + if (sgl->sgt.sgl) { + if (sgl->need_unpin) + for (i = 0; i < sgl->sgt.nents; i++) + unpin_user_page(sg_page(&sgl->sgt.sgl[i])); + if (sgl->sgt.sgl != sgl->sgl) + kvfree(sgl->sgt.sgl); + sgl->sgt.sgl = NULL; + } } EXPORT_SYMBOL_GPL(af_alg_free_sg); diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c index f051fa624bd7..b89c2c50cecc 100644 --- a/crypto/algif_hash.c +++ b/crypto/algif_hash.c @@ -64,77 +64,102 @@ static void hash_free_result(struct sock *sk, struct hash_ctx *ctx) static int hash_sendmsg(struct socket *sock, struct msghdr *msg, size_t ignored) { - int limit = ALG_MAX_PAGES * PAGE_SIZE; struct sock *sk = sock->sk; struct alg_sock *ask = alg_sk(sk); struct hash_ctx *ctx = ask->private; - long copied = 0; + ssize_t copied = 0; + size_t len, max_pages = ALG_MAX_PAGES, npages; + bool continuing = ctx->more, need_init = false; int err; - if (limit > sk->sk_sndbuf) - limit = sk->sk_sndbuf; + /* Don't limit to ALG_MAX_PAGES if the pages are all already pinned. */ + if (!user_backed_iter(&msg->msg_iter)) + max_pages = INT_MAX; + else + max_pages = min_t(size_t, max_pages, + DIV_ROUND_UP(sk->sk_sndbuf, PAGE_SIZE)); lock_sock(sk); - if (!ctx->more) { + if (!continuing) { if ((msg->msg_flags & MSG_MORE)) hash_free_result(sk, ctx); - - err = crypto_wait_req(crypto_ahash_init(&ctx->req), &ctx->wait); - if (err) - goto unlock; + need_init = true; } ctx->more = false; while (msg_data_left(msg)) { - int len = msg_data_left(msg); - - if (len > limit) - len = limit; - ctx->sgl.sgt.sgl = ctx->sgl.sgl; ctx->sgl.sgt.nents = 0; ctx->sgl.sgt.orig_nents = 0; - len = netfs_extract_iter_to_sg(&msg->msg_iter, len, - &ctx->sgl.sgt, ALG_MAX_PAGES, 0); - if (len < 0) { - err = copied ? 0 : len; - goto unlock; + err = -EIO; + npages = iov_iter_npages(&msg->msg_iter, max_pages); + if (npages == 0) + goto unlock_free; + + if (npages > ARRAY_SIZE(ctx->sgl.sgl)) { + err = -ENOMEM; + ctx->sgl.sgt.sgl = + kvmalloc(array_size(npages, sizeof(*ctx->sgl.sgt.sgl)), + GFP_KERNEL); + if (!ctx->sgl.sgt.sgl) + goto unlock_free; } + sg_init_table(ctx->sgl.sgl, npages); ctx->sgl.need_unpin = iov_iter_extract_will_pin(&msg->msg_iter); - ahash_request_set_crypt(&ctx->req, ctx->sgl.sgt.sgl, NULL, len); + err = netfs_extract_iter_to_sg(&msg->msg_iter, LONG_MAX, + &ctx->sgl.sgt, npages, 0); + if (err < 0) + goto unlock_free; + len = err; + sg_mark_end(ctx->sgl.sgt.sgl + ctx->sgl.sgt.nents - 1); - err = crypto_wait_req(crypto_ahash_update(&ctx->req), - &ctx->wait); - af_alg_free_sg(&ctx->sgl); - if (err) { - iov_iter_revert(&msg->msg_iter, len); - goto unlock; + if (!msg_data_left(msg)) { + err = hash_alloc_result(sk, ctx); + if (err) + goto unlock_free; } - copied += len; - } + ahash_request_set_crypt(&ctx->req, ctx->sgl.sgt.sgl, ctx->result, len); - err = 0; + if (!msg_data_left(msg) && !continuing && !(msg->msg_flags & MSG_MORE)) { + err = crypto_ahash_digest(&ctx->req); + } else { + if (need_init) { + err = crypto_wait_req(crypto_ahash_init(&ctx->req), + &ctx->wait); + if (err) + goto unlock_free; + need_init = false; + } + + if (msg_data_left(msg) || (msg->msg_flags & MSG_MORE)) + err = crypto_ahash_update(&ctx->req); + else + err = crypto_ahash_finup(&ctx->req); + continuing = true; + } - ctx->more = msg->msg_flags & MSG_MORE; - if (!ctx->more) { - err = hash_alloc_result(sk, ctx); + err = crypto_wait_req(err, &ctx->wait); if (err) - goto unlock; + goto unlock_free; - ahash_request_set_crypt(&ctx->req, NULL, ctx->result, 0); - err = crypto_wait_req(crypto_ahash_final(&ctx->req), - &ctx->wait); + copied += len; + af_alg_free_sg(&ctx->sgl); } + ctx->more = msg->msg_flags & MSG_MORE; + err = 0; unlock: release_sock(sk); + return copied ?: err; - return err ?: copied; +unlock_free: + af_alg_free_sg(&ctx->sgl); + goto unlock; } static ssize_t hash_sendpage(struct socket *sock, struct page *page, From patchwork Fri Mar 31 16:08:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77873 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692391vqo; Fri, 31 Mar 2023 09:36:29 -0700 (PDT) X-Google-Smtp-Source: AKy350b0tceowbC0PgoNB+qo0Xzknz36iw9Cwh2cCr6uWF9TzxgaV9PmcjI8JboUvpVe52rSjD6+ X-Received: by 2002:a05:6402:cc:b0:500:5627:a20b with SMTP id i12-20020a05640200cc00b005005627a20bmr25554994edu.1.1680280589124; Fri, 31 Mar 2023 09:36:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280589; cv=none; d=google.com; s=arc-20160816; b=gBpaXZtycsBi3iwX2C8BEJ+ADcgxczv6WRR2vGw5GRKOydmntm2iId0vPXtTuMovPZ zCoet3ot9M83HwWna3nUc1bEmzSJQjLieP3w5XSopmj4CAoGkmgYiNw0xOAPDnUjYGOu 2WAyZCUGiRNGBL1y4NmenUHs7g9L+hOG7Ifzs1bKY5KcTk50i3gh7nV4/jrxcwrJHjk/ mELqP2pxGMPHR6B+LA8DqOw/JPVd8Q4Hp+JCpjYpYMeAObFax2de3epgulGmnJlzdF+s yOZLzdEtSu4waM92O+Tiqj3TYjhep8HyYFb90Fc/79fgS7qgrNVUJ4bTpy0yYWUmI6Cp AX8Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=zbMUDKrdVAVwS+N9r9mW+fACtMcx+N9uUpcmjLnA2dk=; b=wK6oHDgjl9AUnbzPHqAA6wj0RZkPoTLmtvkLkz0iYDm5FPMyuDdznh1nhHWzWWqtaq oXNLG6KwGRja6gAXAtnHEhp+zsubES8wj61VukV/SFbSFMMX1S9Be4rz4av0Ckk/TkcQ 5GsQ2C+oqzlgFfimDT3Gy/eu25wqWu6xoqNhRmfM+GuY8hl1uT/xWZPKiG6vHhCbrDkw PFoNnaE8p75Y1PMoBv7YEnvMvc4l07Vfe7OH8F29RzcfxFQATECVJs3zrNKCxgX6zy8W ISB7u1PnA8mFPPfwKUZDbOPd54F7zPep1mtV9jKxgpswORtoOcOPEdzvmhVc/InAbYeN ynWg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="b/WdsDLY"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h5-20020a50ed85000000b004fe1b624748si80008edr.354.2023.03.31.09.36.03; Fri, 31 Mar 2023 09:36:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="b/WdsDLY"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233092AbjCaQPY (ORCPT + 99 others); Fri, 31 Mar 2023 12:15:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33558 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232891AbjCaQNt (ORCPT ); Fri, 31 Mar 2023 12:13:49 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 207BE24AD1 for ; Fri, 31 Mar 2023 09:10:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279042; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zbMUDKrdVAVwS+N9r9mW+fACtMcx+N9uUpcmjLnA2dk=; b=b/WdsDLYORxq24qusI2xnKsKvbS+E3NWC6906+g01CJuco0LJowsRZQwzXmK7A2HwlnYvn eSL94QCWzav4oT1XC3dSP0+4GrmZYIx6bfvnH9lUpLZ7lNocm/Xcef/HTV9gId24cRBbrp kev/sxTiVZe0U0UTXmxchBD39bNwa+E= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-398-2LwuMPeSO5ywsd8zwlaphQ-1; Fri, 31 Mar 2023 12:10:38 -0400 X-MC-Unique: 2LwuMPeSO5ywsd8zwlaphQ-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 48CE73802B94; Fri, 31 Mar 2023 16:10:36 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 35BAC492C3E; Fri, 31 Mar 2023 16:10:34 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Boris Pismenny , John Fastabend Subject: [PATCH v3 27/55] tls/device: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:46 +0100 Message-Id: <20230331160914.1608208-28-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901899166920782?= X-GMAIL-MSGID: =?utf-8?q?1761901899166920782?= Make TLS's device sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible and copied the data if not. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Chuck Lever cc: Boris Pismenny cc: John Fastabend cc: Jakub Kicinski cc: Eric Dumazet cc: "David S. Miller" cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/tls/tls_device.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index 6c593788dc25..f5c3b56ac1ce 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -508,7 +508,30 @@ static int tls_push_data(struct sock *sk, zc_pfrag.offset = iter_offset.offset; zc_pfrag.size = copy; tls_append_frag(record, &zc_pfrag, copy); + } else if (copy && (flags & MSG_SPLICE_PAGES)) { + struct page_frag zc_pfrag; + struct page **pages = &zc_pfrag.page; + size_t off; + + rc = iov_iter_extract_pages(iter_offset.msg_iter, &pages, + copy, 1, 0, &off); + if (rc <= 0) { + if (rc == 0) + rc = -EIO; + goto handle_error; + } + copy = rc; + + if (!sendpage_ok(zc_pfrag.page)) { + iov_iter_revert(iter_offset.msg_iter, copy); + goto no_zcopy_this_page; + } + + zc_pfrag.offset = off; + zc_pfrag.size = copy; + tls_append_frag(record, &zc_pfrag, copy); } else if (copy) { +no_zcopy_this_page: copy = min_t(size_t, copy, pfrag->size - pfrag->offset); rc = tls_device_copy_data(page_address(pfrag->page) + @@ -571,6 +594,9 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) union tls_iter_offset iter; int rc; + if (!tls_ctx->zerocopy_sendfile) + msg->msg_flags &= ~MSG_SPLICE_PAGES; + mutex_lock(&tls_ctx->tx_lock); lock_sock(sk); From patchwork Fri Mar 31 16:08:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77856 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp690610vqo; Fri, 31 Mar 2023 09:33:41 -0700 (PDT) X-Google-Smtp-Source: AKy350ZXLKd+oWC2StyhBYHPhf8ntMUPsXdXI6g5acFj5ZD+5EnH5wkukYxQjZFdRX361svOJdQs X-Received: by 2002:a17:902:dace:b0:1a1:d1be:91fe with SMTP id q14-20020a170902dace00b001a1d1be91femr32274510plx.0.1680280421421; Fri, 31 Mar 2023 09:33:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280421; cv=none; d=google.com; s=arc-20160816; b=i3txfu9WLOnhk4pS4HF31w2TL/SXnSkVKNcMywR2r5kvwPZ8u1WoQPmKr6XDr79jWj paghKtLNnZ/vPtHPZNghdrnfF1QCl44lnzoosdz54D1STc9Nd6r5i4rM+mpLH9T+lX1t cBIOtZ/D60OlawKhHEUPHAGFzpSD3RHVa3jQsejvh0izW09f4A7kyoc4JW9liBkQtKVI u+NeJflsLk4XqEELq93w43jmkUVGGjXYvDOWm6Ig+VbQdaEBtDUjIKEbqjcSDaRd7QP5 d6NLB0FWjhgosZRFHERves2zNu/muKznadMyMho3O9RDnX7NOVqSgaWhk+lBbxRzdAVW LJiw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=nNsV/6c3ho01ZHWXYC82bpiErYX0eKAApSJsSVNmIO8=; b=mjE8T1SQ9MRPcDLqITs9BufrtX7SBD6Srx0VUzb0v+A1c8ZT8OjAU0rVu2ZCRCGIY0 sNibe4vSTLUPF5lDhniwYIP4XXik0gHYSriHiJKu9v/zIqESkfc5/kjoKtzT/krm8G1f d0v5prdWsLZ8oShMSNMCyOS2Wuiwx2/D65zBnL3SQUYCEFQO84TK3x/SjTrzdpY40q5t IlFPYUBaJblqAlagRDPV3rf8bhU1lOm6TPK+mV7Bdyn8+pr826Hlaxf+lHnQN0/xLtw8 Y4fGEM2N+jXUtfgRcgOA6YbK9R1L1QBPCKjihs7S8taB/tzgk5ki7COMEaSHaRic3Bqh cX/Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="FcPDiP/O"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z19-20020a63e113000000b00513234112b3si2743380pgh.895.2023.03.31.09.33.29; Fri, 31 Mar 2023 09:33:41 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="FcPDiP/O"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233108AbjCaQP2 (ORCPT + 99 others); Fri, 31 Mar 2023 12:15:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35362 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233088AbjCaQOA (ORCPT ); Fri, 31 Mar 2023 12:14:00 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6C75524AEB for ; Fri, 31 Mar 2023 09:10:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279044; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nNsV/6c3ho01ZHWXYC82bpiErYX0eKAApSJsSVNmIO8=; b=FcPDiP/Oetf3GAszhKBIHcJxBJ8Zd6gUExNMx6B7FOHsr1/iB/vsYC/aIKGNAREWzwpBbY tQ0DjW4tuOY+7XBMwDYWL63JB4tmLl+eb9JzoESiHqnBQj7M4cZwB+xcWcGhpss7Ewf+GA hQsW0S8lKv8Qe/7gc5GZ/DI5BzdLXWE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-138-CQ22tLM2M2iPd2GX0NVahQ-1; Fri, 31 Mar 2023 12:10:40 -0400 X-MC-Unique: CQ22tLM2M2iPd2GX0NVahQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 19982101A54F; Fri, 31 Mar 2023 16:10:39 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 08B612166B33; Fri, 31 Mar 2023 16:10:36 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Boris Pismenny , John Fastabend Subject: [PATCH v3 28/55] tls/device: Convert tls_device_sendpage() to use MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:47 +0100 Message-Id: <20230331160914.1608208-29-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901723258708487?= X-GMAIL-MSGID: =?utf-8?q?1761901723258708487?= Convert tls_device_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. With that, the tls_iter_offset union is no longer necessary and can be replaced with an iov_iter pointer and the zc_page argument to tls_push_data() can also be removed. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Chuck Lever cc: Boris Pismenny cc: John Fastabend cc: Jakub Kicinski cc: Eric Dumazet cc: "David S. Miller" cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/tls/tls_device.c | 79 ++++++++++---------------------------------- 1 file changed, 18 insertions(+), 61 deletions(-) diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index f5c3b56ac1ce..6cfd1577a212 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -424,16 +424,10 @@ static int tls_device_copy_data(void *addr, size_t bytes, struct iov_iter *i) return 0; } -union tls_iter_offset { - struct iov_iter *msg_iter; - int offset; -}; - static int tls_push_data(struct sock *sk, - union tls_iter_offset iter_offset, + struct iov_iter *iter, size_t size, int flags, - unsigned char record_type, - struct page *zc_page) + unsigned char record_type) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_prot_info *prot = &tls_ctx->prot_info; @@ -501,19 +495,12 @@ static int tls_push_data(struct sock *sk, record = ctx->open_record; copy = min_t(size_t, size, max_open_record_len - record->len); - if (copy && zc_page) { - struct page_frag zc_pfrag; - - zc_pfrag.page = zc_page; - zc_pfrag.offset = iter_offset.offset; - zc_pfrag.size = copy; - tls_append_frag(record, &zc_pfrag, copy); - } else if (copy && (flags & MSG_SPLICE_PAGES)) { + if (copy && (flags & MSG_SPLICE_PAGES)) { struct page_frag zc_pfrag; struct page **pages = &zc_pfrag.page; size_t off; - rc = iov_iter_extract_pages(iter_offset.msg_iter, &pages, + rc = iov_iter_extract_pages(iter, &pages, copy, 1, 0, &off); if (rc <= 0) { if (rc == 0) @@ -523,7 +510,7 @@ static int tls_push_data(struct sock *sk, copy = rc; if (!sendpage_ok(zc_pfrag.page)) { - iov_iter_revert(iter_offset.msg_iter, copy); + iov_iter_revert(iter, copy); goto no_zcopy_this_page; } @@ -536,7 +523,7 @@ static int tls_push_data(struct sock *sk, rc = tls_device_copy_data(page_address(pfrag->page) + pfrag->offset, copy, - iter_offset.msg_iter); + iter); if (rc) goto handle_error; tls_append_frag(record, pfrag, copy); @@ -591,7 +578,6 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) { unsigned char record_type = TLS_RECORD_TYPE_DATA; struct tls_context *tls_ctx = tls_get_ctx(sk); - union tls_iter_offset iter; int rc; if (!tls_ctx->zerocopy_sendfile) @@ -606,8 +592,7 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) goto out; } - iter.msg_iter = &msg->msg_iter; - rc = tls_push_data(sk, iter, size, msg->msg_flags, record_type, NULL); + rc = tls_push_data(sk, &msg->msg_iter, size, msg->msg_flags, record_type); out: release_sock(sk); @@ -618,44 +603,18 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) int tls_device_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct tls_context *tls_ctx = tls_get_ctx(sk); - union tls_iter_offset iter_offset; - struct iov_iter msg_iter; - char *kaddr; - struct kvec iov; - int rc; + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; if (flags & MSG_SENDPAGE_NOTLAST) - flags |= MSG_MORE; - - mutex_lock(&tls_ctx->tx_lock); - lock_sock(sk); + msg.msg_flags |= MSG_MORE; - if (flags & MSG_OOB) { - rc = -EOPNOTSUPP; - goto out; - } - - if (tls_ctx->zerocopy_sendfile) { - iter_offset.offset = offset; - rc = tls_push_data(sk, iter_offset, size, - flags, TLS_RECORD_TYPE_DATA, page); - goto out; - } - - kaddr = kmap(page); - iov.iov_base = kaddr + offset; - iov.iov_len = size; - iov_iter_kvec(&msg_iter, ITER_SOURCE, &iov, 1, size); - iter_offset.msg_iter = &msg_iter; - rc = tls_push_data(sk, iter_offset, size, flags, TLS_RECORD_TYPE_DATA, - NULL); - kunmap(page); + if (flags & MSG_OOB) + return -EOPNOTSUPP; -out: - release_sock(sk); - mutex_unlock(&tls_ctx->tx_lock); - return rc; + bvec_set_page(&bvec, page, offset, size); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + return tls_device_sendmsg(sk, &msg, size); } struct tls_record_info *tls_get_record(struct tls_offload_context_tx *context, @@ -720,12 +679,10 @@ EXPORT_SYMBOL(tls_get_record); static int tls_device_push_pending_record(struct sock *sk, int flags) { - union tls_iter_offset iter; - struct iov_iter msg_iter; + struct iov_iter iter; - iov_iter_kvec(&msg_iter, ITER_SOURCE, NULL, 0, 0); - iter.msg_iter = &msg_iter; - return tls_push_data(sk, iter, 0, flags, TLS_RECORD_TYPE_DATA, NULL); + iov_iter_kvec(&iter, ITER_SOURCE, NULL, 0, 0); + return tls_push_data(sk, &iter, 0, flags, TLS_RECORD_TYPE_DATA); } void tls_device_write_space(struct sock *sk, struct tls_context *ctx) From patchwork Fri Mar 31 16:08:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77848 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp686810vqo; Fri, 31 Mar 2023 09:28:23 -0700 (PDT) X-Google-Smtp-Source: AKy350aSr4eCCFqZpJ5fyOokib1agATaHd6OCrmseNmG1Vr9R0E1BJGQiIYUMkkfLJlnFpKyPWCy X-Received: by 2002:a17:906:3d72:b0:946:c09a:4262 with SMTP id r18-20020a1709063d7200b00946c09a4262mr9518369ejf.29.1680280103500; Fri, 31 Mar 2023 09:28:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280103; cv=none; d=google.com; s=arc-20160816; b=j8H4hgClBzxOB7PDNQkkoWJHD2AMyxyBuIZ+mf1/85jWQtNsGMURTo6ubc+J89H5TK kzKsdHRE/hyFvBLWm9KFj0Uyk8mjkRRF4KjXc+d/LdxQsSLSuEdfuI8aqoeSFZuJEktp blBKAKzMnGv3tVzQYYvhFn0Hg18avL+v+bJXl6O//lmBTS/8FLm7MG+Y1OLsoyc5g/jH 3qnRc8qaVRBZMCxVeTKNlaqKggOsVa6t33/05h8hacZHAr3FQEv7Segwvl81lRVTWXVN Cvf+xKhbRfxgZcDp403Zu/GPQtlW8u2xB0a+JRAw+4KNexz1Sy6GSRdMGQzxbNO+POWW +TJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=NZJQb7iAop4LtoT19/C7V+Vk0vTB0ScTwxSGzzmAlmQ=; b=KFGu2ZLK2aE+u7TEIrOgZ+lyT1IDy6eJLWAYNlSSeLbM3x8RJjURTHOAGyh7IxZ/kS LaXlGV/wJ8w/JRetCW9Ah/y6kui/4qfyHiQN3iJLHioSwjsQ+WEwlHjOGZV2GA4/Pdal 40FAeIgEFPXPrN6rVBd9P3Xxp3iPiie87yEzmhHorjBmO9L4jaW4KBwa6rXIuAkSkuBO PK20rq+AHS9RI4mBoL4LLkZjB9AW2ZJEKkCJtuh6qqc2jHxZxz/obP+NTrN4UGh5xq3q FY9K2KxB+Lqns38QiuY3wFYvkIny2BYvBSrNUB9ZbHW7J97oJuQCaVoOsC2nsWeJL9xf ksEA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=OmrG1vgt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l8-20020a056402124800b004cdc92cc412si2416346edw.69.2023.03.31.09.27.58; Fri, 31 Mar 2023 09:28:23 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=OmrG1vgt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233400AbjCaQPp (ORCPT + 99 others); Fri, 31 Mar 2023 12:15:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33578 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231215AbjCaQO2 (ORCPT ); Fri, 31 Mar 2023 12:14:28 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A11C22E8F for ; Fri, 31 Mar 2023 09:10:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279045; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NZJQb7iAop4LtoT19/C7V+Vk0vTB0ScTwxSGzzmAlmQ=; b=OmrG1vgtBwF7BqMP3uI7P4DUYVJoR6NeLlvkMZTdUVN13dRavFgxi8f3nF55oxN2kufpP7 zN6bDeWxqMyR3XAByDN0CrqXUYFn78L9l2zqNe0US96Di0XNnaeLiwXUHGvEgLAlYHR7Br hb19Vilctkp52svxXkvSl1t+/J+7A5c= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-380-s0-9Da1IPNaVGNWY3Pu9-w-1; Fri, 31 Mar 2023 12:10:42 -0400 X-MC-Unique: s0-9Da1IPNaVGNWY3Pu9-w-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B8BF029ABA1D; Fri, 31 Mar 2023 16:10:41 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id A7E464020C82; Fri, 31 Mar 2023 16:10:39 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Boris Pismenny , John Fastabend Subject: [PATCH v3 29/55] tls/sw: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:48 +0100 Message-Id: <20230331160914.1608208-30-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901389852888957?= X-GMAIL-MSGID: =?utf-8?q?1761901389852888957?= Make TLS's sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible and copied the data if not. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Chuck Lever cc: Boris Pismenny cc: John Fastabend cc: Jakub Kicinski cc: Eric Dumazet cc: "David S. Miller" cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/tls/tls_sw.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 56 insertions(+), 1 deletion(-) diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 782d3701b86f..ce0c289e68ca 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -929,6 +929,49 @@ static int tls_sw_push_pending_record(struct sock *sk, int flags) &copied, flags); } +static int rls_sw_sendmsg_splice(struct sock *sk, struct msghdr *msg, + struct sk_msg *msg_pl, size_t try_to_copy, + ssize_t *copied) +{ + struct page *page, **pages = &page; + + do { + ssize_t part; + size_t off; + bool put = false; + + part = iov_iter_extract_pages(&msg->msg_iter, &pages, + try_to_copy, 1, 0, &off); + if (part <= 0) + return part ?: -EIO; + + if (!sendpage_ok(page)) { + const void *p = kmap_local_page(page); + void *q; + + q = page_frag_memdup(NULL, p + off, part, + sk->sk_allocation, ULONG_MAX); + kunmap_local(p); + if (!q) { + iov_iter_revert(&msg->msg_iter, part); + return -ENOMEM; + } + page = virt_to_page(q); + off = offset_in_page(q); + put = true; + } + + sk_msg_page_add(msg_pl, page, part, off); + sk_mem_charge(sk, part); + if (put) + put_page(page); + *copied += part; + try_to_copy -= part; + } while (try_to_copy && !sk_msg_full(msg_pl)); + + return 0; +} + int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) { long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT); @@ -1016,6 +1059,17 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) full_record = true; } + if (try_to_copy && (msg->msg_flags & MSG_SPLICE_PAGES)) { + ret = rls_sw_sendmsg_splice(sk, msg, msg_pl, + try_to_copy, &copied); + if (ret < 0) + goto send_end; + tls_ctx->pending_open_record_frags = true; + if (full_record || eor || sk_msg_full(msg_pl)) + goto copied; + continue; + } + if (!is_kvec && (full_record || eor) && !async_capable) { u32 first = msg_pl->sg.end; @@ -1078,8 +1132,9 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) /* Open records defined only if successfully copied, otherwise * we would trim the sg but not reset the open record frags. */ - tls_ctx->pending_open_record_frags = true; copied += try_to_copy; +copied: + tls_ctx->pending_open_record_frags = true; if (full_record || eor) { ret = bpf_exec_tx_verdict(msg_pl, sk, full_record, record_type, &copied, From patchwork Fri Mar 31 16:08:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77878 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692989vqo; Fri, 31 Mar 2023 09:37:33 -0700 (PDT) X-Google-Smtp-Source: AKy350bBhqEpwlZhabllhQIqucpVzY4o6QnBdjMKP7r68/9+qmh5A11lSOZkdbK4VGkDo9apeSjH X-Received: by 2002:a17:906:395:b0:93b:6da8:539a with SMTP id b21-20020a170906039500b0093b6da8539amr26937674eja.18.1680280653501; Fri, 31 Mar 2023 09:37:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280653; cv=none; d=google.com; s=arc-20160816; b=hVLxQ/aWRrjIVrXLLRXFkDe1LGjbzKEAhuTio8/nP3mAzfyswEkPkplxUDNo5+wsqX AUlzg6V+8YBJ8WcBPg3PoQ5qYWuxs55OwgkIpmVo5TkZopFawrSqF1RQEXhrWdyel6T8 lOD5SVlCRgXpJTCYQx5FYb+kv+dWpOl/ix+qqa9TvSqZV9Go4Qb1pcMyVodbD21PsCq3 AQFlM0ocF9eGF4DJokicte6naWvPlhg26b9LsQJ+fhZwU0IKn+qJgKrOMfSwPVPiJ5Mk gYbsiYsqIT7cTLUoqFT6UF7Zmr4yLz683V+0wQOYuG2Mqw2CGOfdQfX+dl9PyfPrgWOw RDBQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=qgrsaUecWqqjS48xBO3DK0ifte08BTz2ZXAzew8KrV0=; b=RFZ0eGqTL9caRWtasEqn5SerIudVf23Mf/83emFV2kgrVZYbmzLrc/C5F3HS8zXlG2 +l204iaGTZzcACJNymQWi15S9zumMhsir3F0h8IbK/nHpqJnl6EVE44PvykhbfcBsJS8 +wfbIaCkX0n2RSpkTgfSsKSdFsZelUm1pZ+z/Druxxg8HprW3q76XIHyX74OCTsS2Kn4 mw589jzh0RpYQjO/w4WdtG6wF9RKqMpG04OgCmYbmazzjge8f3WzrlgY38QBEwzoW/L1 JsJbYvOXBfy2+c42iOmtp8LoQDrTQ0+rRcRB44La5HsN9UaaXskzJZXJeiks6jtCvUNN zlIg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=PpT6uB4r; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id sa7-20020a170906eda700b00930679e9d4fsi2322329ejb.751.2023.03.31.09.37.09; Fri, 31 Mar 2023 09:37:33 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=PpT6uB4r; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233234AbjCaQPr (ORCPT + 99 others); Fri, 31 Mar 2023 12:15:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33054 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231616AbjCaQOr (ORCPT ); Fri, 31 Mar 2023 12:14:47 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1EB4624AF9 for ; Fri, 31 Mar 2023 09:10:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279048; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qgrsaUecWqqjS48xBO3DK0ifte08BTz2ZXAzew8KrV0=; b=PpT6uB4rrsLiOc0H30Q9epGA2w4n3TB/y4+iu6bn/R9/YbicvhgLyEZh1Ej3a3SyEN6MT2 s6mVZqwLwmPaU3CZFDDEqizGGSXDrxlIVEWJJ+oFoTZDyCv9sqf+YQ9HEtJl1ujrTfyX+/ IqwblgPCvCT03uz3tCSQ1G+CAadQpLA= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-146-u0t01DzwNZa0rrcf5yQq9A-1; Fri, 31 Mar 2023 12:10:46 -0400 X-MC-Unique: u0t01DzwNZa0rrcf5yQq9A-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 875E28028AD; Fri, 31 Mar 2023 16:10:44 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 745581415139; Fri, 31 Mar 2023 16:10:42 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Boris Pismenny , John Fastabend Subject: [PATCH v3 30/55] tls/sw: Convert tls_sw_sendpage() to use MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:49 +0100 Message-Id: <20230331160914.1608208-31-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901966800081731?= X-GMAIL-MSGID: =?utf-8?q?1761901966800081731?= Convert tls_sw_sendpage() and tls_sw_sendpage_locked() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. [!] Note that tls_sw_sendpage_locked() appears to have the wrong locking upstream. I think the caller will only hold the socket lock, but it should hold tls_ctx->tx_lock too. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Chuck Lever cc: Boris Pismenny cc: John Fastabend cc: Jakub Kicinski cc: Eric Dumazet cc: "David S. Miller" cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/tls/tls_sw.c | 158 +++++++++-------------------------------------- 1 file changed, 28 insertions(+), 130 deletions(-) diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index ce0c289e68ca..256824fca651 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -972,7 +972,7 @@ static int rls_sw_sendmsg_splice(struct sock *sk, struct msghdr *msg, return 0; } -int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) +static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) { long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT); struct tls_context *tls_ctx = tls_get_ctx(sk); @@ -995,13 +995,6 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) int ret = 0; int pending; - if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | - MSG_CMSG_COMPAT)) - return -EOPNOTSUPP; - - mutex_lock(&tls_ctx->tx_lock); - lock_sock(sk); - if (unlikely(msg->msg_controllen)) { ret = tls_process_cmsg(sk, msg, &record_type); if (ret) { @@ -1202,155 +1195,60 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) send_end: ret = sk_stream_error(sk, msg->msg_flags, ret); - - release_sock(sk); - mutex_unlock(&tls_ctx->tx_lock); return copied > 0 ? copied : ret; } -static int tls_sw_do_sendpage(struct sock *sk, struct page *page, - int offset, size_t size, int flags) +int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) { - long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); struct tls_context *tls_ctx = tls_get_ctx(sk); - struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); - struct tls_prot_info *prot = &tls_ctx->prot_info; - unsigned char record_type = TLS_RECORD_TYPE_DATA; - struct sk_msg *msg_pl; - struct tls_rec *rec; - int num_async = 0; - ssize_t copied = 0; - bool full_record; - int record_room; - int ret = 0; - bool eor; - - eor = !(flags & MSG_SENDPAGE_NOTLAST); - sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk); - - /* Call the sk_stream functions to manage the sndbuf mem. */ - while (size > 0) { - size_t copy, required_size; - - if (sk->sk_err) { - ret = -sk->sk_err; - goto sendpage_end; - } - - if (ctx->open_rec) - rec = ctx->open_rec; - else - rec = ctx->open_rec = tls_get_rec(sk); - if (!rec) { - ret = -ENOMEM; - goto sendpage_end; - } - - msg_pl = &rec->msg_plaintext; - - full_record = false; - record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size; - copy = size; - if (copy >= record_room) { - copy = record_room; - full_record = true; - } - - required_size = msg_pl->sg.size + copy + prot->overhead_size; - - if (!sk_stream_memory_free(sk)) - goto wait_for_sndbuf; -alloc_payload: - ret = tls_alloc_encrypted_msg(sk, required_size); - if (ret) { - if (ret != -ENOSPC) - goto wait_for_memory; - - /* Adjust copy according to the amount that was - * actually allocated. The difference is due - * to max sg elements limit - */ - copy -= required_size - msg_pl->sg.size; - full_record = true; - } - - sk_msg_page_add(msg_pl, page, copy, offset); - sk_mem_charge(sk, copy); - - offset += copy; - size -= copy; - copied += copy; - - tls_ctx->pending_open_record_frags = true; - if (full_record || eor || sk_msg_full(msg_pl)) { - ret = bpf_exec_tx_verdict(msg_pl, sk, full_record, - record_type, &copied, flags); - if (ret) { - if (ret == -EINPROGRESS) - num_async++; - else if (ret == -ENOMEM) - goto wait_for_memory; - else if (ret != -EAGAIN) { - if (ret == -ENOSPC) - ret = 0; - goto sendpage_end; - } - } - } - continue; -wait_for_sndbuf: - set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); -wait_for_memory: - ret = sk_stream_wait_memory(sk, &timeo); - if (ret) { - if (ctx->open_rec) - tls_trim_both_msgs(sk, msg_pl->sg.size); - goto sendpage_end; - } + int ret; - if (ctx->open_rec) - goto alloc_payload; - } + if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | + MSG_CMSG_COMPAT | MSG_SPLICE_PAGES | + MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY)) + return -EOPNOTSUPP; - if (num_async) { - /* Transmit if any encryptions have completed */ - if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) { - cancel_delayed_work(&ctx->tx_work.work); - tls_tx_records(sk, flags); - } - } -sendpage_end: - ret = sk_stream_error(sk, flags, ret); - return copied > 0 ? copied : ret; + mutex_lock(&tls_ctx->tx_lock); + lock_sock(sk); + ret = tls_sw_sendmsg_locked(sk, msg, size); + release_sock(sk); + mutex_unlock(&tls_ctx->tx_lock); + return ret; } int tls_sw_sendpage_locked(struct sock *sk, struct page *page, int offset, size_t size, int flags) { + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; + if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY | MSG_NO_SHARED_FRAGS)) return -EOPNOTSUPP; + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; - return tls_sw_do_sendpage(sk, page, offset, size, flags); + bvec_set_page(&bvec, page, offset, size); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + return tls_sw_sendmsg_locked(sk, &msg, size); } int tls_sw_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct tls_context *tls_ctx = tls_get_ctx(sk); - int ret; + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY)) return -EOPNOTSUPP; + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; - mutex_lock(&tls_ctx->tx_lock); - lock_sock(sk); - ret = tls_sw_do_sendpage(sk, page, offset, size, flags); - release_sock(sk); - mutex_unlock(&tls_ctx->tx_lock); - return ret; + bvec_set_page(&bvec, page, offset, size); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + return tls_sw_sendmsg(sk, &msg, size); } static int From patchwork Fri Mar 31 16:08:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77849 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp687706vqo; Fri, 31 Mar 2023 09:29:47 -0700 (PDT) X-Google-Smtp-Source: AKy350bSKfQDsbvEi2iGvL87h3Lz8iHHSB/DNrfugYT/sXo08J/grFt2VgK2IgLC4Gjk7Ms+dbY+ X-Received: by 2002:a17:902:e84e:b0:19e:8076:9bd2 with SMTP id t14-20020a170902e84e00b0019e80769bd2mr32941893plg.17.1680280186997; Fri, 31 Mar 2023 09:29:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280186; cv=none; d=google.com; s=arc-20160816; b=PWO8bsRsD26Nq1PTMVnsn0F/43JDzyXlhpLXR47vto/qLT65lLEgxHRuHDDXQzWucy 7TjmVlSChiTn1xmLFwiJe9HZZmBbuJhJ77cXbhRYK1ZvZRTiLWLBuChUFtsgHoQytAxe RIPCWTqbukmHrxwau0r9+/63ZqXAonM/gJxdTL6FSyp/ycPMPcaAeVyyDhoqFDYMF8bd Tl0Q3zRnNxeuBKEIDXTdHOIBs+l+PsqtjjRTxjo/GSGdDI5aHcrHOfyQReB4EcjkHUZZ xATJpQBrDXC8ygi8RJj6EoeNoc5oDe++ZTXuDfHyGaQCUUXs90xAXXXV+yFdZpklei+k Ke4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=wy1bl7/wKEZT2hBAqmE7Plq7HNXSqd6J/GoEc2iata8=; b=Nv9ue5rQLbmNHwhKm0xIKAbfty8AEsDpMZIRcoyweEb846Xno1IZg1SvIJMBzOpSBN Kamdgc96HADeR9HmW2qjJYBI2OgtRWx9SlSQYGXcEW85Zask3rsYpclORNAiNn+bo//9 ksXM9dgjDkVHuwLVA+9487MoLfyfmUznRRwoLXzZdkE+NdLMfKuTLp5ety4lF+CKGINq HBhmswo0IIZ61r9zsC9M7msPMFvB2R9hvVjwmJyU3hXVphJ3lDRpYe7RvGDooGxmvmVf i/8FIy3BYItiGfHZtkrJHN5ut//8AJrkXc4t3hJ1ZcR4M7OaV1H9HpWuqXHT7epZRghi 9rsA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JA3ZD2pl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id n25-20020a635919000000b00502f0ddd50bsi2767179pgb.382.2023.03.31.09.29.34; Fri, 31 Mar 2023 09:29:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JA3ZD2pl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233112AbjCaQQU (ORCPT + 99 others); Fri, 31 Mar 2023 12:16:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33554 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233384AbjCaQPe (ORCPT ); Fri, 31 Mar 2023 12:15:34 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7C1DC20C3C for ; Fri, 31 Mar 2023 09:10:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279051; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wy1bl7/wKEZT2hBAqmE7Plq7HNXSqd6J/GoEc2iata8=; b=JA3ZD2plQ2Z/EZnGdembztG1+rF7E4nmEd/n6yK6VRnOSfRKvuScy47XMbnjeT/F1Kv5+X tRGyiFJM+EOdehCuBNEqrIUYlZfsOpWP673w/ZFA6ACZ4MgJU+T3rCGCnSXzQB3KeOj/sT WLyxot8H3vrpKmhglXHKfe7U2d9YaGQ= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-408-TWrdWqY9OgKCvKHcSyW2Nw-1; Fri, 31 Mar 2023 12:10:48 -0400 X-MC-Unique: TWrdWqY9OgKCvKHcSyW2Nw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1E33329ABA17; Fri, 31 Mar 2023 16:10:47 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 223762166B33; Fri, 31 Mar 2023 16:10:45 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ayush Sawal Subject: [PATCH v3 31/55] chelsio: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:50 +0100 Message-Id: <20230331160914.1608208-32-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901477354320737?= X-GMAIL-MSGID: =?utf-8?q?1761901477354320737?= Make Chelsio's TLS offload sendmsg() support MSG_SPLICE_PAGES, splicing in pages from the source iterator if possible and copying the data in otherwise. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Ayush Sawal cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- .../chelsio/inline_crypto/chtls/chtls_io.c | 60 ++++++++++++++++++- 1 file changed, 59 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c index ae6b17b96bf1..ca3daf5df95c 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c @@ -1092,7 +1092,65 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) if (copy > size) copy = size; - if (skb_tailroom(skb) > 0) { + if (msg->msg_flags & MSG_SPLICE_PAGES) { + struct page *page, **pages = &page; + ssize_t part; + size_t off, spliced = 0; + bool put = false; + int i; + + do { + i = skb_shinfo(skb)->nr_frags; + part = iov_iter_extract_pages(&msg->msg_iter, &pages, + copy - spliced, 1, 0, &off); + if (part <= 0) { + err = part ?: -EIO; + goto do_fault; + } + + if (!sendpage_ok(page)) { + const void *p = kmap_local_page(page); + void *q; + + q = page_frag_memdup(NULL, p + off, part, + sk->sk_allocation, ULONG_MAX); + kunmap_local(p); + if (!q) { + iov_iter_revert(&msg->msg_iter, part); + return -ENOMEM; + } + page = virt_to_page(q); + off = offset_in_page(q); + put = true; + } + + if (skb_can_coalesce(skb, i, page, off)) { + skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], part); + spliced += part; + if (put) + put_page(page); + } else if (i < MAX_SKB_FRAGS) { + if (!put) + get_page(page); + skb_fill_page_desc(skb, i, page, off, spliced); + spliced += part; + put = false; + } else { + if (put) + put_page(page); + if (!spliced) + goto new_buf; + break; + } + } while (spliced < copy); + + copy = spliced; + skb->len += copy; + skb->data_len += copy; + skb->truesize += copy; + sk->sk_wmem_queued += copy; + + } else if (skb_tailroom(skb) > 0) { copy = min(copy, skb_tailroom(skb)); if (is_tls_tx(csk)) copy = min_t(int, copy, csk->tlshws.txleft); From patchwork Fri Mar 31 16:08:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77882 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp693610vqo; Fri, 31 Mar 2023 09:38:38 -0700 (PDT) X-Google-Smtp-Source: AKy350Z4YnBhIFmFYtX628e45mVg1FlAFZaC6VBkD4SV69yYWS0g+mDln9HL9DL/aMxDs2kUtpGr X-Received: by 2002:aa7:cc04:0:b0:4fa:fcee:1727 with SMTP id q4-20020aa7cc04000000b004fafcee1727mr22974531edt.13.1680280718521; Fri, 31 Mar 2023 09:38:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280718; cv=none; d=google.com; s=arc-20160816; b=GBB8Vu58yzUCr/8rSZvuCmu7MSaDaxSGS+f6SZVjKmF0+KUneUqMuvnzEAZdPJKe1N 5vFWkbJ3pJRqQChJInZdAdGjFdCRqTOZtst3YwACQ+fmHrBkcqZeLDO8/HmM+2ZZgpa4 F3waETgmHhbnevhOUufQh0YivX+pNtL4Gq8O3HfWme08q6aqTSKzYBkJ8QEqM45BAgh2 Mp/efDZALR6Fjk5xbJ93Bzl4lYt+xYGcr1lRKU9FuuoW+QlqT+6D3gyRrhHcFk5eH17Q XcEzdy1KQWcwOt11kODjRzSvTCUkRgW/CnMWN4HVwFeI2+iUGzfUgQIImB9crb1vu+zA C8hw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=4EhmophDUNRsJifv6aZ6zofwzEVulzOQZ9lN1G4MZvA=; b=uSjQ+nQsuoPxtGq/Iw7xnjaTso1OGKYtZhqIT5RdiXkEt5/wvAs+PfZpo12bo9QMiz +Gp4BLtJbKot1XWoQ9WlIzwZJWJNYmxFhAtHKDG0WW1PEnjl+pTKrJvnWgig+4+Jcpuv otxeKq/ub1NpNR6dex606n14XYM6sj4TXkTyl+fze3LaNrzNdC0xO/PKKJ3tZlulsUhN Lr2ywAq6dTkytvOsBjrmyrrkPM4k/L4MzVWCEjgwcWBeGUerljq6xSb/+qKggc0YCZno EP/NycuhJ7pc2btXpXQoQYMiPnr8yDJUlv166WlIkesFf6x5M5gTDSVC8M5dQdIsy9XQ /g/A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=bmmSABIO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e17-20020aa7d7d1000000b0050223f6f6c3si2415545eds.108.2023.03.31.09.38.14; Fri, 31 Mar 2023 09:38:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=bmmSABIO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233416AbjCaQRK (ORCPT + 99 others); Fri, 31 Mar 2023 12:17:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233398AbjCaQQr (ORCPT ); Fri, 31 Mar 2023 12:16:47 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1795B21AB9 for ; Fri, 31 Mar 2023 09:11:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279058; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4EhmophDUNRsJifv6aZ6zofwzEVulzOQZ9lN1G4MZvA=; b=bmmSABIOUdPqsSZ9BXzJCJCh87f7+6yrZRJvu1AjaK4Yro9nu7x3pUxpO/Tbtw9+9yBvcX Y9ctleZhpswS1v7mfwKtlqxXK2djTHEb8CmAu6kJET12sp9oyWxs4M/YN7M4fGufRuAKkH yTI8C1I2Bi18F7b2oxeGZrYTJXkAxTU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-377-HVkLcFjMPH6uSNSWjZfLnw-1; Fri, 31 Mar 2023 12:10:51 -0400 X-MC-Unique: HVkLcFjMPH6uSNSWjZfLnw-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C56DE185A791; Fri, 31 Mar 2023 16:10:49 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id CBA5914171B6; Fri, 31 Mar 2023 16:10:47 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ayush Sawal Subject: [PATCH v3 32/55] chelsio: Convert chtls_sendpage() to use MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:51 +0100 Message-Id: <20230331160914.1608208-33-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761902034821806814?= X-GMAIL-MSGID: =?utf-8?q?1761902034821806814?= Convert chtls_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Ayush Sawal cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- .../chelsio/inline_crypto/chtls/chtls_io.c | 109 ++---------------- 1 file changed, 7 insertions(+), 102 deletions(-) diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c index ca3daf5df95c..5c397cb57300 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c @@ -1288,110 +1288,15 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) int chtls_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct chtls_sock *csk; - struct chtls_dev *cdev; - int mss, err, copied; - struct tcp_sock *tp; - long timeo; - - tp = tcp_sk(sk); - copied = 0; - csk = rcu_dereference_sk_user_data(sk); - cdev = csk->cdev; - lock_sock(sk); - timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - err = sk_stream_wait_connect(sk, &timeo); - if (!sk_in_state(sk, TCPF_ESTABLISHED | TCPF_CLOSE_WAIT) && - err != 0) - goto out_err; - - mss = csk->mss; - csk_set_flag(csk, CSK_TX_MORE_DATA); - - while (size > 0) { - struct sk_buff *skb = skb_peek_tail(&csk->txq); - int copy, i; - - if (!skb || (ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND) || - (copy = mss - skb->len) <= 0) { -new_buf: - if (!csk_mem_free(cdev, sk)) - goto wait_for_sndbuf; + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; - if (is_tls_tx(csk)) { - skb = get_record_skb(sk, - select_size(sk, size, - flags, - TX_TLSHDR_LEN), - true); - } else { - skb = get_tx_skb(sk, 0); - } - if (!skb) - goto wait_for_memory; - copy = mss; - } - if (copy > size) - copy = size; - - i = skb_shinfo(skb)->nr_frags; - if (skb_can_coalesce(skb, i, page, offset)) { - skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); - } else if (i < MAX_SKB_FRAGS) { - get_page(page); - skb_fill_page_desc(skb, i, page, offset, copy); - } else { - tx_skb_finalize(skb); - push_frames_if_head(sk); - goto new_buf; - } - - skb->len += copy; - if (skb->len == mss) - tx_skb_finalize(skb); - skb->data_len += copy; - skb->truesize += copy; - sk->sk_wmem_queued += copy; - tp->write_seq += copy; - copied += copy; - offset += copy; - size -= copy; - - if (corked(tp, flags) && - (sk_stream_wspace(sk) < sk_stream_min_wspace(sk))) - ULP_SKB_CB(skb)->flags |= ULPCB_FLAG_NO_APPEND; - - if (!size) - break; - - if (unlikely(ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND)) - push_frames_if_head(sk); - continue; -wait_for_sndbuf: - set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); -wait_for_memory: - err = csk_wait_memory(cdev, sk, &timeo); - if (err) - goto do_error; - } -out: - csk_reset_flag(csk, CSK_TX_MORE_DATA); - if (copied) - chtls_tcp_push(sk, flags); -done: - release_sock(sk); - return copied; - -do_error: - if (copied) - goto out; - -out_err: - if (csk_conn_inline(csk)) - csk_reset_flag(csk, CSK_TX_MORE_DATA); - copied = sk_stream_error(sk, flags, err); - goto done; + bvec_set_page(&bvec, page, offset, size); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + return chtls_sendmsg(sk, &msg, size); } static void chtls_select_window(struct sock *sk) From patchwork Fri Mar 31 16:08:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77870 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692268vqo; Fri, 31 Mar 2023 09:36:16 -0700 (PDT) X-Google-Smtp-Source: AKy350Z/K8ZHifBPk+Rm5+NCXIj2v227h1MltoMudKvsmsNaveQMydghLa9kHj15qvuGyfc357A5 X-Received: by 2002:a17:906:f193:b0:92b:eca6:43fc with SMTP id gs19-20020a170906f19300b0092beca643fcmr28250364ejb.64.1680280576246; Fri, 31 Mar 2023 09:36:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280576; cv=none; d=google.com; s=arc-20160816; b=VcPBTmyn9V+VeoY8wgdj5+i9BM7cLMEXNN3zLI8d3xxzeNC8qFc6+K3PkOqIE37Gi7 oo/FxuXH0WFoLgL3SkKWznuhCaCvz1NThYkz359xW2ujvpHIrdqyeI8pBwerUJIAX0T+ vvrk+RYDyhP9PRVmQDDwD9UidWL4mnyKrSHTBz5QJk3HmMpVHe0yYH+ThrKoxi3CKJmr inpw+e9R6xGe82nedSpZmguHZUGMZE3QU1ugcAPdLyuuG10R5Gvpr25Xu8vRt0vXockz 4PP4kWjK553ViFBBnoLZmy6Y49WX6c8vkFEa1K/dMN2GKQatd3lqBTyEzNOa1V8bxCQQ mUeg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=KbTdePCQuY00Eh43/mGhCoyrcL0EDX5k9L0XaPYIsA0=; b=qkUTmDgy5QGB5DiUIq6p/gRu1y1UR7c0va2OcMqOye/LR1ZZxHQifknR2hXeU0QBIz //xF3eAp4ZqbzS2JR8cQR44oEGVz7eZE89QIc82jN3+Ecf9zKB3+GEg6cb0BvEGfdKUX AgoAHQnyCWW9AVhwRk6Nu1AvcRyuxCZUHDQ+YxSGrSef3pI5Jt4l4vWjXVbNe9kbfPJX 1BomFYKPFy0/pqurtdhJQ8/EhqTDTL01mCEyTFZWhF99GjJs1O0frSY0KQ0gjPIH4/AY 1KLddYEQgJfh2QL1CW7O/IVIoyo/qx5GD6Msg8Yp5CeJ/C5rnYnuSZb99FOjmIh478+f lW9A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=DNrYTrhc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id lv21-20020a170906bc9500b00947797f2840si1078731ejb.320.2023.03.31.09.35.52; Fri, 31 Mar 2023 09:36:16 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=DNrYTrhc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233153AbjCaQQs (ORCPT + 99 others); Fri, 31 Mar 2023 12:16:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42198 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233391AbjCaQQa (ORCPT ); Fri, 31 Mar 2023 12:16:30 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA8C42658C for ; Fri, 31 Mar 2023 09:11:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279057; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=KbTdePCQuY00Eh43/mGhCoyrcL0EDX5k9L0XaPYIsA0=; b=DNrYTrhcDpY+5WJqIl+waxiTz4PVv9bTa4MGMutlH6CHO6bxVH8cJuEPnDRMaLUmseHKM7 A3vwdzrc145tImcq6UJP4biY8cP6JfS+2GVhEMLjwt+SeT1I+o3johwwKvolHrlVagztYY XUl2hzO4a4SBgvO0cXmLGcu3oPCmHmk= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-615--qgn__EoP-eE5uusdvkQ2A-1; Fri, 31 Mar 2023 12:10:54 -0400 X-MC-Unique: -qgn__EoP-eE5uusdvkQ2A-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7FD543802AC1; Fri, 31 Mar 2023 16:10:52 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 63D4851FF; Fri, 31 Mar 2023 16:10:50 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Tom Herbert , Tom Herbert Subject: [PATCH v3 33/55] kcm: Support MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:52 +0100 Message-Id: <20230331160914.1608208-34-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901885533544298?= X-GMAIL-MSGID: =?utf-8?q?1761901885533544298?= Make AF_KCM sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible and copied otherwise. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Tom Herbert cc: Tom Herbert cc: Jakub Kicinski cc: Eric Dumazet cc: "David S. Miller" cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/kcm/kcmsock.c | 89 ++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 72 insertions(+), 17 deletions(-) diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index cfe828bd7fc6..0a3f79d81595 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -989,29 +989,84 @@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) merge = false; } - copy = min_t(int, msg_data_left(msg), - pfrag->size - pfrag->offset); + if (msg->msg_flags & MSG_SPLICE_PAGES) { + struct page *page = NULL, **pages = &page; + size_t off; + bool put = false; + + err = iov_iter_extract_pages(&msg->msg_iter, &pages, + INT_MAX, 1, 0, &off); + if (err <= 0) { + err = err ?: -EIO; + goto out_error; + } + copy = err; - if (!sk_wmem_schedule(sk, copy)) - goto wait_for_memory; + if (skb_can_coalesce(skb, i, page, off)) { + skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); + goto coalesced; + } - err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb, - pfrag->page, - pfrag->offset, - copy); - if (err) - goto out_error; + if (!sk_wmem_schedule(sk, copy)) { + iov_iter_revert(&msg->msg_iter, copy); + goto wait_for_memory; + } + + if (!sendpage_ok(page)) { + const void *p = kmap_local_page(page); + void *q; - /* Update the skb. */ - if (merge) { - skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); + q = page_frag_memdup(NULL, p + off, copy, + sk->sk_allocation, ULONG_MAX); + kunmap_local(p); + if (!q) { + iov_iter_revert(&msg->msg_iter, copy); + err = -ENOMEM; + goto out_error; + } + page = virt_to_page(q); + off = offset_in_page(q); + put = true; + } + + skb_fill_page_desc_noacc(skb, i, page, off, copy); + if (put) + put_page(page); +coalesced: + skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; + skb->len += copy; + skb->data_len += copy; + skb->truesize += copy; + sk->sk_wmem_queued += copy; + sk_mem_charge(sk, copy); + + if (head != skb) + head->truesize += copy; } else { - skb_fill_page_desc(skb, i, pfrag->page, - pfrag->offset, copy); - get_page(pfrag->page); + copy = min_t(int, msg_data_left(msg), + pfrag->size - pfrag->offset); + if (!sk_wmem_schedule(sk, copy)) + goto wait_for_memory; + + err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb, + pfrag->page, + pfrag->offset, + copy); + if (err) + goto out_error; + + /* Update the skb. */ + if (merge) { + skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); + } else { + skb_fill_page_desc(skb, i, pfrag->page, + pfrag->offset, copy); + get_page(pfrag->page); + } + + pfrag->offset += copy; } - pfrag->offset += copy; copied += copy; if (head != skb) { head->len += copy; From patchwork Fri Mar 31 16:08:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77876 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692836vqo; Fri, 31 Mar 2023 09:37:16 -0700 (PDT) X-Google-Smtp-Source: AKy350aA4OLX22tKyHGqp7RcIPLXdSgDfkkmi3/bBTeVWdDXBEWtTqDdiP4epd8ORY/T5dq/7KXI X-Received: by 2002:aa7:d619:0:b0:502:1d22:4890 with SMTP id c25-20020aa7d619000000b005021d224890mr26613257edr.6.1680280636675; Fri, 31 Mar 2023 09:37:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280636; cv=none; d=google.com; s=arc-20160816; b=OJne1jzozHM0Vhf3+wNye1FiHt80VtRzstX9HD80ts7AtPmFXHOFzdU48gfw5tzTDk cO3Iu9nlzZRw2STc10sGQiGylQGI7GAa/w5VAhvOjzlSGYVktFn7xBgGXdNCcReEU4e9 /5ljMoGCSERaq+bD5VOr8lmazTTZaKAVY19zo6h9wH16wvvJEZed9EbHMXZNlR9AN3VB 2TzXZ+wsB8IQI1yfuv07VAOVfvD0vFO/0KbG8TNep3kmQ/09rnyQCi9SBOjS32D6BYdW g5WMAN9e1reSX7d54wF/nJTCrPhZR5syeHfSvbHYJ81LVz8mZHWK6zpCZl7se6ueYxqI n8BA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=jPEZCXpX5VIblKmRhP3z5vIQLhv4QpeLvADltfz6mNU=; b=l/DZByebvbvwcvi5Y5TjnD0jMc94vgqwiKx3YrmOEIAzcuq/0IXdC71UtASA7ZAJoV G1UA2H3SNmvCFFLAmmb9CNIJb2myd+CjzfCYnwDnqv42cd0hnvixd5dikUHoRAx4CHa1 O6jj1nA6ZnLi3LWhPiK/NJDtRSBlpp/W9VMAZ6aYnqCazeye7Br2m4rNXOPf+zckwpAn pNedEuyeBECN0LvW0nNSXzaFwn6Aqz/gZCvqWSJC1NzQJ3IPoXp6zdXu7ax4uRweAKzv F/BwxOKZvaTEzyPu9faF8wX1hFCWgWqRxZ0FGHERcH+0Q8fBXp/ojwxh2RChAA8lewxU H3EA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=fkN0rbwa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e17-20020aa7d7d1000000b0050223f6f6c3si2415545eds.108.2023.03.31.09.36.52; Fri, 31 Mar 2023 09:37:16 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=fkN0rbwa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233428AbjCaQRN (ORCPT + 99 others); Fri, 31 Mar 2023 12:17:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35230 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233425AbjCaQQy (ORCPT ); Fri, 31 Mar 2023 12:16:54 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2CD520D90 for ; Fri, 31 Mar 2023 09:11:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279059; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jPEZCXpX5VIblKmRhP3z5vIQLhv4QpeLvADltfz6mNU=; b=fkN0rbwa1UcRPn2C+IrJmqsAR5MBaFzay4dB7sAl+1muU4/I49MFXwVFme/OIgLy5zpDm7 k8xVTz/MPS5jLfT3ze+YN+HJTrxuO2jrslU9jWuTCou4lwnwtWjzYlvahLKL3lFffYUr1z c6kOgCqEmVA5PM0nn9QOWOLprsGZq7Q= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-650-1RKJZrZKPJSsBYeQeItUoQ-1; Fri, 31 Mar 2023 12:10:56 -0400 X-MC-Unique: 1RKJZrZKPJSsBYeQeItUoQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 504CB801206; Fri, 31 Mar 2023 16:10:55 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3D9F71121314; Fri, 31 Mar 2023 16:10:53 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Tom Herbert , Tom Herbert Subject: [PATCH v3 34/55] kcm: Convert kcm_sendpage() to use MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:08:53 +0100 Message-Id: <20230331160914.1608208-35-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901948826691549?= X-GMAIL-MSGID: =?utf-8?q?1761901948826691549?= Convert kcm_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Tom Herbert cc: Tom Herbert cc: Jakub Kicinski cc: Eric Dumazet cc: "David S. Miller" cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/kcm/kcmsock.c | 161 ++++++---------------------------------------- 1 file changed, 18 insertions(+), 143 deletions(-) diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index 0a3f79d81595..d77d28fbf389 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -761,149 +761,6 @@ static void kcm_push(struct kcm_sock *kcm) kcm_write_msgs(kcm); } -static ssize_t kcm_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int flags) - -{ - struct sock *sk = sock->sk; - struct kcm_sock *kcm = kcm_sk(sk); - struct sk_buff *skb = NULL, *head = NULL; - long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); - bool eor; - int err = 0; - int i; - - if (flags & MSG_SENDPAGE_NOTLAST) - flags |= MSG_MORE; - - /* No MSG_EOR from splice, only look at MSG_MORE */ - eor = !(flags & MSG_MORE); - - lock_sock(sk); - - sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk); - - err = -EPIPE; - if (sk->sk_err) - goto out_error; - - if (kcm->seq_skb) { - /* Previously opened message */ - head = kcm->seq_skb; - skb = kcm_tx_msg(head)->last_skb; - i = skb_shinfo(skb)->nr_frags; - - if (skb_can_coalesce(skb, i, page, offset)) { - skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], size); - skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; - goto coalesced; - } - - if (i >= MAX_SKB_FRAGS) { - struct sk_buff *tskb; - - tskb = alloc_skb(0, sk->sk_allocation); - while (!tskb) { - kcm_push(kcm); - err = sk_stream_wait_memory(sk, &timeo); - if (err) - goto out_error; - } - - if (head == skb) - skb_shinfo(head)->frag_list = tskb; - else - skb->next = tskb; - - skb = tskb; - skb->ip_summed = CHECKSUM_UNNECESSARY; - i = 0; - } - } else { - /* Call the sk_stream functions to manage the sndbuf mem. */ - if (!sk_stream_memory_free(sk)) { - kcm_push(kcm); - set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); - err = sk_stream_wait_memory(sk, &timeo); - if (err) - goto out_error; - } - - head = alloc_skb(0, sk->sk_allocation); - while (!head) { - kcm_push(kcm); - err = sk_stream_wait_memory(sk, &timeo); - if (err) - goto out_error; - } - - skb = head; - i = 0; - } - - get_page(page); - skb_fill_page_desc_noacc(skb, i, page, offset, size); - skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; - -coalesced: - skb->len += size; - skb->data_len += size; - skb->truesize += size; - sk->sk_wmem_queued += size; - sk_mem_charge(sk, size); - - if (head != skb) { - head->len += size; - head->data_len += size; - head->truesize += size; - } - - if (eor) { - bool not_busy = skb_queue_empty(&sk->sk_write_queue); - - /* Message complete, queue it on send buffer */ - __skb_queue_tail(&sk->sk_write_queue, head); - kcm->seq_skb = NULL; - KCM_STATS_INCR(kcm->stats.tx_msgs); - - if (flags & MSG_BATCH) { - kcm->tx_wait_more = true; - } else if (kcm->tx_wait_more || not_busy) { - err = kcm_write_msgs(kcm); - if (err < 0) { - /* We got a hard error in write_msgs but have - * already queued this message. Report an error - * in the socket, but don't affect return value - * from sendmsg - */ - pr_warn("KCM: Hard failure on kcm_write_msgs\n"); - report_csk_error(&kcm->sk, -err); - } - } - } else { - /* Message not complete, save state */ - kcm->seq_skb = head; - kcm_tx_msg(head)->last_skb = skb; - } - - KCM_STATS_ADD(kcm->stats.tx_bytes, size); - - release_sock(sk); - return size; - -out_error: - kcm_push(kcm); - - err = sk_stream_error(sk, flags, err); - - /* make sure we wake any epoll edge trigger waiter */ - if (unlikely(skb_queue_len(&sk->sk_write_queue) == 0 && err == -EAGAIN)) - sk->sk_write_space(sk); - - release_sock(sk); - return err; -} - static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; @@ -1143,6 +1000,24 @@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) return err; } +static ssize_t kcm_sendpage(struct socket *sock, struct page *page, + int offset, size_t size, int flags) + +{ + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; + + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; + + if (flags & MSG_OOB) + return -EOPNOTSUPP; + + bvec_set_page(&bvec, page, offset, size); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + return kcm_sendmsg(sock, &msg, size); +} + static int kcm_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int flags) { From patchwork Fri Mar 31 16:08:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77844 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp684808vqo; Fri, 31 Mar 2023 09:25:09 -0700 (PDT) X-Google-Smtp-Source: AKy350az2plc46RckFfDnpwPMbtnpWav2PZYUI2H/X0wRh1tTHczpTsg1aQqNx+Xh3EF3kKs1kZ/ X-Received: by 2002:a17:907:86ac:b0:92b:eefb:b966 with SMTP id qa44-20020a17090786ac00b0092beefbb966mr37455894ejc.0.1680279909139; Fri, 31 Mar 2023 09:25:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279909; cv=none; d=google.com; s=arc-20160816; b=X5hgBKM8Xcl1uzUCUa36LAVVyLn/hnSi2RA8S1qIaEBnuMljIsa/TDkkUfvlPl0rr3 onfoaUKYW/XzSWXaRPLE+zu0aiylfDzz0L9WBz2n6T9dZ1Ew+adVL81IClreUbHvfe05 v2FSm1oH82zjqi6zPRUD6v+XBLsnK49VAKK6kuQhgXjbDv11+sv5zjir1JgTM8FNQ73J uef9L3GMhYFYYMjZTjqGWX88obggaSvNQAk864Pa5DnI38wCbzcRi9i1/o15GdA6fdEy tOzrrJp4KFwdCUrPurmd5B/+Ss80AG7O0pEHL5C/hkrlSYPsVLG1P1iv3KvUa+YfH2Wo czCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=rdhhmJu37H1diE1YTKpbylxaJjF+u38cNy50QUFNMbM=; b=Z+WwCjQ/iKi8dZGksnu+VdOGbWknC16jLwMi5eAFJweltFas0VTxEjYJt/KLTNTkUc dl6bZVBbOQhSTjGvSxIszYtU+64+a+I249IvlttcqPkzRLrR5NPRw0TviV89BjUXUB2s 3QmmgRh54E/nTleoUlxCDZ9H7qHq++ZL/mn9p8liQEjnIJHcempmfGG4HY/ssANGarYL sqI/9oT58OOTsOy6AFmzndLtkRM3L2G6Ez7NnU8pCwzy2ts3Mxl/LB/d5f9roof+2RJ2 3fbMaGplYGGaVG8FzAiv/hTmsXUbP0MTvgsSsZOXkIIRp+oZp9oPloKR5SK0DtAMcKhF CBAw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="IQW/fC1i"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id bm11-20020a170906c04b00b0092043425e1bsi2278187ejb.714.2023.03.31.09.24.44; Fri, 31 Mar 2023 09:25:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="IQW/fC1i"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233438AbjCaQRn (ORCPT + 99 others); Fri, 31 Mar 2023 12:17:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45838 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233081AbjCaQRQ (ORCPT ); Fri, 31 Mar 2023 12:17:16 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7BD3ED4F88 for ; Fri, 31 Mar 2023 09:11:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279064; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rdhhmJu37H1diE1YTKpbylxaJjF+u38cNy50QUFNMbM=; b=IQW/fC1iRNGAFEkS/y/FMAJzvVzppWW7C0zoRcaeGcs+YME5F6JgJGzyu26iEm9pCh0rPH wP22oIn/BpNkkMfBCadJat160oQHPp0OfM58028lqh3YAVa/kAf/7FJZ1OLyzXPemuvaSf kirA8ZzicjL3XdS0nS45+bp1fKkX+ew= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-373-HXcdn5EjOZiGCXUameYioQ-1; Fri, 31 Mar 2023 12:11:02 -0400 X-MC-Unique: HXcdn5EjOZiGCXUameYioQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C329F3802AC3; Fri, 31 Mar 2023 16:10:57 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id DF6F714171B6; Fri, 31 Mar 2023 16:10:55 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 35/55] splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() Date: Fri, 31 Mar 2023 17:08:54 +0100 Message-Id: <20230331160914.1608208-36-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901186228129262?= X-GMAIL-MSGID: =?utf-8?q?1761901186228129262?= Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() to splice data from a pipe to a socket. This paves the way for passing in multiple pages at once from a pipe and the handling of multipage folios. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- fs/splice.c | 42 +++++++++++++++++++++++------------------- include/linux/fs.h | 2 -- include/linux/splice.h | 2 ++ net/socket.c | 26 ++------------------------ 4 files changed, 27 insertions(+), 45 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index f46dd1fb367b..23ead122d631 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -410,29 +411,32 @@ const struct pipe_buf_operations nosteal_pipe_buf_ops = { }; EXPORT_SYMBOL(nosteal_pipe_buf_ops); +#ifdef CONFIG_NET /* * Send 'sd->len' bytes to socket from 'sd->file' at position 'sd->pos' * using sendpage(). Return the number of bytes sent. */ -static int pipe_to_sendpage(struct pipe_inode_info *pipe, - struct pipe_buffer *buf, struct splice_desc *sd) +static int pipe_to_sendmsg(struct pipe_inode_info *pipe, + struct pipe_buffer *buf, struct splice_desc *sd) { - struct file *file = sd->u.file; - loff_t pos = sd->pos; - int more; - - if (!likely(file->f_op->sendpage)) - return -EINVAL; + struct socket *sock = sock_from_file(sd->u.file); + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_SPLICE_PAGES, + }; - more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0; + if (sd->flags & SPLICE_F_MORE) + msg.msg_flags |= MSG_MORE; if (sd->len < sd->total_len && pipe_occupancy(pipe->head, pipe->tail) > 1) - more |= MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE; - return file->f_op->sendpage(file, buf->page, buf->offset, - sd->len, &pos, more); + bvec_set_page(&bvec, buf->page, sd->len, buf->offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, sd->len); + return sock_sendmsg(sock, &msg); } +#endif static void wakeup_pipe_writers(struct pipe_inode_info *pipe) { @@ -614,7 +618,7 @@ static void splice_from_pipe_end(struct pipe_inode_info *pipe, struct splice_des * Description: * This function does little more than loop over the pipe and call * @actor to do the actual moving of a single struct pipe_buffer to - * the desired destination. See pipe_to_file, pipe_to_sendpage, or + * the desired destination. See pipe_to_file, pipe_to_sendmsg, or * pipe_to_user. * */ @@ -795,8 +799,9 @@ iter_file_splice_write(struct pipe_inode_info *pipe, struct file *out, EXPORT_SYMBOL(iter_file_splice_write); +#ifdef CONFIG_NET /** - * generic_splice_sendpage - splice data from a pipe to a socket + * splice_to_socket - splice data from a pipe to a socket * @pipe: pipe to splice from * @out: socket to write to * @ppos: position in @out @@ -808,13 +813,12 @@ EXPORT_SYMBOL(iter_file_splice_write); * is involved. * */ -ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe, struct file *out, - loff_t *ppos, size_t len, unsigned int flags) +ssize_t splice_to_socket(struct pipe_inode_info *pipe, struct file *out, + loff_t *ppos, size_t len, unsigned int flags) { - return splice_from_pipe(pipe, out, ppos, len, flags, pipe_to_sendpage); + return splice_from_pipe(pipe, out, ppos, len, flags, pipe_to_sendmsg); } - -EXPORT_SYMBOL(generic_splice_sendpage); +#endif static int warn_unsupported(struct file *file, const char *op) { diff --git a/include/linux/fs.h b/include/linux/fs.h index c85916e9f7db..f3ccc243851e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2740,8 +2740,6 @@ extern ssize_t generic_file_splice_read(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int); extern ssize_t iter_file_splice_write(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int); -extern ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe, - struct file *out, loff_t *, size_t len, unsigned int flags); extern long do_splice_direct(struct file *in, loff_t *ppos, struct file *out, loff_t *opos, size_t len, unsigned int flags); diff --git a/include/linux/splice.h b/include/linux/splice.h index 8f052c3dae95..e6153feda86c 100644 --- a/include/linux/splice.h +++ b/include/linux/splice.h @@ -87,6 +87,8 @@ extern long do_splice(struct file *in, loff_t *off_in, extern long do_tee(struct file *in, struct file *out, size_t len, unsigned int flags); +extern ssize_t splice_to_socket(struct pipe_inode_info *pipe, struct file *out, + loff_t *ppos, size_t len, unsigned int flags); /* * for dynamic pipe sizing diff --git a/net/socket.c b/net/socket.c index 0c39ce57d603..3e9bd8261357 100644 --- a/net/socket.c +++ b/net/socket.c @@ -57,6 +57,7 @@ #include #include #include +#include #include #include #include @@ -126,8 +127,6 @@ static long compat_sock_ioctl(struct file *file, unsigned int cmd, unsigned long arg); #endif static int sock_fasync(int fd, struct file *filp, int on); -static ssize_t sock_sendpage(struct file *file, struct page *page, - int offset, size_t size, loff_t *ppos, int more); static ssize_t sock_splice_read(struct file *file, loff_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags); @@ -162,8 +161,7 @@ static const struct file_operations socket_file_ops = { .mmap = sock_mmap, .release = sock_close, .fasync = sock_fasync, - .sendpage = sock_sendpage, - .splice_write = generic_splice_sendpage, + .splice_write = splice_to_socket, .splice_read = sock_splice_read, .show_fdinfo = sock_show_fdinfo, }; @@ -1062,26 +1060,6 @@ int kernel_recvmsg(struct socket *sock, struct msghdr *msg, } EXPORT_SYMBOL(kernel_recvmsg); -static ssize_t sock_sendpage(struct file *file, struct page *page, - int offset, size_t size, loff_t *ppos, int more) -{ - struct socket *sock; - int flags; - int ret; - - sock = file->private_data; - - flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0; - /* more is a combination of MSG_MORE and MSG_SENDPAGE_NOTLAST */ - flags |= more; - - ret = kernel_sendpage(sock, page, offset, size, flags); - - if (trace_sock_send_length_enabled()) - call_trace_sock_send_length(sock->sk, ret, 0); - return ret; -} - static ssize_t sock_splice_read(struct file *file, loff_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags) From patchwork Fri Mar 31 16:08:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77855 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp690595vqo; Fri, 31 Mar 2023 09:33:40 -0700 (PDT) X-Google-Smtp-Source: AKy350ay1PfOW5HjuvKeurj02Yy4SOPp2do3ZCxcXkjME+1JDX1tFhywIFS38e4nerSB+67VT5dt X-Received: by 2002:a05:6a00:4309:b0:627:de16:889c with SMTP id cb9-20020a056a00430900b00627de16889cmr11577121pfb.4.1680280419756; Fri, 31 Mar 2023 09:33:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280419; cv=none; d=google.com; s=arc-20160816; b=P60kQGDvshFzfTTowRwAVqzLNuyeATs4ZnHO+Eq08Nl52irZlszESeEqOigPcreuqe K98kjAOyOwz0ZXSgjU9I79n1n25jljd6ajtG5uc2p9HwXFMZUaEyvlUkgU7D2H8FwLmZ Vd1DkaBddxGkWTzl4zqGlEi/ABxL/itwjgQ0zlfJx68cpZJ1uNB3D5ROnVNs4ndLGZms nZS71cMEsqMwxXfnHW1tbVxho1p95FwnPNyd2CXbYgpN2NhezLm7jTeWce7QRCfhEUtu kkLdiSUTIznjmR3yrU+ol+g+GxNx27TVty+S8RtVjKQfnu/XHqZYn03D1l5k+I/9WfzC jAlQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ecYVxyEqGMFmF1eT8L3egIdesMHxN/xKxl+aET3XOq4=; b=D//aHicWBhm5dj6obSIFgf9T3u7DClCCXfEXernncqI4JJWbRBkMfmJG7bnX3bsK9c tV/EzRrcIFiYO4TNHR+1aZzyFqgnxsk+q6/ag0foPKjO8pyiQhHPJBuT6T/m1omtJ1A/ S/yDhebF6EjVGeUjohOJtaxW0VTr23E5siePVKZF/i+Sux3nm6k1YRSZmXH/m/H6FITl GREGG/XqXm4XbxBpmb2yYB8l4hG1z6zZDEDEUfuXp5H+5xd/J4EqU5/ety5ibc7ShKHB 6BiC8LCgGHDXRHvILr9meiUmWL80pGowc+0R3kIGNepyYuSzWPO4u1NP7sGYmzl7cKOc oM2w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=SPpBq8hB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z25-20020a656659000000b0051344de997fsi2611333pgv.437.2023.03.31.09.33.27; Fri, 31 Mar 2023 09:33:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=SPpBq8hB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233445AbjCaQSS (ORCPT + 99 others); Fri, 31 Mar 2023 12:18:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44660 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230239AbjCaQRo (ORCPT ); Fri, 31 Mar 2023 12:17:44 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3A5B27800 for ; Fri, 31 Mar 2023 09:11:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279068; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ecYVxyEqGMFmF1eT8L3egIdesMHxN/xKxl+aET3XOq4=; b=SPpBq8hBalZilZpGY9od1pmIL5UDzi3vHeNclb3F/LTirxWVZgyy08ou+5rxbGIPY4JaSq XyLifpLLYE7iKvOaFLUhLLnRJpFE9hvtzrYFIJ3uRydABZCy3itF99jIdW+YahnExChAB7 Sua+lMMgNjcWn4nlARORLraxnrK4lkA= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-364-3bh1vnySPnOS90RyB8ZPRw-1; Fri, 31 Mar 2023 12:11:04 -0400 X-MC-Unique: 3bh1vnySPnOS90RyB8ZPRw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 61AEC88606A; Fri, 31 Mar 2023 16:11:00 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7FD721121314; Fri, 31 Mar 2023 16:10:58 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 36/55] splice, net: Reimplement splice_to_socket() to pass multiple bufs to sendmsg() Date: Fri, 31 Mar 2023 17:08:55 +0100 Message-Id: <20230331160914.1608208-37-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901721497473910?= X-GMAIL-MSGID: =?utf-8?q?1761901721497473910?= Reimplement splice_to_socket() so that it can pass multiple pipe buffer pages to sendmsg() in a single go. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- fs/splice.c | 148 ++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 120 insertions(+), 28 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index 23ead122d631..d5bc28b59720 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -411,33 +411,6 @@ const struct pipe_buf_operations nosteal_pipe_buf_ops = { }; EXPORT_SYMBOL(nosteal_pipe_buf_ops); -#ifdef CONFIG_NET -/* - * Send 'sd->len' bytes to socket from 'sd->file' at position 'sd->pos' - * using sendpage(). Return the number of bytes sent. - */ -static int pipe_to_sendmsg(struct pipe_inode_info *pipe, - struct pipe_buffer *buf, struct splice_desc *sd) -{ - struct socket *sock = sock_from_file(sd->u.file); - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = MSG_SPLICE_PAGES, - }; - - if (sd->flags & SPLICE_F_MORE) - msg.msg_flags |= MSG_MORE; - - if (sd->len < sd->total_len && - pipe_occupancy(pipe->head, pipe->tail) > 1) - msg.msg_flags |= MSG_MORE; - - bvec_set_page(&bvec, buf->page, sd->len, buf->offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, sd->len); - return sock_sendmsg(sock, &msg); -} -#endif - static void wakeup_pipe_writers(struct pipe_inode_info *pipe) { smp_mb(); @@ -816,7 +789,126 @@ EXPORT_SYMBOL(iter_file_splice_write); ssize_t splice_to_socket(struct pipe_inode_info *pipe, struct file *out, loff_t *ppos, size_t len, unsigned int flags) { - return splice_from_pipe(pipe, out, ppos, len, flags, pipe_to_sendmsg); + struct socket *sock = sock_from_file(out); + struct bio_vec bvec[16]; + struct msghdr msg = {}; + ssize_t ret; + size_t spliced = 0; + bool need_wakeup = false; + + pipe_lock(pipe); + + while (len > 0) { + unsigned int head, tail, mask, bc = 0; + size_t remain = len; + + /* + * Check for signal early to make process killable when there + * are always buffers available + */ + ret = -ERESTARTSYS; + if (signal_pending(current)) + break; + + while (pipe_empty(pipe->head, pipe->tail)) { + ret = 0; + if (!pipe->writers) + goto out; + + if (spliced) + goto out; + + ret = -EAGAIN; + if (flags & SPLICE_F_NONBLOCK) + goto out; + + ret = -ERESTARTSYS; + if (signal_pending(current)) + goto out; + + if (need_wakeup) { + wakeup_pipe_writers(pipe); + need_wakeup = false; + } + + pipe_wait_readable(pipe); + } + + head = pipe->head; + tail = pipe->tail; + mask = pipe->ring_size - 1; + + while (!pipe_empty(head, tail)) { + struct pipe_buffer *buf = &pipe->bufs[tail & mask]; + size_t seg; + + if (!buf->len) { + tail++; + continue; + } + + seg = min_t(size_t, remain, buf->len); + seg = min_t(size_t, seg, PAGE_SIZE); + + ret = pipe_buf_confirm(pipe, buf); + if (unlikely(ret)) { + if (ret == -ENODATA) + ret = 0; + break; + } + + bvec_set_page(&bvec[bc++], buf->page, seg, buf->offset); + remain -= seg; + if (seg >= buf->len) + tail++; + if (bc >= ARRAY_SIZE(bvec)) + break; + } + + if (!bc) + break; + + msg.msg_flags = 0; + if (flags & SPLICE_F_MORE) + msg.msg_flags = MSG_MORE; + if (remain && pipe_occupancy(pipe->head, tail) > 0) + msg.msg_flags = MSG_MORE; + msg.msg_flags |= MSG_SPLICE_PAGES; + + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, bc, len - remain); + ret = sock_sendmsg(sock, &msg); + if (ret <= 0) + break; + + spliced += ret; + len -= ret; + tail = pipe->tail; + while (ret > 0) { + struct pipe_buffer *buf = &pipe->bufs[tail & mask]; + size_t seg = min_t(size_t, ret, buf->len); + + buf->offset += seg; + buf->len -= seg; + ret -= seg; + + if (!buf->len) { + pipe_buf_release(pipe, buf); + tail++; + } + } + + if (tail != pipe->tail) { + pipe->tail = tail; + if (pipe->files) + need_wakeup = true; + } + } + +out: + pipe_unlock(pipe); + if (need_wakeup) + wakeup_pipe_writers(pipe); + return spliced ?: ret; } #endif From patchwork Fri Mar 31 16:08:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77915 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp712806vqo; Fri, 31 Mar 2023 10:09:45 -0700 (PDT) X-Google-Smtp-Source: AKy350ZE/OnO1xvDZtizrDKbmlkI4GPbi8d6K8B9dW3LtZrZTmyoGpamVmVOE6xDSZGDNJxDMdFr X-Received: by 2002:a17:902:da88:b0:1a1:b3bb:cd5b with SMTP id j8-20020a170902da8800b001a1b3bbcd5bmr33296007plx.62.1680282585392; Fri, 31 Mar 2023 10:09:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680282585; cv=none; d=google.com; s=arc-20160816; b=XKkG97/HCUi4ZOwQtWBiNUaSNJ03u/4OQbE7PVZ3+hVZ0uef3YTVGxOWH4i1VtRrfQ XJHOWR54qN2qM55gL5Wel3EvOeDlNR5DjsBeopYT+Ql22UssF8b1RRiE6Y2zMTNHBn/t 7qm8yh5NyGBXmS9wGjZ2WafE9+HQ9VJNzsS+lEweaFAL7NdVNKpc1AYfjYM3Y2gUduLN JDdXWSTUWf7afDyg469QnUXkhz2iKd3pTtKJZ0mMGPZOqCIx6MmWEUq6SOjLIHcj3dbc jBfVV36jCn2RG6jaExEeTfRD0awsAp170PlUoBuZjey9NDNUNkXQcMh9oav1Bp+sLKSk /3/Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=BwLwJU8AwjD+1uYy2mnm7V08kEOLBE2Z8d7v3yKcDiA=; b=CeDWzs2jmzy75F+ENycNwAndpcWci2QKoh+Ax2wgXHbeKg0ohOrQxyUUuDxeFv6I7q 2IMzf0RlW1pGienzfJKQoREhvVsHsKYftMOcVkFQXr/v6XPHDNcqYic0KOcSrFccrzBY vTvBPHUD2CYXd8PoNDy9mqNZmeL08mDTeA7/2npWG77uI/XnohRXKNTNHAbzRSSWhCJG yQh+kgCnx99zPd3FNVO3am4DZtDErLNmsCY0G7CFc6z1MRkoQhOO4X87Gbh9PEcvdtBi chb0ZpGbQAjhBDon2e0ZVwjfGCE9GGoEnQlCG1xC6vF6gfWN1A0o2jn5EZgrJ53faFIQ Kw1A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=gzuTtOmf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z11-20020a170902834b00b001a05d8436d8si2536519pln.475.2023.03.31.10.09.32; Fri, 31 Mar 2023 10:09:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=gzuTtOmf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231195AbjCaQZW (ORCPT + 99 others); Fri, 31 Mar 2023 12:25:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52908 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233385AbjCaQYu (ORCPT ); Fri, 31 Mar 2023 12:24:50 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 005902BEC9 for ; Fri, 31 Mar 2023 09:18:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279462; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BwLwJU8AwjD+1uYy2mnm7V08kEOLBE2Z8d7v3yKcDiA=; b=gzuTtOmf64m8N5Df0d315MkWUKkGQd0ADT51gFarLgt4f1ZcJpH8GrUShXsi98nDknIusW sWiq98jPnY3Vq+ABd0ZKNVp4y/c+qJ5JU7NZSZuo52MkgUzAMDt0RkrkI/YBAQgc6XNvaG o1r6jmn3cDQm0OpanYOpfD/njUOBCpc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-557-yXllut-UOFGxacMfCAiErA-1; Fri, 31 Mar 2023 12:11:05 -0400 X-MC-Unique: yXllut-UOFGxacMfCAiErA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DF65D85A5A3; Fri, 31 Mar 2023 16:11:02 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id EF7242166B33; Fri, 31 Mar 2023 16:11:00 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 37/55] Remove file->f_op->sendpage Date: Fri, 31 Mar 2023 17:08:56 +0100 Message-Id: <20230331160914.1608208-38-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761903992191767764?= X-GMAIL-MSGID: =?utf-8?q?1761903992191767764?= Remove file->f_op->sendpage as splicing to a socket now calls sendmsg rather than sendpage. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/linux/fs.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index f3ccc243851e..a9f1b2543d2c 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1773,7 +1773,6 @@ struct file_operations { int (*fsync) (struct file *, loff_t, loff_t, int datasync); int (*fasync) (int, struct file *, int); int (*lock) (struct file *, int, struct file_lock *); - ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int); unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long); int (*check_flags)(int); int (*flock) (struct file *, int, struct file_lock *); From patchwork Fri Mar 31 16:08:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77858 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp690815vqo; Fri, 31 Mar 2023 09:34:02 -0700 (PDT) X-Google-Smtp-Source: AK7set/teTfJhT7//3klO/ZfnzrFHZ6Rj15UG3fu1Dmg12fPszAp1OcZ6vYqVv5oxgPdLFFQajtg X-Received: by 2002:a05:6a20:b062:b0:d5:213a:476e with SMTP id dx34-20020a056a20b06200b000d5213a476emr21999048pzb.51.1680280442200; Fri, 31 Mar 2023 09:34:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280442; cv=none; d=google.com; s=arc-20160816; b=lKdEuNrHUY04W8Y5PDVjnGUinjHh9jJIHCw+g7iKpWGITBg8VsiruJA8iLlM0z7pl8 YrIGM9PsyKg3nXkSvRbKqoNxrLHRywP+V1Om1LiOTMC44udsf1xe1HTydgycKL6HlgPt gpaBprEIJOLfszWGbcl1LcgnBBgPykdmjSzHOtdwpXzfxgZf+kmYjtSEbvHB1SsYbDQH SSOMMlksXWm2EsPQEAknvMZFcq118xvOw8ViD7A+h1C+dGxN0/Pt6CSMm3UmBzPpqB/x 4WLCqBADCv3MyBrV9xK4VNdgFz6y2J9pkAqI6a+PISy9rTd/V8j2CkCpc9+WMvnMFFF9 1JzQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=uOfuvlKLCw3PwI6FlP4CG+iWtw5a+KyQbWE9E7aqyN8=; b=PN+nETm8MeLxH1/mU/VXUW1wdClrTjotcJXpDG+ZHn+FXaE9RUeX33wIVRROQ0mimr Om5v3dw0oZsf4SZsHz6Ukqwb+37BusWIXzUMkkBLOEsIqCV7MRKbAzq28w0AWEsgCVr7 aNoLBEc8wfz47x5W8DM61g1F+5LR/vvMdtJ4s8jxR/fl6TTIHOzbAiLmikrLuoG5AZo0 yB37/bVJYC1d2mW0kHq+yPUsZcF/Ne8EsSGIjgXYTZDGnBdfLReuU5lvyQaV9wirbDK0 E6W9F3YpRASnbPqAoO3Fv0U20odUcghVPc9RFw+HEha0I+GuOeE2NujSydFuxgbv14rW x53A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=SxC7dBwL; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j190-20020a6380c7000000b0050fa9bc63cbsi2601457pgd.432.2023.03.31.09.33.49; Fri, 31 Mar 2023 09:34:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=SxC7dBwL; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232520AbjCaQRq (ORCPT + 99 others); Fri, 31 Mar 2023 12:17:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45940 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233356AbjCaQRY (ORCPT ); Fri, 31 Mar 2023 12:17:24 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC5E5265B8 for ; Fri, 31 Mar 2023 09:11:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279070; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uOfuvlKLCw3PwI6FlP4CG+iWtw5a+KyQbWE9E7aqyN8=; b=SxC7dBwLPPCRfs5UPJx08wMBX2lVfo25E6cn8rCCbQG4fT8ObrtiXlCWCSdc+eDYxgBrft jsNZ/6RdhkO/WLkwCUiZSwwEexPbv3fr553TLpQlmcsU+WgFyctKq0OTLAXsxhymZC+P+j u+SzDIYlBQP2oJBhAZvYcBx6IiUGupM= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-122-i1BP4N28P-Gza7I8sd9Uyg-1; Fri, 31 Mar 2023 12:11:06 -0400 X-MC-Unique: i1BP4N28P-Gza7I8sd9Uyg-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A8F453C025B9; Fri, 31 Mar 2023 16:11:05 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7A3281415139; Fri, 31 Mar 2023 16:11:03 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Bernard Metzler , Tom Talpey , linux-rdma@vger.kernel.org Subject: [PATCH v3 38/55] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit Date: Fri, 31 Mar 2023 17:08:57 +0100 Message-Id: <20230331160914.1608208-39-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901744745015926?= X-GMAIL-MSGID: =?utf-8?q?1761901744745015926?= When transmitting data, call down into TCP using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing several sendmsg and sendpage calls to transmit header, data pages and trailer. To make this work, the data is assembled in a bio_vec array and attached to a BVEC-type iterator. The header and trailer (if present) are copied into page fragments that can be freed with put_page(). Signed-off-by: David Howells cc: Bernard Metzler cc: Tom Talpey cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-rdma@vger.kernel.org cc: netdev@vger.kernel.org --- drivers/infiniband/sw/siw/siw_qp_tx.c | 234 ++++++-------------------- 1 file changed, 48 insertions(+), 186 deletions(-) diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c index fa5de40d85d5..fbe80c06d0ca 100644 --- a/drivers/infiniband/sw/siw/siw_qp_tx.c +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c @@ -312,114 +312,8 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s, return rv; } -/* - * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES. - * - * Using sendpage to push page by page appears to be less efficient - * than using sendmsg, even if data are copied. - * - * A general performance limitation might be the extra four bytes - * trailer checksum segment to be pushed after user data. - */ -static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset, - size_t size) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = (MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST | - MSG_SPLICE_PAGES), - }; - struct sock *sk = s->sk; - int i = 0, rv = 0, sent = 0; - - while (size) { - size_t bytes = min_t(size_t, PAGE_SIZE - offset, size); - - if (size + offset <= PAGE_SIZE) - msg.msg_flags = MSG_MORE | MSG_DONTWAIT; - - tcp_rate_check_app_limited(sk); - bvec_set_page(&bvec, page[i], bytes, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - -try_page_again: - lock_sock(sk); - rv = tcp_sendmsg_locked(sk, &msg, size); - release_sock(sk); - - if (rv > 0) { - size -= rv; - sent += rv; - if (rv != bytes) { - offset += rv; - bytes -= rv; - goto try_page_again; - } - offset = 0; - } else { - if (rv == -EAGAIN || rv == 0) - break; - return rv; - } - i++; - } - return sent; -} - -/* - * siw_0copy_tx() - * - * Pushes list of pages to TCP socket. If pages from multiple - * SGE's, all referenced pages of each SGE are pushed in one - * shot. - */ -static int siw_0copy_tx(struct socket *s, struct page **page, - struct siw_sge *sge, unsigned int offset, - unsigned int size) -{ - int i = 0, sent = 0, rv; - int sge_bytes = min(sge->length - offset, size); - - offset = (sge->laddr + offset) & ~PAGE_MASK; - - while (sent != size) { - rv = siw_tcp_sendpages(s, &page[i], offset, sge_bytes); - if (rv >= 0) { - sent += rv; - if (size == sent || sge_bytes > rv) - break; - - i += PAGE_ALIGN(sge_bytes + offset) >> PAGE_SHIFT; - sge++; - sge_bytes = min(sge->length, size - sent); - offset = sge->laddr & ~PAGE_MASK; - } else { - sent = rv; - break; - } - } - return sent; -} - #define MAX_TRAILER (MPA_CRC_SIZE + 4) -static void siw_unmap_pages(struct kvec *iov, unsigned long kmap_mask, int len) -{ - int i; - - /* - * Work backwards through the array to honor the kmap_local_page() - * ordering requirements. - */ - for (i = (len-1); i >= 0; i--) { - if (kmap_mask & BIT(i)) { - unsigned long addr = (unsigned long)iov[i].iov_base; - - kunmap_local((void *)(addr & PAGE_MASK)); - } - } -} - /* * siw_tx_hdt() tries to push a complete packet to TCP where all * packet fragments are referenced by the elements of one iovec. @@ -439,15 +333,14 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) { struct siw_wqe *wqe = &c_tx->wqe_active; struct siw_sge *sge = &wqe->sqe.sge[c_tx->sge_idx]; - struct kvec iov[MAX_ARRAY]; - struct page *page_array[MAX_ARRAY]; + struct bio_vec bvec[MAX_ARRAY]; struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_EOR }; + void *trl, *t; int seg = 0, do_crc = c_tx->do_crc, is_kva = 0, rv; unsigned int data_len = c_tx->bytes_unsent, hdr_len = 0, trl_len = 0, sge_off = c_tx->sge_off, sge_idx = c_tx->sge_idx, pbl_idx = c_tx->pbl_idx; - unsigned long kmap_mask = 0L; if (c_tx->state == SIW_SEND_HDR) { if (c_tx->use_sendpage) { @@ -457,10 +350,15 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) c_tx->state = SIW_SEND_DATA; } else { - iov[0].iov_base = - (char *)&c_tx->pkt.ctrl + c_tx->ctrl_sent; - iov[0].iov_len = hdr_len = - c_tx->ctrl_len - c_tx->ctrl_sent; + const void *hdr = &c_tx->pkt.ctrl + c_tx->ctrl_sent; + void *h; + + rv = -ENOMEM; + hdr_len = c_tx->ctrl_len - c_tx->ctrl_sent; + h = page_frag_memdup(NULL, hdr, hdr_len, GFP_NOFS, ULONG_MAX); + if (!h) + goto done; + bvec_set_virt(&bvec[0], h, hdr_len); seg = 1; } } @@ -478,28 +376,9 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) } else { is_kva = 1; } - if (is_kva && !c_tx->use_sendpage) { - /* - * tx from kernel virtual address: either inline data - * or memory region with assigned kernel buffer - */ - iov[seg].iov_base = - (void *)(uintptr_t)(sge->laddr + sge_off); - iov[seg].iov_len = sge_len; - - if (do_crc) - crypto_shash_update(c_tx->mpa_crc_hd, - iov[seg].iov_base, - sge_len); - sge_off += sge_len; - data_len -= sge_len; - seg++; - goto sge_done; - } while (sge_len) { size_t plen = min((int)PAGE_SIZE - fp_off, sge_len); - void *kaddr; if (!is_kva) { struct page *p; @@ -512,33 +391,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) p = siw_get_upage(mem->umem, sge->laddr + sge_off); if (unlikely(!p)) { - siw_unmap_pages(iov, kmap_mask, seg); wqe->processed -= c_tx->bytes_unsent; rv = -EFAULT; goto done_crc; } - page_array[seg] = p; - - if (!c_tx->use_sendpage) { - void *kaddr = kmap_local_page(p); - - /* Remember for later kunmap() */ - kmap_mask |= BIT(seg); - iov[seg].iov_base = kaddr + fp_off; - iov[seg].iov_len = plen; - - if (do_crc) - crypto_shash_update( - c_tx->mpa_crc_hd, - iov[seg].iov_base, - plen); - } else if (do_crc) { - kaddr = kmap_local_page(p); - crypto_shash_update(c_tx->mpa_crc_hd, - kaddr + fp_off, - plen); - kunmap_local(kaddr); - } + + bvec_set_page(&bvec[seg], p, plen, fp_off); } else { /* * Cast to an uintptr_t to preserve all 64 bits @@ -552,12 +410,15 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) * bits on a 64 bit platform and 32 bits on a * 32 bit platform. */ - page_array[seg] = virt_to_page((void *)(va & PAGE_MASK)); - if (do_crc) - crypto_shash_update( - c_tx->mpa_crc_hd, - (void *)va, - plen); + bvec_set_virt(&bvec[seg], (void *)va, plen); + } + + if (do_crc) { + void *kaddr = kmap_local_page(bvec[seg].bv_page); + crypto_shash_update(c_tx->mpa_crc_hd, + kaddr + bvec[seg].bv_offset, + bvec[seg].bv_len); + kunmap_local(kaddr); } sge_len -= plen; @@ -567,13 +428,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) if (++seg > (int)MAX_ARRAY) { siw_dbg_qp(tx_qp(c_tx), "to many fragments\n"); - siw_unmap_pages(iov, kmap_mask, seg-1); wqe->processed -= c_tx->bytes_unsent; rv = -EMSGSIZE; goto done_crc; } } -sge_done: + /* Update SGE variables at end of SGE */ if (sge_off == sge->length && (data_len != 0 || wqe->processed < wqe->bytes)) { @@ -582,15 +442,8 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) sge_off = 0; } } - /* trailer */ - if (likely(c_tx->state != SIW_SEND_TRAILER)) { - iov[seg].iov_base = &c_tx->trailer.pad[4 - c_tx->pad]; - iov[seg].iov_len = trl_len = MAX_TRAILER - (4 - c_tx->pad); - } else { - iov[seg].iov_base = &c_tx->trailer.pad[c_tx->ctrl_sent]; - iov[seg].iov_len = trl_len = MAX_TRAILER - c_tx->ctrl_sent; - } + /* Set the CRC in the trailer */ if (c_tx->pad) { *(u32 *)c_tx->trailer.pad = 0; if (do_crc) @@ -603,23 +456,29 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) else if (do_crc) crypto_shash_final(c_tx->mpa_crc_hd, (u8 *)&c_tx->trailer.crc); - data_len = c_tx->bytes_unsent; - - if (c_tx->use_sendpage) { - rv = siw_0copy_tx(s, page_array, &wqe->sqe.sge[c_tx->sge_idx], - c_tx->sge_off, data_len); - if (rv == data_len) { - rv = kernel_sendmsg(s, &msg, &iov[seg], 1, trl_len); - if (rv > 0) - rv += data_len; - else - rv = data_len; - } + /* Copy the trailer and add it to the output list */ + if (likely(c_tx->state != SIW_SEND_TRAILER)) { + trl = &c_tx->trailer.pad[4 - c_tx->pad]; + trl_len = MAX_TRAILER - (4 - c_tx->pad); } else { - rv = kernel_sendmsg(s, &msg, iov, seg + 1, - hdr_len + data_len + trl_len); - siw_unmap_pages(iov, kmap_mask, seg); + trl = &c_tx->trailer.pad[c_tx->ctrl_sent]; + trl_len = MAX_TRAILER - c_tx->ctrl_sent; } + + rv = -ENOMEM; + t = page_frag_memdup(NULL, trl, trl_len, GFP_NOFS, ULONG_MAX); + if (!t) + goto done_crc; + bvec_set_virt(&bvec[seg], t, trl_len); + + data_len = c_tx->bytes_unsent; + + if (c_tx->use_sendpage) + msg.msg_flags |= MSG_SPLICE_PAGES; + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, seg + 1, + hdr_len + data_len + trl_len); + rv = sock_sendmsg(s, &msg); + if (rv < (int)hdr_len) { /* Not even complete hdr pushed or negative rv */ wqe->processed -= data_len; @@ -680,6 +539,9 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) } done_crc: c_tx->do_crc = 0; + if (c_tx->state == SIW_SEND_HDR) + folio_put(page_folio(bvec[0].bv_page)); + folio_put(page_folio(bvec[seg].bv_page)); done: return rv; } From patchwork Fri Mar 31 16:08:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77871 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp692351vqo; Fri, 31 Mar 2023 09:36:26 -0700 (PDT) X-Google-Smtp-Source: AKy350Y314QQTnu24U7pj9FEE+75D8YB64qvVUVMEPTjCBbKjxbaV0ZKIvRbjrN7OLbwwzsVFWHP X-Received: by 2002:a17:906:e112:b0:933:1b05:8851 with SMTP id gj18-20020a170906e11200b009331b058851mr26239286ejb.16.1680280586365; Fri, 31 Mar 2023 09:36:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280586; cv=none; d=google.com; s=arc-20160816; b=Yo4bMx7zf+YItgG1+gznAqWOcjcvNdiopsEsx2U7nNYKZvGaoI9jddCQNeEz5g1qpp j9qdzcyLckPWumFXRCXE+qNKrO4t7ytldbGq0BtLp3kixgTK0od56aKa0x1GR1NW4HzI 49zPRnvxe1lU3EtoMQVZ2u0MmwsKbZpmKjuTVplG+LHHgX2oZx0vLmknzGhvniNFYtwl vMpZg/HamjRNjMqzHyF4L9z1XHyn6Xwp/tVR6CeoIFBGzcy5m1uFlm44yIsX3A/NcNeg +7x80fsPTpOe2gAqDxc1qw3F2uDCJZB322j5jFlEIKLLNiU/IxbHTfpOrvK9gHHKzrG6 vf4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=mcxOsbWhFQyukCMWrPOKlIpXJGoD9GGTGJfL3NnBqs4=; b=bpYZFo4sHagiMxXWPr+JaGW5ZAMQbkVyH3W10SggHh8PvrxBvWVKtWBaoy/2BYE79r 1jCH6cXnNGhwzCg5t2c04hVmjM/dS7SkfxDiyLe4xyzQu6DroQkFCm3O+OTVtWfRA1Kn jLH+V+vNw9Ty05LchsppdvcpCNpxFf+P6MVrQdaAi4C1JPFds9bCjDR7REvHrT+V9SKZ vPTZ5aP3O+Qy0DvqvGc/ttNwzq8NkdO29hd8a4RY48BEBcWwsrDqGciqYG/8bDhOADNN YLy7wBTbDLNaCnblkzWbaYrM6BRwSfrRb6FzUs/+h0xGgoihWFPdAKX8yB6TCOblkpvK yxtQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=dfTsbcSK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id hs16-20020a1709073e9000b00947d7191944si706549ejc.639.2023.03.31.09.36.02; Fri, 31 Mar 2023 09:36:26 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=dfTsbcSK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233506AbjCaQSZ (ORCPT + 99 others); Fri, 31 Mar 2023 12:18:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45174 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233452AbjCaQRt (ORCPT ); Fri, 31 Mar 2023 12:17:49 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 653E727823 for ; Fri, 31 Mar 2023 09:11:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279074; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mcxOsbWhFQyukCMWrPOKlIpXJGoD9GGTGJfL3NnBqs4=; b=dfTsbcSKut/DQpIdmO+hZIOUX8TCee/7cDEyOQ/eIsXh3C99iDxj7q6fI2DV3kBW4+dFJg uegJaN2XBiaWPjw7jhM/O+LF17LFy/uXE2oc7QZm8g1xFiHKcuMYL7IPiYrIPp9lETGXLM Isp9lh11a67Sx8SvQENyByCdl65JEUQ= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-519-amCcfeoRO16U4csq6Trr-w-1; Fri, 31 Mar 2023 12:11:11 -0400 X-MC-Unique: amCcfeoRO16U4csq6Trr-w-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id BE84088904E; Fri, 31 Mar 2023 16:11:08 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 97DD02166B33; Fri, 31 Mar 2023 16:11:06 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ilya Dryomov , Xiubo Li , ceph-devel@vger.kernel.org Subject: [PATCH v3 39/55] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Fri, 31 Mar 2023 17:08:58 +0100 Message-Id: <20230331160914.1608208-40-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901896520153839?= X-GMAIL-MSGID: =?utf-8?q?1761901896520153839?= Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when transmitting data. For the moment, this can only transmit one page at a time because of the architecture of net/ceph/, but if write_partial_message_data() can be given a bvec[] at a time by the iteration code, this would allow pages to be sent in a batch. Signed-off-by: David Howells cc: Ilya Dryomov cc: Xiubo Li cc: Jeff Layton cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: ceph-devel@vger.kernel.org cc: netdev@vger.kernel.org --- net/ceph/messenger_v1.c | 58 ++++++++++++++--------------------------- 1 file changed, 19 insertions(+), 39 deletions(-) diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c index d664cb1593a7..b2d801a49122 100644 --- a/net/ceph/messenger_v1.c +++ b/net/ceph/messenger_v1.c @@ -74,37 +74,6 @@ static int ceph_tcp_sendmsg(struct socket *sock, struct kvec *iov, return r; } -/* - * @more: either or both of MSG_MORE and MSG_SENDPAGE_NOTLAST - */ -static int ceph_tcp_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int more) -{ - ssize_t (*sendpage)(struct socket *sock, struct page *page, - int offset, size_t size, int flags); - int flags = MSG_DONTWAIT | MSG_NOSIGNAL | more; - int ret; - - /* - * sendpage cannot properly handle pages with page_count == 0, - * we need to fall back to sendmsg if that's the case. - * - * Same goes for slab pages: skb_can_coalesce() allows - * coalescing neighboring slab objects into a single frag which - * triggers one of hardened usercopy checks. - */ - if (sendpage_ok(page)) - sendpage = sock->ops->sendpage; - else - sendpage = sock_no_sendpage; - - ret = sendpage(sock, page, offset, size, flags); - if (ret == -EAGAIN) - ret = 0; - - return ret; -} - static void con_out_kvec_reset(struct ceph_connection *con) { BUG_ON(con->v1.out_skip); @@ -464,7 +433,6 @@ static int write_partial_message_data(struct ceph_connection *con) struct ceph_msg *msg = con->out_msg; struct ceph_msg_data_cursor *cursor = &msg->cursor; bool do_datacrc = !ceph_test_opt(from_msgr(con->msgr), NOCRC); - int more = MSG_MORE | MSG_SENDPAGE_NOTLAST; u32 crc; dout("%s %p msg %p\n", __func__, con, msg); @@ -482,6 +450,10 @@ static int write_partial_message_data(struct ceph_connection *con) */ crc = do_datacrc ? le32_to_cpu(msg->footer.data_crc) : 0; while (cursor->total_resid) { + struct bio_vec bvec; + struct msghdr msghdr = { + .msg_flags = MSG_SPLICE_PAGES | MSG_SENDPAGE_NOTLAST, + }; struct page *page; size_t page_offset; size_t length; @@ -494,9 +466,12 @@ static int write_partial_message_data(struct ceph_connection *con) page = ceph_msg_data_next(cursor, &page_offset, &length); if (length == cursor->total_resid) - more = MSG_MORE; - ret = ceph_tcp_sendpage(con->sock, page, page_offset, length, - more); + msghdr.msg_flags |= MSG_MORE; + + bvec_set_page(&bvec, page, length, page_offset); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, length); + + ret = sock_sendmsg(con->sock, &msghdr); if (ret <= 0) { if (do_datacrc) msg->footer.data_crc = cpu_to_le32(crc); @@ -526,7 +501,10 @@ static int write_partial_message_data(struct ceph_connection *con) */ static int write_partial_skip(struct ceph_connection *con) { - int more = MSG_MORE | MSG_SENDPAGE_NOTLAST; + struct bio_vec bvec; + struct msghdr msghdr = { + .msg_flags = MSG_SPLICE_PAGES | MSG_SENDPAGE_NOTLAST | MSG_MORE, + }; int ret; dout("%s %p %d left\n", __func__, con, con->v1.out_skip); @@ -534,9 +512,11 @@ static int write_partial_skip(struct ceph_connection *con) size_t size = min(con->v1.out_skip, (int)PAGE_SIZE); if (size == con->v1.out_skip) - more = MSG_MORE; - ret = ceph_tcp_sendpage(con->sock, ceph_zero_page, 0, size, - more); + msghdr.msg_flags &= ~MSG_SENDPAGE_NOTLAST; + bvec_set_page(&bvec, ZERO_PAGE(0), size, 0); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size); + + ret = sock_sendmsg(con->sock, &msghdr); if (ret <= 0) goto out; con->v1.out_skip -= ret; From patchwork Fri Mar 31 16:08:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77890 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp696404vqo; Fri, 31 Mar 2023 09:43:22 -0700 (PDT) X-Google-Smtp-Source: AKy350YsTIV0oo2+LQeH9frOrSSD8ze+vvbfN5n9tbiqSxwMehiztjoVJtMhu3OMJAJLj1UJBFDK X-Received: by 2002:a05:6402:455:b0:4bf:c590:3c57 with SMTP id p21-20020a056402045500b004bfc5903c57mr25299953edw.2.1680281001773; Fri, 31 Mar 2023 09:43:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680281001; cv=none; d=google.com; s=arc-20160816; b=qCQWO3VbXVwJ0wJImOSJi3IXmmgmlH7O5wwVQigWR0oh74bKlMV0QDqwAYy69EjwhV DjEs4bLFpgYF7bxMKW3vNGPZ61mo9AzmxOjxU/enfy4sVWg3BLg07w3aihlHGsooQAAP PbUCov3Wzk59LYT2Va08KchTRGJvAMVPbwbhRQ+dkdDFYbKfSihMF/0MSgqfkxAe6EKv t7v2xFxdPBB44KxeQdlRBLvvkibd8XghJTq09RGXkJbGkV86zaBwmA8pVcMPX2/u9w0+ WuJtKyg/SekyaWy5o7oX2OBZVPHBbmA2cDtPlBW5EK6ucll8LrRh+nXtzG8W6jvUQJKb mi0g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=gZm+OH3nuU2w5lgBIDJ5zMpRPGFGHnStb+oddo8E9cs=; b=kbG47wzNOaf3TclhbAwtL4IOBoBqQ3juW35ZE4gtTinT5qGcMAuB5X3lbyt4yPXdwv gZ1JaKvj5/VV2HbfK+sf/b7uY+cr5W9MEkb72W8To/fAgbgpl6C+EEZY7gt7KVBqixNY yuHGDONTcG7JUXxfn1lRhGO0QS5i7ba4gKniK9yvrrN3LudP0IEe4BcjuPmQAlJTBw4w f35GFwxdlFn9ofG9Q2QMN/27i1UTywpUt7/hYRl0D+EVIuNIPURi7MzTW7M34BZ+BxFp fTQjk5mpESI/Iin8zwPZAJj0u4sCD3xHzYg68hIISwGzA8A2pavsVlsmIFYYcVD+1tH4 nGZw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=fkkjDuP3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id me17-20020a170906aed100b00921792d069dsi2183275ejb.608.2023.03.31.09.42.58; Fri, 31 Mar 2023 09:43:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=fkkjDuP3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232429AbjCaQkm (ORCPT + 99 others); Fri, 31 Mar 2023 12:40:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60614 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232016AbjCaQkY (ORCPT ); Fri, 31 Mar 2023 12:40:24 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4F84724428 for ; Fri, 31 Mar 2023 09:35:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680280539; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gZm+OH3nuU2w5lgBIDJ5zMpRPGFGHnStb+oddo8E9cs=; b=fkkjDuP3HWI8tHVjRzjAkEXMX8/VkyYLHCMGyb62XSNPml4jWxHsijZrWEIV6CwDvubioE w0ExmGXXu6FdJ9dkMobuBt/u4QuVtAED3KIonLz8euHd83zwjUWHeCrjeY9bwHfksZtAoe 7vvhgO6tvgQeCDt/o36ai2sPqcbEmlg= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-641-aQ6YNqtSNLCDJnr_-8NcqQ-1; Fri, 31 Mar 2023 12:11:12 -0400 X-MC-Unique: aQ6YNqtSNLCDJnr_-8NcqQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 83939802D1A; Fri, 31 Mar 2023 16:11:11 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5AF014020C82; Fri, 31 Mar 2023 16:11:09 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Martin K. Petersen" , linux-scsi@vger.kernel.org, target-devel@vger.kernel.org Subject: [PATCH v3 40/55] iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Fri, 31 Mar 2023 17:08:59 +0100 Message-Id: <20230331160914.1608208-41-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761902331777701275?= X-GMAIL-MSGID: =?utf-8?q?1761902331777701275?= Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage. This allows multiple pages and multipage folios to be passed through. TODO: iscsit_fe_sendpage_sg() should perhaps set up a bio_vec array for the entire set of pages it's going to transfer plus two for the header and trailer and page fragments to hold the header and trailer - and then call sendmsg once for the entire message. Signed-off-by: David Howells cc: "Martin K. Petersen" cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-scsi@vger.kernel.org cc: target-devel@vger.kernel.org cc: netdev@vger.kernel.org --- drivers/scsi/iscsi_tcp.c | 31 ++++++++++++------------ drivers/scsi/iscsi_tcp.h | 2 +- drivers/target/iscsi/iscsi_target_util.c | 14 ++++++----- 3 files changed, 24 insertions(+), 23 deletions(-) diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c index c76f82fb8b63..cf3eb55d2a76 100644 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -301,35 +301,37 @@ static int iscsi_sw_tcp_xmit_segment(struct iscsi_tcp_conn *tcp_conn, while (!iscsi_tcp_segment_done(tcp_conn, segment, 0, r)) { struct scatterlist *sg; + struct msghdr msg = {}; + union { + struct kvec kv; + struct bio_vec bv; + } vec; unsigned int offset, copy; - int flags = 0; r = 0; offset = segment->copied; copy = segment->size - offset; if (segment->total_copied + segment->size < segment->total_size) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; if (tcp_sw_conn->queue_recv) - flags |= MSG_DONTWAIT; + msg.msg_flags |= MSG_DONTWAIT; - /* Use sendpage if we can; else fall back to sendmsg */ if (!segment->data) { + if (tcp_conn->iscsi_conn->datadgst_en) + msg.msg_flags |= MSG_SPLICE_PAGES; sg = segment->sg; offset += segment->sg_offset + sg->offset; - r = tcp_sw_conn->sendpage(sk, sg_page(sg), offset, - copy, flags); + bvec_set_page(&vec.bv, sg_page(sg), copy, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &vec.bv, 1, copy); } else { - struct msghdr msg = { .msg_flags = flags }; - struct kvec iov = { - .iov_base = segment->data + offset, - .iov_len = copy - }; - - r = kernel_sendmsg(sk, &msg, &iov, 1, copy); + vec.kv.iov_base = segment->data + offset; + vec.kv.iov_len = copy; + iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, &vec.kv, 1, copy); } + r = sock_sendmsg(sk, &msg); if (r < 0) { iscsi_tcp_segment_unmap(segment); return r; @@ -746,7 +748,6 @@ iscsi_sw_tcp_conn_bind(struct iscsi_cls_session *cls_session, sock_no_linger(sk); iscsi_sw_tcp_conn_set_callbacks(conn); - tcp_sw_conn->sendpage = tcp_sw_conn->sock->ops->sendpage; /* * set receive state machine into initial state */ @@ -778,8 +779,6 @@ static int iscsi_sw_tcp_conn_set_param(struct iscsi_cls_conn *cls_conn, mutex_unlock(&tcp_sw_conn->sock_lock); return -ENOTCONN; } - tcp_sw_conn->sendpage = conn->datadgst_en ? - sock_no_sendpage : tcp_sw_conn->sock->ops->sendpage; mutex_unlock(&tcp_sw_conn->sock_lock); break; case ISCSI_PARAM_MAX_R2T: diff --git a/drivers/scsi/iscsi_tcp.h b/drivers/scsi/iscsi_tcp.h index 68e14a344904..d6ec08d7eb63 100644 --- a/drivers/scsi/iscsi_tcp.h +++ b/drivers/scsi/iscsi_tcp.h @@ -48,7 +48,7 @@ struct iscsi_sw_tcp_conn { uint32_t sendpage_failures_cnt; uint32_t discontiguous_hdr_cnt; - ssize_t (*sendpage)(struct socket *, struct page *, int, size_t, int); + bool can_splice_to_tcp; }; struct iscsi_sw_tcp_host { diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c index 26dc8ed3045b..c7d58e41ac3b 100644 --- a/drivers/target/iscsi/iscsi_target_util.c +++ b/drivers/target/iscsi/iscsi_target_util.c @@ -1078,6 +1078,8 @@ int iscsit_fe_sendpage_sg( struct iscsit_conn *conn) { struct scatterlist *sg = cmd->first_data_sg; + struct bio_vec bvec; + struct msghdr msghdr = { .msg_flags = MSG_SPLICE_PAGES, }; struct kvec iov; u32 tx_hdr_size, data_len; u32 offset = cmd->first_data_sg_off; @@ -1121,17 +1123,17 @@ int iscsit_fe_sendpage_sg( u32 space = (sg->length - offset); u32 sub_len = min_t(u32, data_len, space); send_pg: - tx_sent = conn->sock->ops->sendpage(conn->sock, - sg_page(sg), sg->offset + offset, sub_len, 0); + bvec_set_page(&bvec, sg_page(sg), sub_len, sg->offset + offset); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, sub_len); + + tx_sent = conn->sock->ops->sendmsg(conn->sock, &msghdr, sub_len); if (tx_sent != sub_len) { if (tx_sent == -EAGAIN) { - pr_err("tcp_sendpage() returned" - " -EAGAIN\n"); + pr_err("sendmsg/splice returned -EAGAIN\n"); goto send_pg; } - pr_err("tcp_sendpage() failure: %d\n", - tx_sent); + pr_err("sendmsg/splice failure: %d\n", tx_sent); return -1; } From patchwork Fri Mar 31 16:09:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77860 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp690870vqo; Fri, 31 Mar 2023 09:34:08 -0700 (PDT) X-Google-Smtp-Source: AKy350bfag8M/2kBeYbvTC9NHSM/SsDiNgEi+RtdxZdcclgHU1Ky9lncXejs60gsThzF/F1QzPUy X-Received: by 2002:a17:90b:38c5:b0:23d:3f32:1cd5 with SMTP id nn5-20020a17090b38c500b0023d3f321cd5mr29738507pjb.26.1680280448236; Fri, 31 Mar 2023 09:34:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280448; cv=none; d=google.com; s=arc-20160816; b=T3lKUWLWaIkCGBt7hQusfE0F8pxMBL3P70/w5AzWDw7y/wVXCQeV5SxPjKue6eGTUt UeBAqUFfqe47iDS8YHgTzSN0iCzPoyVo7axu0/7tbGNEXkr3wdV9+24kHDszTfDuShWV 6cFZYrn+6GxbwSRYmW3EqR7N3v3OgjFoacctbY/ULuObEY6nf+X5E5wcv58nS6YtC4yX vjzvSrAG3XmFLZLmsDz87WKx1k7Cb1rFkLI7JrZ3JZN6DvdIOM2tEMLJXMofy2aj2c7v h654d6xswrpHVJzrHPNRFfwMEg6lp/Zi21y8U6qc4YOIxxOwNQ+Z5EcAMRKoC39qEZSN F/TQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=0V0VYuNRGUsctVySLnEH8GinGQ4l5nQcc0Sx6Q5aq0A=; b=Yp6VpcYqcycKg1LZAAr6qJpVBaHPIpw7GUFuNTK7GvK3HE27OWMxHPu0gTixFIFTVj FlS0WO9EUH/Bs+0QcrNjSuYskq44BSFUUciQaVh2CXHO7nsh3Ef9JNZMism7JwfQwrJa jz3HT7B4SUq8aI/EeOWMsJNgtNmtXWQO2AgjfWc5MonvZiyOI8gYgEaKPIYrswu2wGwf aa0x82bVEaAWeEvMhyGstHz4ge+hsKtsQN/hhEDW4hIv2QaPzg2Bu800ZpVVQPBEOHYQ 59T41znnIihNSDQCorfAxHtdwzMEqseG8XADOpX1segwBuTT4WNBwVGZ8ei4OOyYqp7y haoQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BT6zI+kr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z19-20020a63e113000000b00513234112b3si2743380pgh.895.2023.03.31.09.33.55; Fri, 31 Mar 2023 09:34:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BT6zI+kr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231905AbjCaQZZ (ORCPT + 99 others); Fri, 31 Mar 2023 12:25:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55476 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231429AbjCaQYg (ORCPT ); Fri, 31 Mar 2023 12:24:36 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5DC98236AC for ; Fri, 31 Mar 2023 09:18:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279451; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0V0VYuNRGUsctVySLnEH8GinGQ4l5nQcc0Sx6Q5aq0A=; b=BT6zI+krlRfHBKWZr92YUDhKk7lsqDlEC+JZ/0PRUluCkYjAYjEJ1y0V9uzEaXFIh4AYCp F4tQBKg/HJ4naBElISDM96BJ1xCHPOGxlZABb6YvXFCjREaxl3G5P0Uhpk5FJDF9+5JQNs 0MLs72OiglXhJ1tHGGiIPBREyaz4AXg= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-335-JJSK7rSYNzmbFtG3J_VYoA-1; Fri, 31 Mar 2023 12:11:15 -0400 X-MC-Unique: JJSK7rSYNzmbFtG3J_VYoA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6C1CF29ABA1F; Fri, 31 Mar 2023 16:11:14 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 42ABB40B3EDB; Fri, 31 Mar 2023 16:11:12 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Martin K. Petersen" , linux-scsi@vger.kernel.org, target-devel@vger.kernel.org Subject: [PATCH v3 41/55] iscsi: Assume "sendpage" is okay in iscsi_tcp_segment_map() Date: Fri, 31 Mar 2023 17:09:00 +0100 Message-Id: <20230331160914.1608208-42-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901751464310373?= X-GMAIL-MSGID: =?utf-8?q?1761901751464310373?= As iscsi is now using sendmsg() with MSG_SPLICE_PAGES rather than sendpage, assume that sendpage_ok() will return true in iscsi_tcp_segment_map() and leave it to TCP to copy the data if not. Signed-off-by: David Howells cc: "Martin K. Petersen" cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-scsi@vger.kernel.org cc: target-devel@vger.kernel.org cc: netdev@vger.kernel.org --- drivers/scsi/libiscsi_tcp.c | 13 +++---------- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c index c182aa83f2c9..07ba0d864820 100644 --- a/drivers/scsi/libiscsi_tcp.c +++ b/drivers/scsi/libiscsi_tcp.c @@ -128,18 +128,11 @@ static void iscsi_tcp_segment_map(struct iscsi_segment *segment, int recv) * coalescing neighboring slab objects into a single frag which * triggers one of hardened usercopy checks. */ - if (!recv && sendpage_ok(sg_page(sg))) + if (!recv) return; - if (recv) { - segment->atomic_mapped = true; - segment->sg_mapped = kmap_atomic(sg_page(sg)); - } else { - segment->atomic_mapped = false; - /* the xmit path can sleep with the page mapped so use kmap */ - segment->sg_mapped = kmap(sg_page(sg)); - } - + segment->atomic_mapped = true; + segment->sg_mapped = kmap_atomic(sg_page(sg)); segment->data = segment->sg_mapped + sg->offset + segment->sg_offset; } From patchwork Fri Mar 31 16:09:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77909 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp706708vqo; Fri, 31 Mar 2023 10:01:50 -0700 (PDT) X-Google-Smtp-Source: AKy350YXq2u/+8rI63jKzqMuxUlCG48Ly/VtgvZsGe6kMQL1hzEv0cXVsgLqleuKhtA9FwfZk6qU X-Received: by 2002:a17:902:d101:b0:19e:7b09:bd4d with SMTP id w1-20020a170902d10100b0019e7b09bd4dmr22730533plw.47.1680282110443; Fri, 31 Mar 2023 10:01:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680282110; cv=none; d=google.com; s=arc-20160816; b=UZBLBX4VZnI6i+CZTVk6jbX7VcpEdDigLdSTbgBoDOIMxGpVUeWRnJoToLpxX5q0xx eD+SGsNcCo0F+2R/aUiBuTrIhWmUjKY8nlx1+VFzUIPkYdF8V1O1w/rg9yJdXvdV4Sn8 giCT95MCKPFeZVudTmyicD2NvDu6cdYjm3X9bIq76+St1jSQV91BgTM/V6erlj6/HnyW PfjZp8gBEIray0Ak/TNCnav4W237xAbtLoiWWRloT+tTz9Rsvk5Z6PaXSgi9RBBGd7fG GApP00S+2q6jDzkAvKiWfZ6Iu/EtN7v0BVZq7HMG9Fzi4jhh5xeAk4b7Cb9WyWVK09Qe S9+w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=cLmy3QT3PxPcXOXP8Tmrcj7ottmHeL6VPbhO66kxXIU=; b=L0yd8ShYqX+/S+l7pKoGQiT94757MHcTsgOjJ1cP/FyLkcMha/RX2/A1G783H84LL9 YwaTJaj8Z8upaImT64NjsHtWkJcyeY0Z16KqGQ7jZuC5+W7wG074dl5q9oF1trSBAyee 18hcdIKtMCTIiFWXRmbcbsAyadw5CWgPIdygMiTa1fvAIvOFPNNRZIa5sVMhpDVBtF8B vtzs8qYjuWGkisykZR1edprKbbgxy6aJLLgYYMmZ2Kfjjv4Hg4U0LTSGl1b5CbkW9dQI ALsUurXNo0ivWDzebxssnT9ooSfMAc1VhtgerLC0kzBMbA9m0N9VpyCuVYBa64Pqu28T Ctpw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=R7uY8zA8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ke5-20020a170903340500b0019a826d304dsi2225578plb.633.2023.03.31.10.01.38; Fri, 31 Mar 2023 10:01:50 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=R7uY8zA8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233317AbjCaQbN (ORCPT + 99 others); Fri, 31 Mar 2023 12:31:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44908 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232303AbjCaQa6 (ORCPT ); Fri, 31 Mar 2023 12:30:58 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B1F7523B53 for ; Fri, 31 Mar 2023 09:25:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279903; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cLmy3QT3PxPcXOXP8Tmrcj7ottmHeL6VPbhO66kxXIU=; b=R7uY8zA8br+H0XhFiGvK9lT8A/4Nuh3ojrlKUUTgAG6PQyYVA+3PikW9m3vk2mh4tMBuw7 RRGg+AHopwurzGwhwXgikEKYtOA2yUm/tVA2Y4MH9uTMmowaBOi17vPhzJ/IOFUHAfQIWR JAYxjB6tGvJweCnjUUW8JUSo0UI1xjc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-340-Fj3IfQ9LOuGHr1f7veDhWA-1; Fri, 31 Mar 2023 12:11:18 -0400 X-MC-Unique: Fj3IfQ9LOuGHr1f7veDhWA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 30B60886065; Fri, 31 Mar 2023 16:11:17 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 072FE202701E; Fri, 31 Mar 2023 16:11:14 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Fastabend , Jakub Sitnicki , bpf@vger.kernel.org Subject: [PATCH v3 42/55] tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES) Date: Fri, 31 Mar 2023 17:09:01 +0100 Message-Id: <20230331160914.1608208-43-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761903494088220825?= X-GMAIL-MSGID: =?utf-8?q?1761903494088220825?= Translate tcp_bpf_sendpage() calls to tcp_bpf_sendmsg(MSG_SPLICE_PAGES). Signed-off-by: David Howells cc: John Fastabend cc: Jakub Sitnicki cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: bpf@vger.kernel.org cc: netdev@vger.kernel.org --- net/ipv4/tcp_bpf.c | 49 +++++++++------------------------------------- 1 file changed, 9 insertions(+), 40 deletions(-) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 7f17134637eb..de37a4372437 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -485,49 +485,18 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct sk_msg tmp, *msg = NULL; - int err = 0, copied = 0; - struct sk_psock *psock; - bool enospc = false; - - psock = sk_psock_get(sk); - if (unlikely(!psock)) - return tcp_sendpage(sk, page, offset, size, flags); + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = flags | MSG_SPLICE_PAGES, + }; - lock_sock(sk); - if (psock->cork) { - msg = psock->cork; - } else { - msg = &tmp; - sk_msg_init(msg); - } + bvec_set_page(&bvec, page, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - /* Catch case where ring is full and sendpage is stalled. */ - if (unlikely(sk_msg_full(msg))) - goto out_err; - - sk_msg_page_add(msg, page, size, offset); - sk_mem_charge(sk, size); - copied = size; - if (sk_msg_full(msg)) - enospc = true; - if (psock->cork_bytes) { - if (size > psock->cork_bytes) - psock->cork_bytes = 0; - else - psock->cork_bytes -= size; - if (psock->cork_bytes && !enospc) - goto out_err; - /* All cork bytes are accounted, rerun the prog. */ - psock->eval = __SK_NONE; - psock->cork_bytes = 0; - } + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; - err = tcp_bpf_send_verdict(sk, psock, msg, &copied, flags); -out_err: - release_sock(sk); - sk_psock_put(sk, psock); - return copied ? copied : err; + return tcp_bpf_sendmsg(sk, &msg, size); } enum { From patchwork Fri Mar 31 16:09:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77846 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp685695vqo; Fri, 31 Mar 2023 09:26:40 -0700 (PDT) X-Google-Smtp-Source: AKy350YifuVe8dhZ6RScKPbDdpV8gTQRdWQLSux0P2HZn6vAv2KSOSSL/AVVnflyMZWfsTN6rOxd X-Received: by 2002:a17:906:c791:b0:8f2:62a9:6159 with SMTP id cw17-20020a170906c79100b008f262a96159mr29747350ejb.2.1680280000249; Fri, 31 Mar 2023 09:26:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280000; cv=none; d=google.com; s=arc-20160816; b=l/4WVujVD0PoD4Zxh4b0MG9mzbMorTxlYUHDhYnIV9fgbUb8lGq/wE9qh8vMA8XxXc F9YHZFqzY9EjS9zmvxQ8+uKHqRXo7Mj18EZwfFmMJ8T1AfXfGc4lR90upJjO71zsojXt rhJuYhlAuCT6+HYmjYXaJ06KkNLRY72ltcda4tZkT0sJqZfmATefQ+OX2pQO+8CWypgR lb7XDtJl89/mSXSMjV/he4Fl5M9+EH24rvMJfblLiqVLakrioGHqzr3fMz2uNT+Mk9xx Ur86IQuKT5n0LJDVxCXgNd0XrkhQkvLGz6Nw8UkbjCh1UuWHOgPB22x0aQR9tb0LX4zC dz6g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=lLisHPaWAEED9gZclxyhbGIkHvgJDoUiXuyeeqOKByg=; b=eeXPC1Ilf7jsfFRmVRzolYaTaWRvm2dlQOGQNl+MBn+BdaFPUwQqhZdAdqyUI9AJzG oRr6kVq3Oxtk7a0y/MTR/JPbdFqhr4JCKHqkJfScrwJKM1GTBzlFNS5qMffgvHrDvuEK NpuDqUuj+0ovHydtStf2u7/LnfWU/SloewORZi9VRBjjadc/4qUpgMqYvFd8zo7LCb+2 dVUrgESMegc3M02TZk99dDa9rfP+rL3gkr0C1ucsH3/DulAW66lGwUwnjw4iNtzjvTVe pI4AMWB5N21hvHRRxE2e6pjt/MJcKbd4Wylbip23RYRThhqqPZfFLVuwW4vcZTVWy39R 13Mg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TpWUvvWH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id mj19-20020a170906af9300b009334ffd4422si2051460ejb.93.2023.03.31.09.26.15; Fri, 31 Mar 2023 09:26:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TpWUvvWH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233473AbjCaQR6 (ORCPT + 99 others); Fri, 31 Mar 2023 12:17:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44248 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233126AbjCaQRb (ORCPT ); Fri, 31 Mar 2023 12:17:31 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81332D4FAD for ; Fri, 31 Mar 2023 09:11:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279082; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lLisHPaWAEED9gZclxyhbGIkHvgJDoUiXuyeeqOKByg=; b=TpWUvvWHw4prylM0PXuGjepiRcHc6g52dbf0+Pmhk3HyQ7tEZ6aHgLtEipamrlwmqlCnqt 8bRSQPG1Ai1clNPIlAKJZKU4P2ckcf00djG27RrWVubont0+VO9D7hhOPax9zc4r9qciCb OFvyxRz1ZWYz9wiAArWJHqsXDMWcL6A= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-626-FA2oWpmKPhmcnajIovqg2A-1; Fri, 31 Mar 2023 12:11:20 -0400 X-MC-Unique: FA2oWpmKPhmcnajIovqg2A-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C7A3129ABA1C; Fri, 31 Mar 2023 16:11:19 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id E4EDD2027040; Fri, 31 Mar 2023 16:11:17 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 43/55] net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock() Date: Fri, 31 Mar 2023 17:09:02 +0100 Message-Id: <20230331160914.1608208-44-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901281545809376?= X-GMAIL-MSGID: =?utf-8?q?1761901281545809376?= Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage in skb_send_sock(). This causes pages to be spliced from the source iterator if possible (the iterator must be ITER_BVEC and the pages must be spliceable). This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Note that this could perhaps be improved to fill out a bvec array with all the frags and then make a single sendmsg call, possibly sticking the header on the front also. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/core/skbuff.c | 49 ++++++++++++++++++++++++++--------------------- 1 file changed, 27 insertions(+), 22 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 0506e4cf1ed9..3693b3526d33 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -2919,32 +2919,32 @@ int skb_splice_bits(struct sk_buff *skb, struct sock *sk, unsigned int offset, } EXPORT_SYMBOL_GPL(skb_splice_bits); -static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg, - struct kvec *vec, size_t num, size_t size) +static int sendmsg_locked(struct sock *sk, struct msghdr *msg) { struct socket *sock = sk->sk_socket; + size_t size = msg_data_left(msg); if (!sock) return -EINVAL; - return kernel_sendmsg(sock, msg, vec, num, size); + + if (!sock->ops->sendmsg_locked) + return sock_no_sendmsg_locked(sk, msg, size); + + return sock->ops->sendmsg_locked(sk, msg, size); } -static int sendpage_unlocked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) +static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg) { struct socket *sock = sk->sk_socket; if (!sock) return -EINVAL; - return kernel_sendpage(sock, page, offset, size, flags); + return sock_sendmsg(sock, msg); } -typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg, - struct kvec *vec, size_t num, size_t size); -typedef int (*sendpage_func)(struct sock *sk, struct page *page, int offset, - size_t size, int flags); +typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg); static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, - int len, sendmsg_func sendmsg, sendpage_func sendpage) + int len, sendmsg_func sendmsg) { unsigned int orig_len = len; struct sk_buff *head = skb; @@ -2964,8 +2964,9 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, memset(&msg, 0, sizeof(msg)); msg.msg_flags = MSG_DONTWAIT; - ret = INDIRECT_CALL_2(sendmsg, kernel_sendmsg_locked, - sendmsg_unlocked, sk, &msg, &kv, 1, slen); + iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, &kv, 1, slen); + ret = INDIRECT_CALL_2(sendmsg, sendmsg_locked, + sendmsg_unlocked, sk, &msg); if (ret <= 0) goto error; @@ -2996,11 +2997,17 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, slen = min_t(size_t, len, skb_frag_size(frag) - offset); while (slen) { - ret = INDIRECT_CALL_2(sendpage, kernel_sendpage_locked, - sendpage_unlocked, sk, - skb_frag_page(frag), - skb_frag_off(frag) + offset, - slen, MSG_DONTWAIT); + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT, + }; + + bvec_set_page(&bvec, skb_frag_page(frag), slen, + skb_frag_off(frag) + offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, slen); + + ret = INDIRECT_CALL_2(sendmsg, sendmsg_locked, + sendmsg_unlocked, sk, &msg); if (ret <= 0) goto error; @@ -3037,16 +3044,14 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset, int len) { - return __skb_send_sock(sk, skb, offset, len, kernel_sendmsg_locked, - kernel_sendpage_locked); + return __skb_send_sock(sk, skb, offset, len, sendmsg_locked); } EXPORT_SYMBOL_GPL(skb_send_sock_locked); /* Send skb data on a socket. Socket must be unlocked. */ int skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, int len) { - return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked, - sendpage_unlocked); + return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked); } /** From patchwork Fri Mar 31 16:09:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77850 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp687731vqo; Fri, 31 Mar 2023 09:29:48 -0700 (PDT) X-Google-Smtp-Source: AKy350aNTRLnwv2QloW5dck+qQEa9fl3IwrWF8uP3vGtmoof9cz30D5oMkMpUXFpRIb2/Hemt7f8 X-Received: by 2002:a17:902:d4c6:b0:1a1:cef2:acd4 with SMTP id o6-20020a170902d4c600b001a1cef2acd4mr11538838plg.21.1680280188573; Fri, 31 Mar 2023 09:29:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280188; cv=none; d=google.com; s=arc-20160816; b=c+k5TcMwe2U/XsbYYNnQN6/z4vvAjuF2KLXCYSb7t+UTEaQgcrcgKNLw+3aBEyFl9V WiklASd+Hojsj2EbHuaJNM36nJn8dZIIr0taZmYJe5QiR+gZmPORGrbI4x9FZWOvQ0sA 61mtPqvcFoHX1jnQp8vJ8cKsZybx5trLXXm4Lt+sOc1CB7j71g4/sn867DFn13vCW1Uz yVXJ+EfiA37/stugQcLMfK3uKfUHek5OE7FSgDaBBGBwvB8EKa21rYbFu/hjtuP1X2Pg L/MBMkTd9lzZVzeAxJX8vuzwibAFpLH3ag1UAFw2lppcCg9Jy+cK7bMq26PQ6Gk2ArtW /Opw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=taClZ7tV+6NGiC2MYzyqiyjdYhpNZJPUqY1OQ7XAIPM=; b=uatH20e1WxeJ7QrjVJ6RBqhtSAb5zfBK/jkJuY57UPnvyxWA/0l/QsfjIaTlxbJmL2 89sk2fUhqsykw86Fy1KwWOWSprpM6P2afSfF6x6KFtfAxwCvkkHA/rZbDXGnqp9przTh tmymIgLxfXwVteDNUMxlhKlLYhYL4GJN+kFsCbu+piYJZjgrGR0hRuiH2f7/wuQUaZjM yB8EL3Dl9jl7lY+IO1RgmLmFcSgP5piY6xj8tctLjcaFNfMl3TfuKa+1Skl+HLatUBmq 0C0nitBeq8LHz17b7jD7L1sHjk4A6gDkSgq+QqC8PxWIw1CkjNt6AZwO/amG1G9f7jZH SVnw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=R0Sg5eMd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r20-20020a170902be1400b0019cd86735f1si2457959pls.481.2023.03.31.09.29.36; Fri, 31 Mar 2023 09:29:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=R0Sg5eMd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233519AbjCaQSb (ORCPT + 99 others); Fri, 31 Mar 2023 12:18:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44096 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230354AbjCaQRx (ORCPT ); Fri, 31 Mar 2023 12:17:53 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E79FC27834 for ; Fri, 31 Mar 2023 09:12:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279085; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=taClZ7tV+6NGiC2MYzyqiyjdYhpNZJPUqY1OQ7XAIPM=; b=R0Sg5eMdYDWAmJU4bBXEz48OcwQgL8pj88EwXyY0WpMwdq7L4bds1imLsE+M1+1UiyizXr T8Mg30P6csW6IeZxcHR/8oud79064trYOqjQySck5LIYEuuk/avtRTrTkY70842UOE12PX P5jFkV09zMOgt7IdRxYJo3UWyWWUzNE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-321-c9EO5cpKMmGD23YMrtD9Yg-1; Fri, 31 Mar 2023 12:11:23 -0400 X-MC-Unique: c9EO5cpKMmGD23YMrtD9Yg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 73F4B811E7C; Fri, 31 Mar 2023 16:11:22 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 628494020C82; Fri, 31 Mar 2023 16:11:20 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu , linux-crypto@vger.kernel.org Subject: [PATCH v3 44/55] algif: Remove hash_sendpage*() Date: Fri, 31 Mar 2023 17:09:03 +0100 Message-Id: <20230331160914.1608208-45-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901478902818844?= X-GMAIL-MSGID: =?utf-8?q?1761901478902818844?= Remove hash_sendpage*().. Signed-off-by: David Howells cc: Herbert Xu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-crypto@vger.kernel.org cc: netdev@vger.kernel.org --- crypto/algif_hash.c | 66 --------------------------------------------- 1 file changed, 66 deletions(-) diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c index b89c2c50cecc..dc6c45637b2d 100644 --- a/crypto/algif_hash.c +++ b/crypto/algif_hash.c @@ -162,58 +162,6 @@ static int hash_sendmsg(struct socket *sock, struct msghdr *msg, goto unlock; } -static ssize_t hash_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - struct sock *sk = sock->sk; - struct alg_sock *ask = alg_sk(sk); - struct hash_ctx *ctx = ask->private; - int err; - - if (flags & MSG_SENDPAGE_NOTLAST) - flags |= MSG_MORE; - - lock_sock(sk); - sg_init_table(ctx->sgl.sgl, 1); - sg_set_page(ctx->sgl.sgl, page, size, offset); - - if (!(flags & MSG_MORE)) { - err = hash_alloc_result(sk, ctx); - if (err) - goto unlock; - } else if (!ctx->more) - hash_free_result(sk, ctx); - - ahash_request_set_crypt(&ctx->req, ctx->sgl.sgl, ctx->result, size); - - if (!(flags & MSG_MORE)) { - if (ctx->more) - err = crypto_ahash_finup(&ctx->req); - else - err = crypto_ahash_digest(&ctx->req); - } else { - if (!ctx->more) { - err = crypto_ahash_init(&ctx->req); - err = crypto_wait_req(err, &ctx->wait); - if (err) - goto unlock; - } - - err = crypto_ahash_update(&ctx->req); - } - - err = crypto_wait_req(err, &ctx->wait); - if (err) - goto unlock; - - ctx->more = flags & MSG_MORE; - -unlock: - release_sock(sk); - - return err ?: size; -} - static int hash_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int flags) { @@ -318,7 +266,6 @@ static struct proto_ops algif_hash_ops = { .release = af_alg_release, .sendmsg = hash_sendmsg, - .sendpage = hash_sendpage, .recvmsg = hash_recvmsg, .accept = hash_accept, }; @@ -370,18 +317,6 @@ static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg, return hash_sendmsg(sock, msg, size); } -static ssize_t hash_sendpage_nokey(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - int err; - - err = hash_check_key(sock); - if (err) - return err; - - return hash_sendpage(sock, page, offset, size, flags); -} - static int hash_recvmsg_nokey(struct socket *sock, struct msghdr *msg, size_t ignored, int flags) { @@ -420,7 +355,6 @@ static struct proto_ops algif_hash_ops_nokey = { .release = af_alg_release, .sendmsg = hash_sendmsg_nokey, - .sendpage = hash_sendpage_nokey, .recvmsg = hash_recvmsg_nokey, .accept = hash_accept_nokey, }; From patchwork Fri Mar 31 16:09:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77885 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp694101vqo; Fri, 31 Mar 2023 09:39:32 -0700 (PDT) X-Google-Smtp-Source: AKy350ZRC4lShqf/p+bTz80ttPRQnVDSmU4GqyzeDfmPhS4CghBmdW0aYVB+no8XZg3Yk9vJ9Dqx X-Received: by 2002:a05:6402:45:b0:4ea:a9b0:a511 with SMTP id f5-20020a056402004500b004eaa9b0a511mr26224669edu.37.1680280772349; Fri, 31 Mar 2023 09:39:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280772; cv=none; d=google.com; s=arc-20160816; b=Jsm90Ly7qUAm+hon0YdNhVfrIS4bLNxIedjfwEY9H0T/R1m0JVMra05h7EpWB0puRx MqATPUlKy865YKVb3HVXaQnAvZWnrX9M8IlQQf8XvKpj7GGJ/sohMJTG+kTHEEEnXRWM RYwhkukirB1jwid4STIWn9htVFPmVQJdmABiNH5uDk0SXrlovBKLST+36uxx9TNKxa9S AYIEJAO0qN8WOx2J8kQ2L3kbD53xqqyUVYcivbIr33H7AZI387BD+u0yE4MZ+pdkHkfn YpseBRDTUz0R3TBtr6lADbsu4AKRU9ToWdZMVdOcojhLBRkl2wX7V54wulA0L6VqebUh jcrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=14pAi6JHwU3C53h/ZQrPmmNB396J3kzB8Cmz3sg/vjY=; b=IEAlzPdQPXUk+oMfEhN8cRt0CMFdznfGHKvCkgzO6rNaztHPXBL4KBSDucJ8mR91X7 bzXL4HdrKdU9JOPJ2jsabWWdqG4KYvK7QXfOr0zDskdW6hechRXOJKXhxeN5k0LRWg4B xXgv98nqtlNcu6ZRub7OShyrt6EsbpyZal39BWx7ieiSeR3oJ3C0WAlPVNZmv454974O xDdrdjXbSDiXVpL0nHMXLMElNqbOXYjiE5OTYGq45XpaLeZK+lsKmb8VmU87r94Wd4IY OyjSOByPx9TUbfsrVRf9/V9A8dA3drwQjR/G29/tZEe6loIiy4cSTgMqWFmX4h1f9gN6 JBIQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=dAqkhGkQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q11-20020a056402040b00b004fd2b154c18si2192530edv.500.2023.03.31.09.39.08; Fri, 31 Mar 2023 09:39:32 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=dAqkhGkQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230104AbjCaQci (ORCPT + 99 others); Fri, 31 Mar 2023 12:32:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47034 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231736AbjCaQcX (ORCPT ); Fri, 31 Mar 2023 12:32:23 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E01E8C66F for ; Fri, 31 Mar 2023 09:27:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680280056; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=14pAi6JHwU3C53h/ZQrPmmNB396J3kzB8Cmz3sg/vjY=; b=dAqkhGkQIescNnFx1nV9t30S4wWMPAg5ATwRx0uTH0nK6LlKF6KlqzzdGdASCUxiezgjt2 72RhH9X7d3xs8QZFj3iDlpG1PpKkvG6xQ9wgz22FXJOGYd5ZhkwPYRZEd3c+R+0uJNk+dY wtm/vGTXxaeCCdJ0TONaeT1tSmWDdIw= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-186-Zs9E43zDOwakJiLhI1Mt6A-1; Fri, 31 Mar 2023 12:11:26 -0400 X-MC-Unique: Zs9E43zDOwakJiLhI1Mt6A-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A289F3C0F383; Fri, 31 Mar 2023 16:11:25 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2D14E2166B33; Fri, 31 Mar 2023 16:11:23 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ilya Dryomov , Xiubo Li , ceph-devel@vger.kernel.org Subject: [PATCH v3 45/55] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() Date: Fri, 31 Mar 2023 17:09:04 +0100 Message-Id: <20230331160914.1608208-46-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761902091415217738?= X-GMAIL-MSGID: =?utf-8?q?1761902091415217738?= Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when transmitting data. For the moment, this can only transmit one page at a time because of the architecture of net/ceph/, but if write_partial_message_data() can be given a bvec[] at a time by the iteration code, this would allow pages to be sent in a batch. Signed-off-by: David Howells cc: Ilya Dryomov cc: Xiubo Li cc: Jeff Layton cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: ceph-devel@vger.kernel.org cc: netdev@vger.kernel.org --- net/ceph/messenger_v2.c | 89 +++++++++-------------------------------- 1 file changed, 18 insertions(+), 71 deletions(-) diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c index 301a991dc6a6..1637a0c21126 100644 --- a/net/ceph/messenger_v2.c +++ b/net/ceph/messenger_v2.c @@ -117,91 +117,38 @@ static int ceph_tcp_recv(struct ceph_connection *con) return ret; } -static int do_sendmsg(struct socket *sock, struct iov_iter *it) -{ - struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS }; - int ret; - - msg.msg_iter = *it; - while (iov_iter_count(it)) { - ret = sock_sendmsg(sock, &msg); - if (ret <= 0) { - if (ret == -EAGAIN) - ret = 0; - return ret; - } - - iov_iter_advance(it, ret); - } - - WARN_ON(msg_data_left(&msg)); - return 1; -} - -static int do_try_sendpage(struct socket *sock, struct iov_iter *it) -{ - struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS }; - struct bio_vec bv; - int ret; - - if (WARN_ON(!iov_iter_is_bvec(it))) - return -EINVAL; - - while (iov_iter_count(it)) { - /* iov_iter_iovec() for ITER_BVEC */ - bvec_set_page(&bv, it->bvec->bv_page, - min(iov_iter_count(it), - it->bvec->bv_len - it->iov_offset), - it->bvec->bv_offset + it->iov_offset); - - /* - * sendpage cannot properly handle pages with - * page_count == 0, we need to fall back to sendmsg if - * that's the case. - * - * Same goes for slab pages: skb_can_coalesce() allows - * coalescing neighboring slab objects into a single frag - * which triggers one of hardened usercopy checks. - */ - if (sendpage_ok(bv.bv_page)) { - ret = sock->ops->sendpage(sock, bv.bv_page, - bv.bv_offset, bv.bv_len, - CEPH_MSG_FLAGS); - } else { - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, bv.bv_len); - ret = sock_sendmsg(sock, &msg); - } - if (ret <= 0) { - if (ret == -EAGAIN) - ret = 0; - return ret; - } - - iov_iter_advance(it, ret); - } - - return 1; -} - /* * Write as much as possible. The socket is expected to be corked, * so we don't bother with MSG_MORE/MSG_SENDPAGE_NOTLAST here. * * Return: - * 1 - done, nothing (else) to write + * >0 - done, nothing (else) to write * 0 - socket is full, need to wait * <0 - error */ static int ceph_tcp_send(struct ceph_connection *con) { + struct msghdr msg = { + .msg_iter = con->v2.out_iter, + .msg_flags = CEPH_MSG_FLAGS, + }; int ret; + if (WARN_ON(!iov_iter_is_bvec(&con->v2.out_iter))) + return -EINVAL; + + if (con->v2.out_iter_sendpage) + msg.msg_flags |= MSG_SPLICE_PAGES; + dout("%s con %p have %zu try_sendpage %d\n", __func__, con, iov_iter_count(&con->v2.out_iter), con->v2.out_iter_sendpage); - if (con->v2.out_iter_sendpage) - ret = do_try_sendpage(con->sock, &con->v2.out_iter); - else - ret = do_sendmsg(con->sock, &con->v2.out_iter); + + ret = sock_sendmsg(con->sock, &msg); + if (ret > 0) + iov_iter_advance(&con->v2.out_iter, ret); + else if (ret == -EAGAIN) + ret = 0; + dout("%s con %p ret %d left %zu\n", __func__, con, ret, iov_iter_count(&con->v2.out_iter)); return ret; From patchwork Fri Mar 31 16:09:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77845 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp685277vqo; Fri, 31 Mar 2023 09:25:50 -0700 (PDT) X-Google-Smtp-Source: AKy350b1uNBY2MVdMBoCoK7W44cQXBDfDCmp59HW1qBYVKhBrKsNGa6XHNJVgG2LC+FfIb6nfXIY X-Received: by 2002:aa7:dc1a:0:b0:502:7767:3c73 with SMTP id b26-20020aa7dc1a000000b0050277673c73mr3507533edu.22.1680279950142; Fri, 31 Mar 2023 09:25:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680279950; cv=none; d=google.com; s=arc-20160816; b=0hhhFYO+O0AdoqXGOOPhefGrPs2TC6PTroXrqkQDvm0gnC8IDAnVJrPe7UI3SpVyB3 xmieUwBT02idfpYkqf9t86TLZFOLSs2A9zWkxX8vmUU3FMc19JtUCKjIPLk3XncuuaOt RrA+y8L5P65Qj0OIAxr6h6IhBCuJmautGPZH+6sbG7GB+Q/0cqpP5bVvfE3spwKNKZDp RuKsn5FXeStT9YE6n3VRwzpsf3XfwkWiMG8VfKrpDGCfY4pvgOMRrbCZKV/wF9YnQgsm rg2gKwmKEKSWgA5fITUIFyi3z8Lvu92KcuIk700H+W+v+gBZMB85whCgPxMwFOso57tY rzag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=pMIF11EWm/8cej/KXQhM8DA1wUu2yCmqcuxOX1F+AiY=; b=Mctq03JNQswnBtgOUjNpA1a2Awi61yZ2uHKNLB6DRuKpOFrFbpovSCb34PFCFYRUF/ sGAKp1cr0uqQHCJeNJUyza8UuwTqWoHP8Cx/YDGT4onmJAEy1FA7XG1a23raaFteIORE JjJX0ZxQJUB3LV7L3zOgI6BzXSnWLS1k1m3FbM65Qn2wlrtOrsIr7I/+U0yHuaDL2Zcb bghK4Yeo6gx1yr2RK92IQeW73mNNQh6WUZGACJN5rD87Q/q1thx/mdHM/9edBBB2msSt UDhhMGPdQHHI2LeLyXbea2BN4YosZI9BgCSGtbtSJ/w/aHN9aat9LpRGMEKOI700dxeX TDMQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=A05aVZMF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id v18-20020aa7dbd2000000b005002e6f0f60si2058939edt.576.2023.03.31.09.25.25; Fri, 31 Mar 2023 09:25:50 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=A05aVZMF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233527AbjCaQTE (ORCPT + 99 others); Fri, 31 Mar 2023 12:19:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45816 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233472AbjCaQR5 (ORCPT ); Fri, 31 Mar 2023 12:17:57 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B2C0627839 for ; Fri, 31 Mar 2023 09:12:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279096; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pMIF11EWm/8cej/KXQhM8DA1wUu2yCmqcuxOX1F+AiY=; b=A05aVZMFapSVtfyfb3gVBN6B/1XRMMKQJxZBbbzX7wqA1AFG/pHttGlOlr1kleLRFJp5oo f/WW+5UOg9qbujMseFG46zWlypBWGnWQPBlfaZQ+uzFccXb5ANF8+nBSIk9bnHnpsxLKlr KMeB1QsiXsgHagLVDs21nsL47MRMRWU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-237-7dchcmswOjSwjHDFElj0Zg-1; Fri, 31 Mar 2023 12:11:30 -0400 X-MC-Unique: 7dchcmswOjSwjHDFElj0Zg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9D36988904B; Fri, 31 Mar 2023 16:11:28 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 724AC4020C82; Fri, 31 Mar 2023 16:11:26 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Santosh Shilimkar , linux-rdma@vger.kernel.org, rds-devel@oss.oracle.com Subject: [PATCH v3 46/55] rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Fri, 31 Mar 2023 17:09:05 +0100 Message-Id: <20230331160914.1608208-47-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901229115518835?= X-GMAIL-MSGID: =?utf-8?q?1761901229115518835?= When transmitting data, call down into TCP using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing several sendmsg and sendpage calls to transmit header and data pages. To make this work, the data is assembled in a bio_vec array and attached to a BVEC-type iterator. The header are copied into memory acquired from zcopy_alloc() which just breaks a page up into small pieces that can be freed with put_page(). Signed-off-by: David Howells cc: Santosh Shilimkar cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-rdma@vger.kernel.org cc: rds-devel@oss.oracle.com cc: netdev@vger.kernel.org --- net/rds/tcp_send.c | 86 +++++++++++++++++++++------------------------- 1 file changed, 40 insertions(+), 46 deletions(-) diff --git a/net/rds/tcp_send.c b/net/rds/tcp_send.c index 8c4d1d6e9249..660d9f203d99 100644 --- a/net/rds/tcp_send.c +++ b/net/rds/tcp_send.c @@ -52,29 +52,24 @@ void rds_tcp_xmit_path_complete(struct rds_conn_path *cp) tcp_sock_set_cork(tc->t_sock->sk, false); } -/* the core send_sem serializes this with other xmit and shutdown */ -static int rds_tcp_sendmsg(struct socket *sock, void *data, unsigned int len) -{ - struct kvec vec = { - .iov_base = data, - .iov_len = len, - }; - struct msghdr msg = { - .msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL, - }; - - return kernel_sendmsg(sock, &msg, &vec, 1, vec.iov_len); -} - /* the core send_sem serializes this with other xmit and shutdown */ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm, unsigned int hdr_off, unsigned int sg, unsigned int off) { struct rds_conn_path *cp = rm->m_inc.i_conn_path; struct rds_tcp_connection *tc = cp->cp_transport_data; + struct msghdr msg = { + .msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT | MSG_NOSIGNAL, + }; + struct bio_vec *bvec; + unsigned int i, size = 0, ix = 0; + bool free_hdr = false; int done = 0; - int ret = 0; - int more; + int ret = -ENOMEM; + + bvec = kmalloc_array(1 + sg, sizeof(struct bio_vec), GFP_KERNEL); + if (!bvec) + goto out; if (hdr_off == 0) { /* @@ -99,43 +94,37 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm, if (hdr_off < sizeof(struct rds_header)) { /* see rds_tcp_write_space() */ + void *p; + set_bit(SOCK_NOSPACE, &tc->t_sock->sk->sk_socket->flags); - ret = rds_tcp_sendmsg(tc->t_sock, - (void *)&rm->m_inc.i_hdr + hdr_off, - sizeof(rm->m_inc.i_hdr) - hdr_off); - if (ret < 0) - goto out; - done += ret; - if (hdr_off + done != sizeof(struct rds_header)) + ret = -ENOMEM; + p = page_frag_memdup(NULL, + (void *)&rm->m_inc.i_hdr + hdr_off, + sizeof(rm->m_inc.i_hdr) - hdr_off, + GFP_KERNEL, ULONG_MAX); + if (!p) goto out; + bvec_set_virt(&bvec[ix], p, sizeof(rm->m_inc.i_hdr) - hdr_off); + free_hdr = true; + size += bvec[ix].bv_len; + ix++; } - more = rm->data.op_nents > 1 ? (MSG_MORE | MSG_SENDPAGE_NOTLAST) : 0; - while (sg < rm->data.op_nents) { - int flags = MSG_DONTWAIT | MSG_NOSIGNAL | more; - - ret = tc->t_sock->ops->sendpage(tc->t_sock, - sg_page(&rm->data.op_sg[sg]), - rm->data.op_sg[sg].offset + off, - rm->data.op_sg[sg].length - off, - flags); - rdsdebug("tcp sendpage %p:%u:%u ret %d\n", (void *)sg_page(&rm->data.op_sg[sg]), - rm->data.op_sg[sg].offset + off, rm->data.op_sg[sg].length - off, - ret); - if (ret <= 0) - break; - - off += ret; - done += ret; - if (off == rm->data.op_sg[sg].length) { - off = 0; - sg++; - } - if (sg == rm->data.op_nents - 1) - more = 0; + for (i = sg; i < rm->data.op_nents; i++) { + bvec_set_page(&bvec[ix], + sg_page(&rm->data.op_sg[i]), + rm->data.op_sg[i].length - off, + rm->data.op_sg[i].offset + off); + off = 0; + size += bvec[ix].bv_len; + ix++; } + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, ix, size); + ret = sock_sendmsg(tc->t_sock, &msg); + rdsdebug("tcp sendmsg-splice %u,%u ret %d\n", ix, size, ret); + out: if (ret <= 0) { /* write_space will hit after EAGAIN, all else fatal */ @@ -158,6 +147,11 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm, } if (done == 0) done = ret; + if (bvec) { + if (free_hdr) + put_page(bvec[0].bv_page); + kfree(bvec); + } return done; } From patchwork Fri Mar 31 16:09:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77863 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp691329vqo; Fri, 31 Mar 2023 09:34:56 -0700 (PDT) X-Google-Smtp-Source: AKy350anxUpxEftzgmF96f73pMTr3mFvwgP2D7L31s/1VwSaRK9sHOuc92/1TegaTnJJyTonJm2j X-Received: by 2002:a17:906:dfe4:b0:8f6:5a70:cccc with SMTP id lc4-20020a170906dfe400b008f65a70ccccmr26109680ejc.66.1680280496496; Fri, 31 Mar 2023 09:34:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280496; cv=none; d=google.com; s=arc-20160816; b=UHrui/4F3+GU18ubYZVa4F2JM8apbSF2wdnGsph3LOeF4TAxYwYKM+JFjhz2r+KNJ8 DnlXZ1OkHMpwiQ2KvBvEWja2A0lswzSoS/Zn96Q3sxYH17hNhco2f4ZH26pt8aUiJvkZ X5FZm63BylJ99ZnxjivoLygoxzGtuFLBps5DbIYEoNlW81rJdaSOVY5Hd1qXnubABtBo hNWn3wMQn2z3K+ZMbNotp3+lTYVkZkcen0Jy8M12OURw4ZAlqKAsXegX6wdXRO8DwLgp BW7laKb+GRmo6cYpfdQTQi9FEf/CNfrqrtX98CfXPEmJKE9KcuvC6LtYd/iKL7LVwxy/ ZlVw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=6d/Oxl1Cdu8ybDL5tOgPD1gv3cW2fyMj4HBClKMadWo=; b=0K8+MVVo9NZw7pk+HdWpdCR3NXya2x42TufLpj0yB0k21i2CiqLAJVcrUat2Ig+xsq N5FAzLzKQ3/TEbNVzlzd4wmc/zNfQiBoXpzvpvkUJSvM8BXTZZf0MwE554p9Q6asv0M4 SNrBJUTHvTVtwr5nVwMDAQ1bZk6ftSyz1IzaHADGLlGNB/GwUFd+QwHTqs/MsRwEac0l szEyrLQfmdE2qteafTiHMZixuc2XSOo479NpTSe7kTHbGWr+GbskqjhIEWEsR2SZTeBh 7FnOpcvSX9E+5sEgb1fXzwsMFOwGvGOiddqg8ndp6nAg2d4TY9FIYwJIUSIW6IN0peXx ju2Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XQ7Xl11J; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id la2-20020a170907780200b0093f3a78f267si2220247ejc.1014.2023.03.31.09.34.32; Fri, 31 Mar 2023 09:34:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XQ7Xl11J; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233437AbjCaQTV (ORCPT + 99 others); Fri, 31 Mar 2023 12:19:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44202 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232661AbjCaQSO (ORCPT ); Fri, 31 Mar 2023 12:18:14 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 34E682953E for ; Fri, 31 Mar 2023 09:12:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279095; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6d/Oxl1Cdu8ybDL5tOgPD1gv3cW2fyMj4HBClKMadWo=; b=XQ7Xl11JcU0g0QoyHLqRjt1H/ub1Cf5WUiTkse8UlTEcpT1CQIgceHxljMidQ2N6VzMet0 NqAgMlT8Lb0P43qvPb/4zgn/K1g6nfNrvvCax/d86CHZwFB85Y3/RDp6YrhpgB4Vf3nDVV ZoG4xjpV4r0QwovLZ5WaLe8TzkVSGDg= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-611-KDykX2VqNS617UfimOZiaQ-1; Fri, 31 Mar 2023 12:11:32 -0400 X-MC-Unique: KDykX2VqNS617UfimOZiaQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6DE3E29ABA19; Fri, 31 Mar 2023 16:11:31 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3AF201121314; Fri, 31 Mar 2023 16:11:29 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christine Caulfield , David Teigland , cluster-devel@redhat.com Subject: [PATCH v3 47/55] dlm: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Fri, 31 Mar 2023 17:09:06 +0100 Message-Id: <20230331160914.1608208-48-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901801557902443?= X-GMAIL-MSGID: =?utf-8?q?1761901801557902443?= When transmitting data, call down a layer using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather using sendpage. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Christine Caulfield cc: David Teigland cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: cluster-devel@redhat.com cc: netdev@vger.kernel.org --- fs/dlm/lowcomms.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c index a9b14f81d655..9c0c691b6106 100644 --- a/fs/dlm/lowcomms.c +++ b/fs/dlm/lowcomms.c @@ -1394,8 +1394,11 @@ int dlm_lowcomms_resend_msg(struct dlm_msg *msg) /* Send a message */ static int send_to_sock(struct connection *con) { - const int msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL; struct writequeue_entry *e; + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT | MSG_NOSIGNAL, + }; int len, offset, ret; spin_lock_bh(&con->writequeue_lock); @@ -1411,8 +1414,9 @@ static int send_to_sock(struct connection *con) WARN_ON_ONCE(len == 0 && e->users == 0); spin_unlock_bh(&con->writequeue_lock); - ret = kernel_sendpage(con->sock, e->page, offset, len, - msg_flags); + bvec_set_page(&bvec, e->page, len, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len); + ret = sock_sendmsg(con->sock, &msg); trace_dlm_send(con->nodeid, ret); if (ret == -EAGAIN || ret == 0) { lock_sock(con->sock->sk); From patchwork Fri Mar 31 16:09:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77854 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp690175vqo; Fri, 31 Mar 2023 09:33:03 -0700 (PDT) X-Google-Smtp-Source: AKy350bpzj1BMN1nzNE4Ovyme1ZkfF9NXxTLcvW3zAf0r251drFp26j4releFWsXjjMnZQoif9Vg X-Received: by 2002:a17:90b:1e0e:b0:23d:3698:8ee8 with SMTP id pg14-20020a17090b1e0e00b0023d36988ee8mr31292742pjb.31.1680280382902; Fri, 31 Mar 2023 09:33:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280382; cv=none; d=google.com; s=arc-20160816; b=CNElkeCgsJmXzr/Ebd5lDad6AVSNrgcKl+lpCAYiPXB34jX2rLC+lMRfngunF7rkBs TfZFvbOcJrWfsGZ7jOJtcXO8u9TjlLJkMi9hqeWg4ROH9fxWJ1LZjPlgFZ01QalCT1CB FKoCensH326P+lxZqUyfiHlmhGQhCiW5IHOLkNBM+MSfDiQyMBH5DmthVqo+lLIxptDM 1mX9fvhLF0rF/5ClN0cz2sR/O7wxBb9yMsqYmns+v9hSpwx4LHmAA/+mntBJw8QaLM09 qFTUDAxo7QAoX5seYOfXWfCiMGI+4aiTHPwVgC9PehAK0a4g06iek50wrlabTdH967Oz JBMw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=v5XUK+PQOR7d9klL6/6O6KO8pAUYr8F8Qal1kb1WUAE=; b=JrenPVWKm6gc/kNDSLbxl7naBwgRAt1SyHM/k4N9/hjY3fc/2huQS28PC4Bl7Cc2Rf FwcnknJXuIlboFJ9WxhglZTNxKUbaIUjNQOQxnld+aAQe3goyjeN6IWcivQReYuphkc/ hIwhjCcOWngl8CG12i8O2Rk5yztIRsBbcmhUfw7b+j7fZZ0aV3ezoR+3IccdTPs9bTUe GhqTWe0pMsfmP7mQhCAhskYWzsZxUkKbG4SoxqkW2UnvgEh2v9YvOxOh7/9fF8QqCoLK Nz1F4SiXULbvB/ruRqYquW3YLjExi2u0AI9FU9dV/tYMp8qJs63LRIj3rmHkD8wqUBaN 2puQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=IowyM2Wr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z19-20020a17090ab11300b0023fcb58b0e0si2399280pjq.162.2023.03.31.09.32.50; Fri, 31 Mar 2023 09:33:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=IowyM2Wr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233481AbjCaQTG (ORCPT + 99 others); Fri, 31 Mar 2023 12:19:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45948 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233169AbjCaQSJ (ORCPT ); Fri, 31 Mar 2023 12:18:09 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A07D821A93 for ; Fri, 31 Mar 2023 09:12:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279098; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v5XUK+PQOR7d9klL6/6O6KO8pAUYr8F8Qal1kb1WUAE=; b=IowyM2WrpAh7y8MuhIweL3+bf651A3RYPxq5Pd6p00dY2hnZaHXOzMZYp55zdFXYfxq2IP nuzBvXocwMI5kPpjrcrciQB2k0Is+IgaS5DfImKckcYG6wE0eGur/7ymC3dqdGcbVf9d/L dE+aJnTvPjbob7VJPxFXCQoozItWqcg= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-256-fHdJvov2Og26gxE1ES-_Qg-1; Fri, 31 Mar 2023 12:11:35 -0400 X-MC-Unique: fHdJvov2Og26gxE1ES-_Qg-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4E74229ABA17; Fri, 31 Mar 2023 16:11:34 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 22BEA492C3E; Fri, 31 Mar 2023 16:11:32 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Trond Myklebust , Anna Schumaker , linux-nfs@vger.kernel.org Subject: [PATCH v3 48/55] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage Date: Fri, 31 Mar 2023 17:09:07 +0100 Message-Id: <20230331160914.1608208-49-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901682540989719?= X-GMAIL-MSGID: =?utf-8?q?1761901682540989719?= When transmitting data, call down into TCP using sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing sendpage calls to transmit header, data pages and trailer. Signed-off-by: David Howells Acked-by: Chuck Lever cc: Trond Myklebust cc: Anna Schumaker cc: Jeff Layton cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-nfs@vger.kernel.org cc: netdev@vger.kernel.org --- include/linux/sunrpc/svc.h | 11 +++++------ net/sunrpc/svcsock.c | 38 ++++++++++++-------------------------- 2 files changed, 17 insertions(+), 32 deletions(-) diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h index 877891536c2f..456ae554aa11 100644 --- a/include/linux/sunrpc/svc.h +++ b/include/linux/sunrpc/svc.h @@ -161,16 +161,15 @@ static inline bool svc_put_not_last(struct svc_serv *serv) extern u32 svc_max_payload(const struct svc_rqst *rqstp); /* - * RPC Requsts and replies are stored in one or more pages. + * RPC Requests and replies are stored in one or more pages. * We maintain an array of pages for each server thread. * Requests are copied into these pages as they arrive. Remaining * pages are available to write the reply into. * - * Pages are sent using ->sendpage so each server thread needs to - * allocate more to replace those used in sending. To help keep track - * of these pages we have a receive list where all pages initialy live, - * and a send list where pages are moved to when there are to be part - * of a reply. + * Pages are sent using ->sendmsg with MSG_SPLICE_PAGES so each server thread + * needs to allocate more to replace those used in sending. To help keep track + * of these pages we have a receive list where all pages initialy live, and a + * send list where pages are moved to when there are to be part of a reply. * * We use xdr_buf for holding responses as it fits well with NFS * read responses (that have a header, and some data pages, and possibly diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 03a4f5615086..3a015abac5bd 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -1063,13 +1063,14 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp) static int svc_tcp_send_kvec(struct socket *sock, const struct kvec *vec, int flags) { - return kernel_sendpage(sock, virt_to_page(vec->iov_base), - offset_in_page(vec->iov_base), - vec->iov_len, flags); + struct msghdr msg = { .msg_flags = MSG_SPLICE_PAGES | flags, }; + + iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, vec, 1, vec->iov_len); + return sock_sendmsg(sock, &msg); } /* - * kernel_sendpage() is used exclusively to reduce the number of + * MSG_SPLICE_PAGES is used exclusively to reduce the number of * copy operations in this path. Therefore the caller must ensure * that the pages backing @xdr are unchanging. * @@ -1109,28 +1110,13 @@ static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr, if (ret != head->iov_len) goto out; - if (xdr->page_len) { - unsigned int offset, len, remaining; - struct bio_vec *bvec; - - bvec = xdr->bvec + (xdr->page_base >> PAGE_SHIFT); - offset = offset_in_page(xdr->page_base); - remaining = xdr->page_len; - while (remaining > 0) { - len = min(remaining, bvec->bv_len - offset); - ret = kernel_sendpage(sock, bvec->bv_page, - bvec->bv_offset + offset, - len, 0); - if (ret < 0) - return ret; - *sentp += ret; - if (ret != len) - goto out; - remaining -= len; - offset = 0; - bvec++; - } - } + msg.msg_flags = MSG_SPLICE_PAGES; + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, xdr->bvec, + xdr_buf_pagecount(xdr), xdr->page_len); + ret = sock_sendmsg(sock, &msg); + if (ret < 0) + return ret; + *sentp += ret; if (tail->iov_len) { ret = svc_tcp_send_kvec(sock, tail, 0); From patchwork Fri Mar 31 16:09:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77862 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp690954vqo; Fri, 31 Mar 2023 09:34:15 -0700 (PDT) X-Google-Smtp-Source: AKy350Zqr3Y/Q/SDKOPI3wHEQeGvGH4gAxJ7ABQOarIeRREIxLPRg9CtkcgFuGbSoZNxQLQ2qsvA X-Received: by 2002:a17:902:f54b:b0:1a2:9288:e0b7 with SMTP id h11-20020a170902f54b00b001a29288e0b7mr7761490plf.56.1680280455210; Fri, 31 Mar 2023 09:34:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280455; cv=none; d=google.com; s=arc-20160816; b=KfjPCTbFb89sWmQUpvWHQNqA9nBH/RlzmBCyDYQZxUM4LodMxr/Z0nrimSdzTGodPM CZ1BUTLYIKaLcpS8ji8aLMBfFcJb4thwD4I6FEmiodGMSgBD0NlsOORbu2KUKnh3i8Tm MqddybE674eo0S3oyy5T8+GO11Vlwe61KwkC18PgkjeVpXggI5AXYzeTEd/KVFkbHC8h HccDykCEMpvO51ODTX1KDT939nEkwhLhWgZgWLfvJnoONkYMCYBw4+1rPfIb/SSMCBUe 05rciTQxGkUilHaXtAc3VvTnTEP7Y/LVyDclKoX7BbCXjOfFztxnq59wcmjtKObkCEMI VDUQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=XvqldHWl7xh/L1V+KGWEm9YizW/K8SDxYbPNnLnjdW8=; b=L5RAxNX7Veqh/t1hMT3iNK1x2dKFj6BSjBTlCg+gmZ9KrdANFbd3nJJHzshkShRksA cAJhkFRtl7hNaxz7+Cs0b7L2cmmz2eon5hI70ruUx7qk9RCJpiD3RQwHw4ZtPea8D9v3 s8Ht5N+uH376buV83hQQSBz4Og640ILQPWXO2oZfv9L0nj2eyc9qbbDIhQgxFG2f4Gtx lDn7VLz02l0GBntt3wDvzclW9A1jjESkV2CbFSONC4iGp8mKI62R0Zl2ySjrc2dbz0v6 J9txEmvKTtLTsYtw3rNfoAqE1eXZO+U8b0WvM7c1qPusxDU9oRlH4lNOb7wZU8gbRX+q LEhQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=WeIQf3dv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id w20-20020a1709029a9400b0019a7ef5e9acsi2340119plp.305.2023.03.31.09.34.02; Fri, 31 Mar 2023 09:34:15 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=WeIQf3dv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233574AbjCaQTn (ORCPT + 99 others); Fri, 31 Mar 2023 12:19:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45780 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233505AbjCaQSZ (ORCPT ); Fri, 31 Mar 2023 12:18:25 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E32E2D4FB0 for ; Fri, 31 Mar 2023 09:12:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279104; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XvqldHWl7xh/L1V+KGWEm9YizW/K8SDxYbPNnLnjdW8=; b=WeIQf3dvNL/Rz1+oCVb2FEVxC9zygSUxmS/OqJSnrk+Y/ZuKHo4+ll0VZoBN2TTTpTkwpG cx+UWHVAgpP8yO04hmKGk5ZGVSRwipl8K12Bc4Q8cqz1e67vPRU6FvW1yHJe4D4+u4rdRZ eVCX5isSZC+iQxWj8UWCcdCxqs5diAc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-173-1ZXUWRtzOLGY7fvr_wG5SQ-1; Fri, 31 Mar 2023 12:11:39 -0400 X-MC-Unique: 1ZXUWRtzOLGY7fvr_wG5SQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5FC60889047; Fri, 31 Mar 2023 16:11:37 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id DD7E5202701E; Fri, 31 Mar 2023 16:11:34 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Chaitanya Kulkarni , linux-nvme@lists.infradead.org Subject: [PATCH v3 49/55] nvme: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage Date: Fri, 31 Mar 2023 17:09:08 +0100 Message-Id: <20230331160914.1608208-50-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901759053553569?= X-GMAIL-MSGID: =?utf-8?q?1761901759053553569?= When transmitting data, call down into TCP using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing several sendmsg and sendpage calls to transmit header, data pages and trailer. Signed-off-by: David Howells cc: Keith Busch cc: Jens Axboe cc: Christoph Hellwig cc: Sagi Grimberg cc: Chaitanya Kulkarni cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-nvme@lists.infradead.org cc: netdev@vger.kernel.org --- drivers/nvme/host/tcp.c | 44 ++++++++++++++++++------------------ drivers/nvme/target/tcp.c | 47 +++++++++++++++++++++++++-------------- 2 files changed, 52 insertions(+), 39 deletions(-) diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index fa32969b532f..cc617692702d 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -979,25 +979,23 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req) u32 h2cdata_left = req->h2cdata_left; while (true) { + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, }; struct page *page = nvme_tcp_req_cur_page(req); size_t offset = nvme_tcp_req_cur_offset(req); size_t len = nvme_tcp_req_cur_length(req); bool last = nvme_tcp_pdu_last_send(req, len); int req_data_sent = req->data_sent; - int ret, flags = MSG_DONTWAIT; + int ret; if (last && !queue->data_digest && !nvme_tcp_queue_more(queue)) - flags |= MSG_EOR; + msg.msg_flags |= MSG_EOR; else - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; - if (sendpage_ok(page)) { - ret = kernel_sendpage(queue->sock, page, offset, len, - flags); - } else { - ret = sock_no_sendpage(queue->sock, page, offset, len, - flags); - } + bvec_set_page(&bvec, page, len, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len); + ret = sock_sendmsg(queue->sock, &msg); if (ret <= 0) return ret; @@ -1036,22 +1034,24 @@ static int nvme_tcp_try_send_cmd_pdu(struct nvme_tcp_request *req) { struct nvme_tcp_queue *queue = req->queue; struct nvme_tcp_cmd_pdu *pdu = req->pdu; + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, }; bool inline_data = nvme_tcp_has_inline_data(req); u8 hdgst = nvme_tcp_hdgst_len(queue); int len = sizeof(*pdu) + hdgst - req->offset; - int flags = MSG_DONTWAIT; int ret; if (inline_data || nvme_tcp_queue_more(queue)) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; else - flags |= MSG_EOR; + msg.msg_flags |= MSG_EOR; if (queue->hdr_digest && !req->offset) nvme_tcp_hdgst(queue->snd_hash, pdu, sizeof(*pdu)); - ret = kernel_sendpage(queue->sock, virt_to_page(pdu), - offset_in_page(pdu) + req->offset, len, flags); + bvec_set_virt(&bvec, (void *)pdu + req->offset, len); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len); + ret = sock_sendmsg(queue->sock, &msg); if (unlikely(ret <= 0)) return ret; @@ -1075,6 +1075,8 @@ static int nvme_tcp_try_send_data_pdu(struct nvme_tcp_request *req) { struct nvme_tcp_queue *queue = req->queue; struct nvme_tcp_data_pdu *pdu = req->pdu; + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_MORE, }; u8 hdgst = nvme_tcp_hdgst_len(queue); int len = sizeof(*pdu) - req->offset + hdgst; int ret; @@ -1083,13 +1085,11 @@ static int nvme_tcp_try_send_data_pdu(struct nvme_tcp_request *req) nvme_tcp_hdgst(queue->snd_hash, pdu, sizeof(*pdu)); if (!req->h2cdata_left) - ret = kernel_sendpage(queue->sock, virt_to_page(pdu), - offset_in_page(pdu) + req->offset, len, - MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST); - else - ret = sock_no_sendpage(queue->sock, virt_to_page(pdu), - offset_in_page(pdu) + req->offset, len, - MSG_DONTWAIT | MSG_MORE); + msg.msg_flags |= MSG_SPLICE_PAGES | MSG_SENDPAGE_NOTLAST; + + bvec_set_virt(&bvec, (void *)pdu + req->offset, len); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len); + ret = sock_sendmsg(queue->sock, &msg); if (unlikely(ret <= 0)) return ret; diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index d6cc557cc539..00b491abf50f 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -548,13 +548,18 @@ static void nvmet_tcp_execute_request(struct nvmet_tcp_cmd *cmd) static int nvmet_try_send_data_pdu(struct nvmet_tcp_cmd *cmd) { + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = (MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST | + MSG_SPLICE_PAGES), + }; u8 hdgst = nvmet_tcp_hdgst_len(cmd->queue); int left = sizeof(*cmd->data_pdu) - cmd->offset + hdgst; int ret; - ret = kernel_sendpage(cmd->queue->sock, virt_to_page(cmd->data_pdu), - offset_in_page(cmd->data_pdu) + cmd->offset, - left, MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST); + bvec_set_virt(&bvec, (void *)cmd->data_pdu + cmd->offset, left); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, left); + ret = sock_sendmsg(cmd->queue->sock, &msg); if (ret <= 0) return ret; @@ -575,17 +580,21 @@ static int nvmet_try_send_data(struct nvmet_tcp_cmd *cmd, bool last_in_batch) int ret; while (cmd->cur_sg) { + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, + }; struct page *page = sg_page(cmd->cur_sg); u32 left = cmd->cur_sg->length - cmd->offset; - int flags = MSG_DONTWAIT; if ((!last_in_batch && cmd->queue->send_list_len) || cmd->wbytes_done + left < cmd->req.transfer_len || queue->data_digest || !queue->nvme_sq.sqhd_disabled) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; - ret = kernel_sendpage(cmd->queue->sock, page, cmd->offset, - left, flags); + bvec_set_page(&bvec, page, left, cmd->offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, left); + ret = sock_sendmsg(cmd->queue->sock, &msg); if (ret <= 0) return ret; @@ -621,18 +630,20 @@ static int nvmet_try_send_data(struct nvmet_tcp_cmd *cmd, bool last_in_batch) static int nvmet_try_send_response(struct nvmet_tcp_cmd *cmd, bool last_in_batch) { + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, }; u8 hdgst = nvmet_tcp_hdgst_len(cmd->queue); int left = sizeof(*cmd->rsp_pdu) - cmd->offset + hdgst; - int flags = MSG_DONTWAIT; int ret; if (!last_in_batch && cmd->queue->send_list_len) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; else - flags |= MSG_EOR; + msg.msg_flags |= MSG_EOR; - ret = kernel_sendpage(cmd->queue->sock, virt_to_page(cmd->rsp_pdu), - offset_in_page(cmd->rsp_pdu) + cmd->offset, left, flags); + bvec_set_virt(&bvec, (void *)cmd->rsp_pdu + cmd->offset, left); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, left); + ret = sock_sendmsg(cmd->queue->sock, &msg); if (ret <= 0) return ret; cmd->offset += ret; @@ -649,18 +660,20 @@ static int nvmet_try_send_response(struct nvmet_tcp_cmd *cmd, static int nvmet_try_send_r2t(struct nvmet_tcp_cmd *cmd, bool last_in_batch) { + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, }; u8 hdgst = nvmet_tcp_hdgst_len(cmd->queue); int left = sizeof(*cmd->r2t_pdu) - cmd->offset + hdgst; - int flags = MSG_DONTWAIT; int ret; if (!last_in_batch && cmd->queue->send_list_len) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; else - flags |= MSG_EOR; + msg.msg_flags |= MSG_EOR; - ret = kernel_sendpage(cmd->queue->sock, virt_to_page(cmd->r2t_pdu), - offset_in_page(cmd->r2t_pdu) + cmd->offset, left, flags); + bvec_set_virt(&bvec, (void *)cmd->r2t_pdu + cmd->offset, left); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, left); + ret = sock_sendmsg(cmd->queue->sock, &msg); if (ret <= 0) return ret; cmd->offset += ret; From patchwork Fri Mar 31 16:09:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77864 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp691349vqo; Fri, 31 Mar 2023 09:34:59 -0700 (PDT) X-Google-Smtp-Source: AKy350Z2wotYYlOEzXOt8vx4JYDqTJgT9caSagg2EcO3grvGrA22FgnuLyDEVntZBk/FcAixSKAa X-Received: by 2002:a17:906:a3ca:b0:926:e917:133c with SMTP id ca10-20020a170906a3ca00b00926e917133cmr26773329ejb.47.1680280499464; Fri, 31 Mar 2023 09:34:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280499; cv=none; d=google.com; s=arc-20160816; b=LG1kNv7AYW3Q57u8CVWz7oCcaSZVJkhYjxeR32UovHBoPn9AdR14eKOsgi3xm//60m H924JBo1fcUib3N472RUjVMxCzwT8V9zOKboUz69N706xcnI36AFll9eK35GINr3pUZs TR/aCz5xWSRbpXUyU2Vf4GzN9z1bo3xi1c0w92iFQn4AZjkPc+YxqrQAV2caCjz/Okoa 4D4IXPP0UF+WRrJwRStya7sYp4kdrgapnGTedE2lq0BdYS6Jxp/yB5BXbe8En2xb9Cr/ hOE6+qkcPbeMdFZinmZU9Qd/gNluKgWHOuCId9/RCLS+9dTKtwXWEczYJyPTq5n/qQKd Zrlw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=V7IwXZpp4caT5icSmRzfTKcX2kIg430bJYtEqUuaqXc=; b=I8AhtmmjlEHSkfvPrqWZWroMLpwPaVGog51rZwIsX5yL+/fu3B5wSjSaO53Rn/68al UGw9J1ydNrO0JvXAy6w3aFLm8hCbQUAZTbK9oVqJe+O3W50JyK1rXVNRp7si3tyjmUH+ wGNQEX2FS4ooB5eFJNNQ2OULU0Jho6yfh0JOPsk93vFjCVWkXuyYvz8diJRdOBzViEGB zw7+o9ylUdVqV+d2sBAu2RDEPjxNarqRkadxJtTMIl1LkprL2Ti+saawQaxzTGueK/P1 k95DwQkAqYSEU/pxkbm2DxWjpKbogT4I3OdrVDajFrGK/4i32WN9mXKzD82/qNPlHinG LOCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TGf1+sUT; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i17-20020a170906a29100b00947734b9d71si2033709ejz.685.2023.03.31.09.34.35; Fri, 31 Mar 2023 09:34:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TGf1+sUT; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233557AbjCaQTf (ORCPT + 99 others); Fri, 31 Mar 2023 12:19:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44260 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233495AbjCaQSW (ORCPT ); Fri, 31 Mar 2023 12:18:22 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9BD98D4FA6 for ; Fri, 31 Mar 2023 09:12:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279103; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=V7IwXZpp4caT5icSmRzfTKcX2kIg430bJYtEqUuaqXc=; b=TGf1+sUTpzZ1q2YPwYW1QjbEoAU6RtF3MnAwdeFcWhNliuv8nBjwMVjnJenSRKqJpDxKfl 9gyUPyq1KmCKvfScplb5l4pgUt/0gEAHEuXJ5axZVgHhClzomzK4kKplezQvl2Ed8vQYCp 5d1fDhT87kGmopQ+ryHLE5nf3bfqHDk= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-499-KUirUmF0P5GTXUc33yxBYQ-1; Fri, 31 Mar 2023 12:11:41 -0400 X-MC-Unique: KUirUmF0P5GTXUc33yxBYQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2135D10119EB; Fri, 31 Mar 2023 16:11:40 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 120582166B33; Fri, 31 Mar 2023 16:11:37 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Tom Herbert , Tom Herbert Subject: [PATCH v3 50/55] kcm: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage Date: Fri, 31 Mar 2023 17:09:09 +0100 Message-Id: <20230331160914.1608208-51-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901804785689448?= X-GMAIL-MSGID: =?utf-8?q?1761901804785689448?= When transmitting data, call down into TCP using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing several sendmsg and sendpage calls to transmit header, data pages and trailer. Signed-off-by: David Howells cc: Tom Herbert cc: Tom Herbert cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/kcm/kcmsock.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index d77d28fbf389..9c9d379aafb1 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -641,6 +641,10 @@ static int kcm_write_msgs(struct kcm_sock *kcm) for (fragidx = 0; fragidx < skb_shinfo(skb)->nr_frags; fragidx++) { + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, + }; skb_frag_t *frag; frag_offset = 0; @@ -651,11 +655,12 @@ static int kcm_write_msgs(struct kcm_sock *kcm) goto out; } - ret = kernel_sendpage(psock->sk->sk_socket, - skb_frag_page(frag), - skb_frag_off(frag) + frag_offset, - skb_frag_size(frag) - frag_offset, - MSG_DONTWAIT); + bvec_set_page(&bvec, + skb_frag_page(frag), + skb_frag_size(frag) - frag_offset, + skb_frag_off(frag) + frag_offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bvec.bv_len); + ret = sock_sendmsg(psock->sk->sk_socket, &msg); if (ret <= 0) { if (ret == -EAGAIN) { /* Save state to try again when there's From patchwork Fri Mar 31 16:09:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77894 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp698440vqo; Fri, 31 Mar 2023 09:46:55 -0700 (PDT) X-Google-Smtp-Source: AKy350Yujr5y7ab4nDqNINAF4uRaIjDDcf1ZHxq2xuKVLzVYZw8kYiP0grU44vSsDWEeeB/KSdcs X-Received: by 2002:aa7:cacf:0:b0:502:32ba:37a9 with SMTP id l15-20020aa7cacf000000b0050232ba37a9mr21979906edt.0.1680281214966; Fri, 31 Mar 2023 09:46:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680281214; cv=none; d=google.com; s=arc-20160816; b=EcrPb/bNwvYaKNSduVr0oLjplB+3AKkhf+mehbksZSbg0WyuxOTyTxsWS7OgWaMtuC uOtpOaz5XMoAIG5LN5PZAKgV+/Lndz7eutyTToL8RrQ6zlPEwiHr6U6bzJQN658fhjvB eeYR3ZtnXcRuyFy+q3nkDV8GNnMklbtt9N0gTDAKqTItRTLYwKrS7Jv5d6dQcwoDPoe8 IqA+KqUVjIT5j5G8hPnZnZtDPBt/fFpzII12t9Rps1JYR/0ojUkuLTzS1XbhedZjbHJY jhmrNwyamZFauOWl5npDp/ymZKNI3NZAvaD9iGF4ZsGID/iMlTrw1iAPgM54iz1zwoIC PlIg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=7fCBjYGLxHqikc/tNao5Lfq+HewWC5ZgTjyFTAyeW0Q=; b=eQ01W9ibKNWLVHMJKaNbePu0JWRGz2ixIX4tVZUfnlRqN+gIvjaReA5ZBIzoZ/NlcE J4hd8YEvBybaBS5c/FrOKFSXXxP+JPI9sR04jT1XFklrslBABe7mrTZDL1sD09DhTXG+ GvPKt/OPzbYSRhnQXa0UKVxKs8eWbXA88REEBy7A469e+ETZ9EG0v4rNQhq/EaSxLleI rpvmoqStYZ92Kcs4sY7WUwYh9xT8D7os15wkGUDPaeU043MUs0LTVxiz6MUZxq1b8c8b UJwjpUKaoMLDl35PmX+rZ5CBGM3c1mUAOt7ySQ0+ksxfA/H45cxFsi3LN0K08XxXtH4Q Ewhw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=PEtnKC7P; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r16-20020aa7cb90000000b004fc66bd07dbsi2349963edt.369.2023.03.31.09.46.31; Fri, 31 Mar 2023 09:46:54 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=PEtnKC7P; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232628AbjCaQms (ORCPT + 99 others); Fri, 31 Mar 2023 12:42:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60010 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231818AbjCaQmL (ORCPT ); Fri, 31 Mar 2023 12:42:11 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 30FE2236BF for ; Fri, 31 Mar 2023 09:37:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680280629; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7fCBjYGLxHqikc/tNao5Lfq+HewWC5ZgTjyFTAyeW0Q=; b=PEtnKC7P0Cm9lXn3Z3f+5Y5kteGRZq3QsyuzyDCmDsjJXjKMl7/rwtfyOww994hRZnWo8k x5qsZa9sEK1y7t4nQEAT5Xj0XXM1R86dl/J8Gi5BfvNxz03PPak/Te+KVTCu9+HK8tQWaa N2uLlr2GUhmmH2PG57tvMGg/PsKQR8c= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-47-U_ILvhbNNGasHHvG4A8zOg-1; Fri, 31 Mar 2023 12:11:44 -0400 X-MC-Unique: U_ILvhbNNGasHHvG4A8zOg-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3770D3C0F38A; Fri, 31 Mar 2023 16:11:43 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id AF40D140E94F; Fri, 31 Mar 2023 16:11:40 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Karsten Graul , Wenjia Zhang , Jan Karcher , linux-s390@vger.kernel.org Subject: [PATCH v3 51/55] smc: Drop smc_sendpage() in favour of smc_sendmsg() + MSG_SPLICE_PAGES Date: Fri, 31 Mar 2023 17:09:10 +0100 Message-Id: <20230331160914.1608208-52-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761902554897972090?= X-GMAIL-MSGID: =?utf-8?q?1761902554897972090?= Drop the smc_sendpage() code as smc_sendmsg() just passes the call down to the underlying TCP socket and smc_tx_sendpage() is just a wrapper around its sendmsg implementation. Signed-off-by: David Howells cc: Karsten Graul cc: Wenjia Zhang cc: Jan Karcher cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-s390@vger.kernel.org cc: netdev@vger.kernel.org --- net/smc/af_smc.c | 29 ----------------------------- net/smc/smc_stats.c | 2 +- net/smc/smc_stats.h | 1 - net/smc/smc_tx.c | 16 ---------------- net/smc/smc_tx.h | 2 -- 5 files changed, 1 insertion(+), 49 deletions(-) diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index a4cccdfdc00a..d4113c8a7cda 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -3125,34 +3125,6 @@ static int smc_ioctl(struct socket *sock, unsigned int cmd, return put_user(answ, (int __user *)arg); } -static ssize_t smc_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - struct sock *sk = sock->sk; - struct smc_sock *smc; - int rc = -EPIPE; - - smc = smc_sk(sk); - lock_sock(sk); - if (sk->sk_state != SMC_ACTIVE) { - release_sock(sk); - goto out; - } - release_sock(sk); - if (smc->use_fallback) { - rc = kernel_sendpage(smc->clcsock, page, offset, - size, flags); - } else { - lock_sock(sk); - rc = smc_tx_sendpage(smc, page, offset, size, flags); - release_sock(sk); - SMC_STAT_INC(smc, sendpage_cnt); - } - -out: - return rc; -} - /* Map the affected portions of the rmbe into an spd, note the number of bytes * to splice in conn->splice_pending, and press 'go'. Delays consumer cursor * updates till whenever a respective page has been fully processed. @@ -3224,7 +3196,6 @@ static const struct proto_ops smc_sock_ops = { .sendmsg = smc_sendmsg, .recvmsg = smc_recvmsg, .mmap = sock_no_mmap, - .sendpage = smc_sendpage, .splice_read = smc_splice_read, }; diff --git a/net/smc/smc_stats.c b/net/smc/smc_stats.c index e80e34f7ac15..ca14c0f3a07d 100644 --- a/net/smc/smc_stats.c +++ b/net/smc/smc_stats.c @@ -227,7 +227,7 @@ static int smc_nl_fill_stats_tech_data(struct sk_buff *skb, SMC_NLA_STATS_PAD)) goto errattr; if (nla_put_u64_64bit(skb, SMC_NLA_STATS_T_SENDPAGE_CNT, - smc_tech->sendpage_cnt, + 0, SMC_NLA_STATS_PAD)) goto errattr; if (nla_put_u64_64bit(skb, SMC_NLA_STATS_T_CORK_CNT, diff --git a/net/smc/smc_stats.h b/net/smc/smc_stats.h index 84b7ecd8c05c..b60fe1eb37ab 100644 --- a/net/smc/smc_stats.h +++ b/net/smc/smc_stats.h @@ -71,7 +71,6 @@ struct smc_stats_tech { u64 clnt_v2_succ_cnt; u64 srv_v1_succ_cnt; u64 srv_v2_succ_cnt; - u64 sendpage_cnt; u64 urg_data_cnt; u64 splice_cnt; u64 cork_cnt; diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c index f4b6a71ac488..d31ce8209fa2 100644 --- a/net/smc/smc_tx.c +++ b/net/smc/smc_tx.c @@ -298,22 +298,6 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len) return rc; } -int smc_tx_sendpage(struct smc_sock *smc, struct page *page, int offset, - size_t size, int flags) -{ - struct msghdr msg = {.msg_flags = flags}; - char *kaddr = kmap(page); - struct kvec iov; - int rc; - - iov.iov_base = kaddr + offset; - iov.iov_len = size; - iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, &iov, 1, size); - rc = smc_tx_sendmsg(smc, &msg, size); - kunmap(page); - return rc; -} - /***************************** sndbuf consumer *******************************/ /* sndbuf consumer: actual data transfer of one target chunk with ISM write */ diff --git a/net/smc/smc_tx.h b/net/smc/smc_tx.h index 34b578498b1f..a59f370b8b43 100644 --- a/net/smc/smc_tx.h +++ b/net/smc/smc_tx.h @@ -31,8 +31,6 @@ void smc_tx_pending(struct smc_connection *conn); void smc_tx_work(struct work_struct *work); void smc_tx_init(struct smc_sock *smc); int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len); -int smc_tx_sendpage(struct smc_sock *smc, struct page *page, int offset, - size_t size, int flags); int smc_tx_sndbuf_nonempty(struct smc_connection *conn); void smc_tx_sndbuf_nonfull(struct smc_sock *smc); void smc_tx_consumer_update(struct smc_connection *conn, bool force); From patchwork Fri Mar 31 16:09:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77880 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp693107vqo; Fri, 31 Mar 2023 09:37:46 -0700 (PDT) X-Google-Smtp-Source: AKy350b5C5+g4YJYaxMi4zu5aHhvYarp3ANapZCtkhRunsmNwxy48Y647dfVi1peOsr1wFMkS5wz X-Received: by 2002:a17:906:b159:b0:93d:b767:9fea with SMTP id bt25-20020a170906b15900b0093db7679feamr26296365ejb.31.1680280666398; Fri, 31 Mar 2023 09:37:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280666; cv=none; d=google.com; s=arc-20160816; b=cEDxGgAlfSmU0k4qVgv523Qvbs/BdirPwmMVrjWeuP9q7I9NuIYk7i+ctw6PUOSNGZ 9d5R5ehIRsqq8okF/8lzjDos0Db1qjyyCvrJTd98nKpXX/tjwHtyJAxO1G9B5RtJQdet mRpirs78GLoFUoU0cV6QYZuCzCWyCGhEl/JRCLOlSeg2Rm+Cd1E9hwNOK3dTLQqWbeKl WLhKT5IFHokNOkSsyhtcqroXh54KthdyBBb2bZeebVcsJycoqwLwX4L5rgVQl60nMdRq abh6AgdrB9sXk1gL9O+OfFwnpy5xgNbDQOy7JMZQoT1+T2pvoRTsRpbWT6VQaGQVviuc qodA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=UieBw+fd7YOFzWwOOxTgfiaOtpF+6tMbkd08W9f/ez8=; b=igktXuKvTOCwNJ5nxes68IomGco16gni6OVF3PkLnco8Lw1hIJPmcxKCcmHxYfpkes pN/4oEnJhvW5gMnpFCT/8eRhpj1apgw/vX5d7D+ipltFgir9P/PTw+dNieivnUJRVkUi 72Oa5gImzXQoCZeWlimZh/eQUoSkifFlHvbw0f1L/aEYlzrvpj065AvHTLrjb3p1z+wt cmzsuiflisYvlkBw6S9YBxOJMfcDGDx6fKHpSh7vLrSIvy7duoX8Ov96m3wKlOAv0Uzm L4EbXul6xWsC9mNyNUUXqaGMjdb5GYdbZ6Kd/Jn/QvLVBTH55sMThx4wzoHVoUnNKSEV EdSQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=RGUdxISA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id de25-20020a1709069bd900b0092222faa1c0si1971461ejc.65.2023.03.31.09.37.22; Fri, 31 Mar 2023 09:37:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=RGUdxISA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233565AbjCaQTi (ORCPT + 99 others); Fri, 31 Mar 2023 12:19:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45910 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233454AbjCaQS0 (ORCPT ); Fri, 31 Mar 2023 12:18:26 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D44E922236 for ; Fri, 31 Mar 2023 09:12:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279113; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UieBw+fd7YOFzWwOOxTgfiaOtpF+6tMbkd08W9f/ez8=; b=RGUdxISAHnDhbJEDl8uYi4bp9L1CA9BYKby6epc6Byd3Ay5PpFsMBw6BR9dv6kYCn5CpOP 9w11FH6GwEspX5NLvkCvnZ8Ke6ZqsCLLtmPlaErdvfxbCWC3M6B2moL4HTxiPp1xtTH53M 2/SAv7zukZPWNvOWUtBbZf6VvfNd6vc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-383-U63PXVGmN_KbEEPQKdH0Bg-1; Fri, 31 Mar 2023 12:11:48 -0400 X-MC-Unique: U63PXVGmN_KbEEPQKdH0Bg-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3E02A185A7A4; Fri, 31 Mar 2023 16:11:46 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id DD6B914171B6; Fri, 31 Mar 2023 16:11:43 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Mark Fasheh , Joel Becker , Joseph Qi , ocfs2-devel@oss.oracle.com Subject: [PATCH v3 52/55] ocfs2: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() Date: Fri, 31 Mar 2023 17:09:11 +0100 Message-Id: <20230331160914.1608208-53-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761901979931566702?= X-GMAIL-MSGID: =?utf-8?q?1761901979931566702?= Fix ocfs2 to use the page fragment allocator rather than kzalloc in order to allocate the buffers for the handshake message and keepalive request and reply messages. Slab pages should not be given to sendpage, but fragments can be. Switch from using sendpage() to using sendmsg() + MSG_SPLICE_PAGES so that sendpage can be phased out. Signed-off-by: David Howells cc: Mark Fasheh cc: Joel Becker cc: Joseph Qi cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: ocfs2-devel@oss.oracle.com cc: netdev@vger.kernel.org --- fs/ocfs2/cluster/tcp.c | 107 ++++++++++++++++++++++------------------- 1 file changed, 58 insertions(+), 49 deletions(-) diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c index aecbd712a00c..e568ad2f34bf 100644 --- a/fs/ocfs2/cluster/tcp.c +++ b/fs/ocfs2/cluster/tcp.c @@ -110,9 +110,6 @@ static struct work_struct o2net_listen_work; static struct o2hb_callback_func o2net_hb_up, o2net_hb_down; #define O2NET_HB_PRI 0x1 -static struct o2net_handshake *o2net_hand; -static struct o2net_msg *o2net_keep_req, *o2net_keep_resp; - static int o2net_sys_err_translations[O2NET_ERR_MAX] = {[O2NET_ERR_NONE] = 0, [O2NET_ERR_NO_HNDLR] = -ENOPROTOOPT, @@ -930,19 +927,22 @@ static int o2net_send_tcp_msg(struct socket *sock, struct kvec *vec, } static void o2net_sendpage(struct o2net_sock_container *sc, - void *kmalloced_virt, - size_t size) + void *virt, size_t size) { struct o2net_node *nn = o2net_nn_from_num(sc->sc_node->nd_num); + struct msghdr msg = {}; + struct bio_vec bv; ssize_t ret; + bvec_set_virt(&bv, virt, size); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, size); + while (1) { + msg.msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES; mutex_lock(&sc->sc_send_lock); - ret = sc->sc_sock->ops->sendpage(sc->sc_sock, - virt_to_page(kmalloced_virt), - offset_in_page(kmalloced_virt), - size, MSG_DONTWAIT); + ret = sock_sendmsg(sc->sc_sock, &msg); mutex_unlock(&sc->sc_send_lock); + if (ret == size) break; if (ret == (ssize_t)-EAGAIN) { @@ -1168,6 +1168,7 @@ static int o2net_process_message(struct o2net_sock_container *sc, struct o2net_msg *hdr) { struct o2net_node *nn = o2net_nn_from_num(sc->sc_node->nd_num); + struct o2net_msg *keep_resp; int ret = 0, handler_status; enum o2net_system_error syserr; struct o2net_msg_handler *nmh = NULL; @@ -1186,8 +1187,16 @@ static int o2net_process_message(struct o2net_sock_container *sc, be32_to_cpu(hdr->status)); goto out; case O2NET_MSG_KEEP_REQ_MAGIC: - o2net_sendpage(sc, o2net_keep_resp, - sizeof(*o2net_keep_resp)); + keep_resp = page_frag_alloc(NULL, sizeof(*keep_resp), + GFP_KERNEL); + if (!keep_resp) { + ret = -ENOMEM; + goto out; + } + memset(keep_resp, 0, sizeof(*keep_resp)); + keep_resp->magic = cpu_to_be16(O2NET_MSG_KEEP_RESP_MAGIC); + o2net_sendpage(sc, keep_resp, sizeof(*keep_resp)); + folio_put(virt_to_folio(keep_resp)); goto out; case O2NET_MSG_KEEP_RESP_MAGIC: goto out; @@ -1439,15 +1448,22 @@ static void o2net_rx_until_empty(struct work_struct *work) sc_put(sc); } -static void o2net_initialize_handshake(void) +static struct o2net_handshake *o2net_initialize_handshake(void) { - o2net_hand->o2hb_heartbeat_timeout_ms = cpu_to_be32( - O2HB_MAX_WRITE_TIMEOUT_MS); - o2net_hand->o2net_idle_timeout_ms = cpu_to_be32(o2net_idle_timeout()); - o2net_hand->o2net_keepalive_delay_ms = cpu_to_be32( - o2net_keepalive_delay()); - o2net_hand->o2net_reconnect_delay_ms = cpu_to_be32( - o2net_reconnect_delay()); + struct o2net_handshake *hand; + + hand = page_frag_alloc(NULL, sizeof(*hand), GFP_KERNEL); + if (!hand) + return NULL; + + memset(hand, 0, sizeof(*hand)); + hand->protocol_version = cpu_to_be64(O2NET_PROTOCOL_VERSION); + hand->connector_id = cpu_to_be64(1); + hand->o2hb_heartbeat_timeout_ms = cpu_to_be32(O2HB_MAX_WRITE_TIMEOUT_MS); + hand->o2net_idle_timeout_ms = cpu_to_be32(o2net_idle_timeout()); + hand->o2net_keepalive_delay_ms = cpu_to_be32(o2net_keepalive_delay()); + hand->o2net_reconnect_delay_ms = cpu_to_be32(o2net_reconnect_delay()); + return hand; } /* ------------------------------------------------------------ */ @@ -1456,16 +1472,22 @@ static void o2net_initialize_handshake(void) * rx path will see the response and mark the sc valid */ static void o2net_sc_connect_completed(struct work_struct *work) { + struct o2net_handshake *hand; struct o2net_sock_container *sc = container_of(work, struct o2net_sock_container, sc_connect_work); + hand = o2net_initialize_handshake(); + if (!hand) + goto out; + mlog(ML_MSG, "sc sending handshake with ver %llu id %llx\n", (unsigned long long)O2NET_PROTOCOL_VERSION, - (unsigned long long)be64_to_cpu(o2net_hand->connector_id)); + (unsigned long long)be64_to_cpu(hand->connector_id)); - o2net_initialize_handshake(); - o2net_sendpage(sc, o2net_hand, sizeof(*o2net_hand)); + o2net_sendpage(sc, hand, sizeof(*hand)); + folio_put(virt_to_folio(hand)); +out: sc_put(sc); } @@ -1475,8 +1497,15 @@ static void o2net_sc_send_keep_req(struct work_struct *work) struct o2net_sock_container *sc = container_of(work, struct o2net_sock_container, sc_keepalive_work.work); + struct o2net_msg *keep_req; - o2net_sendpage(sc, o2net_keep_req, sizeof(*o2net_keep_req)); + keep_req = page_frag_alloc(NULL, sizeof(*keep_req), GFP_KERNEL); + if (keep_req) { + memset(keep_req, 0, sizeof(*keep_req)); + keep_req->magic = cpu_to_be16(O2NET_MSG_KEEP_REQ_MAGIC); + o2net_sendpage(sc, keep_req, sizeof(*keep_req)); + folio_put(virt_to_folio(keep_req)); + } sc_put(sc); } @@ -1780,6 +1809,7 @@ static int o2net_accept_one(struct socket *sock, int *more) struct socket *new_sock = NULL; struct o2nm_node *node = NULL; struct o2nm_node *local_node = NULL; + struct o2net_handshake *hand; struct o2net_sock_container *sc = NULL; struct o2net_node *nn; unsigned int nofs_flag; @@ -1882,8 +1912,11 @@ static int o2net_accept_one(struct socket *sock, int *more) o2net_register_callbacks(sc->sc_sock->sk, sc); o2net_sc_queue_work(sc, &sc->sc_rx_work); - o2net_initialize_handshake(); - o2net_sendpage(sc, o2net_hand, sizeof(*o2net_hand)); + hand = o2net_initialize_handshake(); + if (hand) { + o2net_sendpage(sc, hand, sizeof(*hand)); + folio_put(virt_to_folio(hand)); + } out: if (new_sock) @@ -2090,21 +2123,8 @@ int o2net_init(void) unsigned long i; o2quo_init(); - o2net_debugfs_init(); - o2net_hand = kzalloc(sizeof(struct o2net_handshake), GFP_KERNEL); - o2net_keep_req = kzalloc(sizeof(struct o2net_msg), GFP_KERNEL); - o2net_keep_resp = kzalloc(sizeof(struct o2net_msg), GFP_KERNEL); - if (!o2net_hand || !o2net_keep_req || !o2net_keep_resp) - goto out; - - o2net_hand->protocol_version = cpu_to_be64(O2NET_PROTOCOL_VERSION); - o2net_hand->connector_id = cpu_to_be64(1); - - o2net_keep_req->magic = cpu_to_be16(O2NET_MSG_KEEP_REQ_MAGIC); - o2net_keep_resp->magic = cpu_to_be16(O2NET_MSG_KEEP_RESP_MAGIC); - for (i = 0; i < ARRAY_SIZE(o2net_nodes); i++) { struct o2net_node *nn = o2net_nn_from_num(i); @@ -2122,21 +2142,10 @@ int o2net_init(void) } return 0; - -out: - kfree(o2net_hand); - kfree(o2net_keep_req); - kfree(o2net_keep_resp); - o2net_debugfs_exit(); - o2quo_exit(); - return -ENOMEM; } void o2net_exit(void) { o2quo_exit(); - kfree(o2net_hand); - kfree(o2net_keep_req); - kfree(o2net_keep_resp); o2net_debugfs_exit(); } From patchwork Fri Mar 31 16:09:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77883 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp693813vqo; Fri, 31 Mar 2023 09:39:00 -0700 (PDT) X-Google-Smtp-Source: AKy350YNnn8x95sN9IYmyFlZT20HB6MgQuofIBGPukK+VO3QzalQM9zauTi9DGmsuHEKUPuUAi4i X-Received: by 2002:a17:906:19c3:b0:93e:3194:99cd with SMTP id h3-20020a17090619c300b0093e319499cdmr27363196ejd.12.1680280739854; Fri, 31 Mar 2023 09:38:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680280739; cv=none; d=google.com; s=arc-20160816; b=oNMRUwYKE1PwEQSepvqkk0MrEOLCzQkf0M2LFM6ctbshibvr7bzPdADk6LdFdG8Xft NiPAe+H403XMIgG41UUSj6hFw64FOcT5DCGUspKu4AVo0iVjVNAzx3QmV+j0yFGzE58P cl057Le0ihKe3B8OOSfvDiJgUBz9OuhHFdorbsuumTv31tgZO1AePkOg6OzECT7nG8yY s/nK3LcwAXsdK1wu18dCgSHcS7AzVfBXXXw2uBfKPTpNBr3etXQ1RAx8TyMDki5r5j1J wKNO8KkXvvrxDnaRgmkED6FxXW3WdKq6mWbZyJGEHlZ7h0gU/LA0qm1QNooAR6rrO/72 OnYA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ZJvfcCj7beM2l3aL2q8ZqMjrzC72DifYH9bqCAjpsUU=; b=LR+JmvqP5fYpyvTzw/zrSzMR1LSPGv5N/RWSOi3+Q3UM3HbYTGOxP4Xtx5FsgM/ba5 Sx/zIbfwMp3eiskO3hlG4AX9/C7+wJSFZpx2AsuroGVxg5XuDtAMLBnn0aAY2ke9DhGs QDwy00I6UqjuH2oYZOPemIVGe/oV8tOi/OIS6n8Jg81V2qkRQR85mP94xwHJfN2vXCFJ w8iZnseEVzuMmxYx7v5W3sUp7ofJEMkvGptIlJ4E5rsU0DhzQ1V/lRLBZGh0KkqPtshB N/2Bo89ycoPLwMe9dECCadaKiA3u32KdoKb+kFDiLrMKSGGUPKhRbX16BTyCzJ2hP6yG HjJg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=RpwMYxKf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y3-20020a170906448300b008d28622c8d9si2192345ejo.728.2023.03.31.09.38.36; Fri, 31 Mar 2023 09:38:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=RpwMYxKf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232006AbjCaQbL (ORCPT + 99 others); Fri, 31 Mar 2023 12:31:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39436 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233131AbjCaQax (ORCPT ); Fri, 31 Mar 2023 12:30:53 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E3692D7D2 for ; Fri, 31 Mar 2023 09:25:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279909; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZJvfcCj7beM2l3aL2q8ZqMjrzC72DifYH9bqCAjpsUU=; b=RpwMYxKfUP/IW3brryEcHB2hG7vV9/sy77Mf22rNHIwBEtsGsVBf5vpTfG2tB5gaS3m9iO yP4Bq4y+w6mOLItWsBa/NMj80l66CPBSOeC4jFj5XFXpvnRlNxlHvgGlXNOl36oq2sEKSo FD2jwGSIv6HJNFM93kq78DbxF0Vhq8s= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-207-st4ET5OBOuuljcmLdN9DFg-1; Fri, 31 Mar 2023 12:11:56 -0400 X-MC-Unique: st4ET5OBOuuljcmLdN9DFg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 39B1529AA2DF; Fri, 31 Mar 2023 16:11:49 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id CE1C52027040; Fri, 31 Mar 2023 16:11:46 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Philipp Reisner , Lars Ellenberg , =?utf-8?q?Christoph_B=C3=B6hmwa?= =?utf-8?q?lder?= , drbd-dev@lists.linbit.com, linux-block@vger.kernel.org Subject: [PATCH v3 53/55] drbd: Use sendmsg(MSG_SPLICE_PAGES) rather than sendmsg() Date: Fri, 31 Mar 2023 17:09:12 +0100 Message-Id: <20230331160914.1608208-54-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761902057384350418?= X-GMAIL-MSGID: =?utf-8?q?1761902057384350418?= Use sendmsg() conditionally with MSG_SPLICE_PAGES in _drbd_send_page() rather than calling sendpage() or _drbd_no_send_page(). Signed-off-by: David Howells cc: Philipp Reisner cc: Lars Ellenberg cc: "Christoph Böhmwalder" cc: Jens Axboe cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: drbd-dev@lists.linbit.com cc: linux-block@vger.kernel.org cc: netdev@vger.kernel.org --- drivers/block/drbd/drbd_main.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index 2c764f7ee4a7..e5f90abd29b6 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -1532,7 +1532,8 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa int offset, size_t size, unsigned msg_flags) { struct socket *socket = peer_device->connection->data.socket; - int len = size; + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = msg_flags, }; int err = -EIO; /* e.g. XFS meta- & log-data is in slab pages, which have a @@ -1541,33 +1542,33 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa * put_page(); and would cause either a VM_BUG directly, or * __page_cache_release a page that would actually still be referenced * by someone, leading to some obscure delayed Oops somewhere else. */ - if (drbd_disable_sendpage || !sendpage_ok(page)) - return _drbd_no_send_page(peer_device, page, offset, size, msg_flags); + if (!drbd_disable_sendpage && sendpage_ok(page)) + msg.msg_flags |= MSG_NOSIGNAL | MSG_SPLICE_PAGES; + + bvec_set_page(&bvec, page, offset, size); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - msg_flags |= MSG_NOSIGNAL; drbd_update_congested(peer_device->connection); do { int sent; - sent = socket->ops->sendpage(socket, page, offset, len, msg_flags); + sent = sock_sendmsg(socket, &msg); if (sent <= 0) { if (sent == -EAGAIN) { if (we_should_drop_the_connection(peer_device->connection, socket)) break; continue; } - drbd_warn(peer_device->device, "%s: size=%d len=%d sent=%d\n", - __func__, (int)size, len, sent); + drbd_warn(peer_device->device, "%s: size=%d len=%zu sent=%d\n", + __func__, (int)size, msg_data_left(&msg), sent); if (sent < 0) err = sent; break; } - len -= sent; - offset += sent; - } while (len > 0 /* THINK && device->cstate >= C_CONNECTED*/); + } while (msg_data_left(&msg) /* THINK && device->cstate >= C_CONNECTED*/); clear_bit(NET_CONGESTED, &peer_device->connection->flags); - if (len == 0) { + if (!msg_data_left(&msg)) { err = 0; peer_device->device->send_cnt += size >> 9; } From patchwork Fri Mar 31 16:09:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77919 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp714705vqo; Fri, 31 Mar 2023 10:12:35 -0700 (PDT) X-Google-Smtp-Source: AK7set//C70ah5HUHGZiUzoWZodKzppaoSHv4mdo4lXund+wOPNV4gdGTP/weJU1Yyr8ZUfcFO/m X-Received: by 2002:a05:6a20:3ba7:b0:da:5772:5997 with SMTP id b39-20020a056a203ba700b000da57725997mr27470700pzh.36.1680282755378; Fri, 31 Mar 2023 10:12:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680282755; cv=none; d=google.com; s=arc-20160816; b=F8/EguLX2cSVm4l6z1LQp1wi8fxN4ER2ae38NiTew4MF24HLM/We5hHFn+tK+r0kGD d9VqL+T/PMn80Bor5yMo86ecya7Q2A06M7dBrYpsxRMjtEoFh4NzKtcvaI6Md3swl0Lf Hnc1pBKiC2YXVB42+Uz9yAQpel4B5qx2kXHqv9z8dxy30gwUMNXWXL/PgEL8BvNIPwPO sr6I6NgVO/UJ7IJdsl+LvxP+O+MR6ntC7MC3+kalZ8j3WWmJ6Y5ifaRBCagUkUCWPwyJ ebdeTiOzzcMVzvSC3KQK8D9eh9HVPuYvTmwZ7/AzfNMRozNsjkaC+ylDiRdihnOnK3pT 2How== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Q/kAjj+4zLWTXkqKqP2B/MWL9umh9MFmha+7vx/dBng=; b=eMgyBv/pKVETJ5C40ghqOVsH3wKNVIo6QwMJoJXYjaGeuqsDtyh1XdPbIb7MK2sOno ihFckwXA7lJFuYjpcDnRW45geuY4XJGY1cBNkOgSwsrH+T4e+nteH/HswDvR4dSehe5w bLy5q4ThSEkWsJadLvRVecn6dGXdPLGq+nSp1T/z3ohrcQEIIZ3IIHLQyZ4qyyA5aQ9A vMqi2btBMb4yMwDl9hh+iDu5IED9h9mYfm03EEOpVN1AmTV4iga3nfeRQkpZ+UMFGN9u RKfBi78ssEf3PGdW/nC2GijLHpOYdbs6fEYC2WMIRH3xr1M8z6f0pY8bFt9Hfw6Mj5F2 VCqA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JjyINlia; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id bv70-20020a632e49000000b00513234112b7si2755249pgb.899.2023.03.31.10.12.21; Fri, 31 Mar 2023 10:12:35 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JjyINlia; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233037AbjCaQUh (ORCPT + 99 others); Fri, 31 Mar 2023 12:20:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45502 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233707AbjCaQUS (ORCPT ); Fri, 31 Mar 2023 12:20:18 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 112A32D7DB for ; Fri, 31 Mar 2023 09:13:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680279120; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Q/kAjj+4zLWTXkqKqP2B/MWL9umh9MFmha+7vx/dBng=; b=JjyINliaaP+WnkIBXKWQlZTtxJKgC2AOz3JLneeMsTPQIo9vCluBKu+G55OrVg75WQRF/6 PCy8UycEEasWbnzKVS2W+eRTZlcTVqLcoSO0e51W2Od9IDlqA+ketbvoQAvEz+peNcdiWk t/UjSP/NhPgxgKMfluB6hutivCQ3lZo= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-54-NrSaPjGbPqK-bssP12GZRQ-1; Fri, 31 Mar 2023 12:11:53 -0400 X-MC-Unique: NrSaPjGbPqK-bssP12GZRQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 48C5E29AA2CF; Fri, 31 Mar 2023 16:11:52 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id E198914171B8; Fri, 31 Mar 2023 16:11:49 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Philipp Reisner , Lars Ellenberg , =?utf-8?q?Christoph_B=C3=B6hmwa?= =?utf-8?q?lder?= , drbd-dev@lists.linbit.com, linux-block@vger.kernel.org Subject: [PATCH v3 54/55] drdb: Send an entire bio in a single sendmsg Date: Fri, 31 Mar 2023 17:09:13 +0100 Message-Id: <20230331160914.1608208-55-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761904170598471226?= X-GMAIL-MSGID: =?utf-8?q?1761904170598471226?= Since _drdb_sendpage() is now using sendmsg to send the pages rather sendpage, pass the entire bio in one go using a bvec iterator instead of doing it piecemeal. Signed-off-by: David Howells cc: Philipp Reisner cc: Lars Ellenberg cc: "Christoph Böhmwalder" cc: Jens Axboe cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: drbd-dev@lists.linbit.com cc: linux-block@vger.kernel.org cc: netdev@vger.kernel.org --- drivers/block/drbd/drbd_main.c | 77 +++++++++++----------------------- 1 file changed, 25 insertions(+), 52 deletions(-) diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index e5f90abd29b6..ab63d6138407 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -1512,28 +1512,15 @@ static void drbd_update_congested(struct drbd_connection *connection) * As a workaround, we disable sendpage on pages * with page_count == 0 or PageSlab. */ -static int _drbd_no_send_page(struct drbd_peer_device *peer_device, struct page *page, - int offset, size_t size, unsigned msg_flags) -{ - struct socket *socket; - void *addr; - int err; - - socket = peer_device->connection->data.socket; - addr = kmap(page) + offset; - err = drbd_send_all(peer_device->connection, socket, addr, size, msg_flags); - kunmap(page); - if (!err) - peer_device->device->send_cnt += size >> 9; - return err; -} - -static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *page, - int offset, size_t size, unsigned msg_flags) +static int _drbd_send_pages(struct drbd_peer_device *peer_device, + struct iov_iter *iter, unsigned msg_flags) { struct socket *socket = peer_device->connection->data.socket; - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = msg_flags, }; + struct msghdr msg = { + .msg_flags = msg_flags | MSG_NOSIGNAL, + .msg_iter = *iter, + }; + size_t size = iov_iter_count(iter); int err = -EIO; /* e.g. XFS meta- & log-data is in slab pages, which have a @@ -1542,11 +1529,8 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa * put_page(); and would cause either a VM_BUG directly, or * __page_cache_release a page that would actually still be referenced * by someone, leading to some obscure delayed Oops somewhere else. */ - if (!drbd_disable_sendpage && sendpage_ok(page)) - msg.msg_flags |= MSG_NOSIGNAL | MSG_SPLICE_PAGES; - - bvec_set_page(&bvec, page, offset, size); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + if (drbd_disable_sendpage) + msg.msg_flags &= ~(MSG_NOSIGNAL | MSG_SPLICE_PAGES); drbd_update_congested(peer_device->connection); do { @@ -1577,39 +1561,22 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa static int _drbd_send_bio(struct drbd_peer_device *peer_device, struct bio *bio) { - struct bio_vec bvec; - struct bvec_iter iter; + struct iov_iter iter; - /* hint all but last page with MSG_MORE */ - bio_for_each_segment(bvec, bio, iter) { - int err; + iov_iter_bvec(&iter, ITER_SOURCE, bio->bi_io_vec, bio->bi_vcnt, + bio->bi_iter.bi_size); - err = _drbd_no_send_page(peer_device, bvec.bv_page, - bvec.bv_offset, bvec.bv_len, - bio_iter_last(bvec, iter) - ? 0 : MSG_MORE); - if (err) - return err; - } - return 0; + return _drbd_send_pages(peer_device, &iter, 0); } static int _drbd_send_zc_bio(struct drbd_peer_device *peer_device, struct bio *bio) { - struct bio_vec bvec; - struct bvec_iter iter; + struct iov_iter iter; - /* hint all but last page with MSG_MORE */ - bio_for_each_segment(bvec, bio, iter) { - int err; + iov_iter_bvec(&iter, ITER_SOURCE, bio->bi_io_vec, bio->bi_vcnt, + bio->bi_iter.bi_size); - err = _drbd_send_page(peer_device, bvec.bv_page, - bvec.bv_offset, bvec.bv_len, - bio_iter_last(bvec, iter) ? 0 : MSG_MORE); - if (err) - return err; - } - return 0; + return _drbd_send_pages(peer_device, &iter, MSG_SPLICE_PAGES); } static int _drbd_send_zc_ee(struct drbd_peer_device *peer_device, @@ -1621,10 +1588,16 @@ static int _drbd_send_zc_ee(struct drbd_peer_device *peer_device, /* hint all but last page with MSG_MORE */ page_chain_for_each(page) { + struct iov_iter iter; + struct bio_vec bvec; unsigned l = min_t(unsigned, len, PAGE_SIZE); - err = _drbd_send_page(peer_device, page, 0, l, - page_chain_next(page) ? MSG_MORE : 0); + bvec_set_page(&bvec, page, 0, l); + iov_iter_bvec(&iter, ITER_SOURCE, &bvec, 1, l); + + err = _drbd_send_pages(peer_device, &iter, + MSG_SPLICE_PAGES | + (page_chain_next(page) ? MSG_MORE : 0)); if (err) return err; len -= l; From patchwork Fri Mar 31 16:09:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 77891 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp696496vqo; Fri, 31 Mar 2023 09:43:32 -0700 (PDT) X-Google-Smtp-Source: AKy350bA9DsIWRneGQcTTrlLPA5moKGRD9ZXRizggFczk3kH5Ebaus1v2Tc12254YbsRzPksu5A3 X-Received: by 2002:a17:906:a24f:b0:931:bc69:8f94 with SMTP id bi15-20020a170906a24f00b00931bc698f94mr28016157ejb.45.1680281011833; Fri, 31 Mar 2023 09:43:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680281011; cv=none; d=google.com; s=arc-20160816; b=jDSN8hx6kKaRHP12nJEN2nCJwYSWe4RYfS4oWtYBYq6DbKnVs6L+0OZJAQtuS0Ozb2 QnHSkEJ/xW20WItYf/Hu49ixcIQQ8ynfxihglUEJe+zSP1TUegB02/pbKDICY6q/8iVU /zJ84SOpU7ryoozeCpXCRkn8mtNTQhvbwY55zNrOcwHX9fU15l7atAIy8PwmSL4byuL4 VVe6eoUOyTCbFcj9sLoVONmKWSAH9cO2p+khp0aLG7yl1wiRXqOj4aG0A28+99V5T8v3 ASB2lzHWZM3hAqW6vKX56iXbjPWOwAqPiWeAJ1dDBg3u5CGu3UZa5ylAZ5u1d5VogJ1i +7VA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ejDFB6GslgSZqdhXNjSzgpO0TYgGHDllqYvzU5uLDLo=; b=fw8qirctMi3SvIhTB/lrEEH4qPIKZlnwRDdUsDnrdjPkUpTa7o/7W4eeOeEJ2wa2ev zvMxFIQ73qB8ZFWVmk82FlXkExe50R/c+Ft6yCb7KdplFk0X07ET6Z/1zfOmPC4eVNwt YFIRjk3Q5V7bNk6WuEOpbrpIGWlQIwM7OfiW0gYrsqtWkGgO1uoJtoamYQ9dGgSoqJxb dvgSmxnDoMvlJQGLHtSQWfdxmEb3fweqd37e0XaedUgkKGL3ffRlNz8TocNptnKWE9Sn sSeoQxoQMHGcu4V8M5OgJWccz/QEyE0Dwbr2B2BwKGWgSwlTfZVkPMSOFbA77H6HZJTq iOkA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JQ6Xw0Gr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id sa7-20020a170906eda700b00930679e9d4fsi2322329ejb.751.2023.03.31.09.43.07; Fri, 31 Mar 2023 09:43:31 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JQ6Xw0Gr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232094AbjCaQmG (ORCPT + 99 others); Fri, 31 Mar 2023 12:42:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59784 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231317AbjCaQlq (ORCPT ); Fri, 31 Mar 2023 12:41:46 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC404EB4C for ; Fri, 31 Mar 2023 09:37:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680280587; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ejDFB6GslgSZqdhXNjSzgpO0TYgGHDllqYvzU5uLDLo=; b=JQ6Xw0GrF/Z/aZnD864cvcWezfg7jFVI+bQhVDUDI/WRHcUPhSgXfZacfWfSoKLyKZE+bN 1ZQjeQ+ZC5AOVAjs3naCh9bo3QyHfP6CNwcgOAnRcFwzq4jUce/rGbRNjZOQ9Pfhu2IR6e 4H2sz5H3hrrcZ2pylMnza2os1t7VX5k= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-616-yCDTb6EqOze4c3qxvyUDQA-1; Fri, 31 Mar 2023 12:11:58 -0400 X-MC-Unique: yCDTb6EqOze4c3qxvyUDQA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A69C73801EC3; Fri, 31 Mar 2023 16:11:56 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id DA58B1415139; Fri, 31 Mar 2023 16:11:52 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marc Kleine-Budde , bpf@vger.kernel.org, dccp@vger.kernel.org, linux-afs@lists.infradead.org, linux-arm-msm@vger.kernel.org, linux-can@vger.kernel.org, linux-crypto@vger.kernel.org, linux-doc@vger.kernel.org, linux-hams@vger.kernel.org, linux-rdma@vger.kernel.org, linux-sctp@vger.kernel.org, linux-wpan@vger.kernel.org, linux-x25@vger.kernel.org, mptcp@lists.linux.dev, rds-devel@oss.oracle.com, tipc-discussion@lists.sourceforge.net, virtualization@lists.linux-foundation.org Subject: [PATCH v3 55/55] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) Date: Fri, 31 Mar 2023 17:09:14 +0100 Message-Id: <20230331160914.1608208-56-dhowells@redhat.com> In-Reply-To: <20230331160914.1608208-1-dhowells@redhat.com> References: <20230331160914.1608208-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761902342487310067?= X-GMAIL-MSGID: =?utf-8?q?1761902342487310067?= Remove ->sendpage() and ->sendpage_locked(). sendmsg() with MSG_SPLICE_PAGES should be used instead. This allows multiple pages and multipage folios to be passed through. Signed-off-by: David Howells Acked-by: Marc Kleine-Budde # for net/can cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: bpf@vger.kernel.org cc: dccp@vger.kernel.org cc: linux-afs@lists.infradead.org cc: linux-arm-msm@vger.kernel.org cc: linux-can@vger.kernel.org cc: linux-crypto@vger.kernel.org cc: linux-doc@vger.kernel.org cc: linux-hams@vger.kernel.org cc: linux-kernel@vger.kernel.org cc: linux-rdma@vger.kernel.org cc: linux-sctp@vger.kernel.org cc: linux-wpan@vger.kernel.org cc: linux-x25@vger.kernel.org cc: mptcp@lists.linux.dev cc: netdev@vger.kernel.org cc: rds-devel@oss.oracle.com cc: tipc-discussion@lists.sourceforge.net cc: virtualization@lists.linux-foundation.org --- Documentation/networking/scaling.rst | 4 +- crypto/af_alg.c | 29 ---- crypto/algif_aead.c | 22 +-- crypto/algif_rng.c | 2 - crypto/algif_skcipher.c | 14 -- .../chelsio/inline_crypto/chtls/chtls.h | 2 - .../chelsio/inline_crypto/chtls/chtls_io.c | 14 -- .../chelsio/inline_crypto/chtls/chtls_main.c | 1 - include/linux/net.h | 8 - include/net/inet_common.h | 2 - include/net/sock.h | 6 - net/appletalk/ddp.c | 1 - net/atm/pvc.c | 1 - net/atm/svc.c | 1 - net/ax25/af_ax25.c | 1 - net/caif/caif_socket.c | 2 - net/can/bcm.c | 1 - net/can/isotp.c | 1 - net/can/j1939/socket.c | 1 - net/can/raw.c | 1 - net/core/sock.c | 35 +---- net/dccp/ipv4.c | 1 - net/dccp/ipv6.c | 1 - net/ieee802154/socket.c | 2 - net/ipv4/af_inet.c | 21 --- net/ipv4/tcp.c | 34 ----- net/ipv4/tcp_bpf.c | 21 +-- net/ipv4/tcp_ipv4.c | 1 - net/ipv4/udp.c | 22 --- net/ipv4/udp_impl.h | 2 - net/ipv4/udplite.c | 1 - net/ipv6/af_inet6.c | 3 - net/ipv6/raw.c | 1 - net/ipv6/tcp_ipv6.c | 1 - net/kcm/kcmsock.c | 20 --- net/key/af_key.c | 1 - net/l2tp/l2tp_ip.c | 1 - net/l2tp/l2tp_ip6.c | 1 - net/llc/af_llc.c | 1 - net/mctp/af_mctp.c | 1 - net/mptcp/protocol.c | 2 - net/netlink/af_netlink.c | 1 - net/netrom/af_netrom.c | 1 - net/packet/af_packet.c | 2 - net/phonet/socket.c | 2 - net/qrtr/af_qrtr.c | 1 - net/rds/af_rds.c | 1 - net/rose/af_rose.c | 1 - net/rxrpc/af_rxrpc.c | 1 - net/sctp/protocol.c | 1 - net/socket.c | 48 ------ net/tipc/socket.c | 3 - net/tls/tls_main.c | 7 - net/unix/af_unix.c | 139 ------------------ net/vmw_vsock/af_vsock.c | 3 - net/x25/af_x25.c | 1 - net/xdp/xsk.c | 1 - 57 files changed, 9 insertions(+), 491 deletions(-) diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst index 3d435caa3ef2..92c9fb46d6a2 100644 --- a/Documentation/networking/scaling.rst +++ b/Documentation/networking/scaling.rst @@ -269,8 +269,8 @@ a single application thread handles flows with many different flow hashes. rps_sock_flow_table is a global flow table that contains the *desired* CPU for flows: the CPU that is currently processing the flow in userspace. Each table value is a CPU index that is updated during calls to recvmsg -and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage() -and tcp_splice_read()). +and sendmsg (specifically, inet_recvmsg(), inet_sendmsg() and +tcp_splice_read()). When the scheduler moves a thread to a new CPU while it has outstanding receive packets on the old CPU, packets may arrive out of order. To diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 686610a4986f..9f84816dcabf 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -483,7 +483,6 @@ static const struct proto_ops alg_proto_ops = { .listen = sock_no_listen, .shutdown = sock_no_shutdown, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .sendmsg = sock_no_sendmsg, .recvmsg = sock_no_recvmsg, @@ -1110,34 +1109,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, } EXPORT_SYMBOL_GPL(af_alg_sendmsg); -/** - * af_alg_sendpage - sendpage system call handler - * @sock: socket of connection to user space to write to - * @page: data to send - * @offset: offset into page to begin sending - * @size: length of data - * @flags: message send/receive flags - * - * This is a generic implementation of sendpage to fill ctx->tsgl_list. - */ -ssize_t af_alg_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = flags | MSG_SPLICE_PAGES, - }; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - return sock_sendmsg(sock, &msg); -} -EXPORT_SYMBOL_GPL(af_alg_sendpage); - /** * af_alg_free_resources - release resources required for crypto request * @areq: Request holding the TX and RX SGL diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c index b16111a3025a..37b08e5f9114 100644 --- a/crypto/algif_aead.c +++ b/crypto/algif_aead.c @@ -9,10 +9,10 @@ * The following concept of the memory management is used: * * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is - * filled by user space with the data submitted via sendpage. Filling up - * the TX SGL does not cause a crypto operation -- the data will only be - * tracked by the kernel. Upon receipt of one recvmsg call, the caller must - * provide a buffer which is tracked with the RX SGL. + * filled by user space with the data submitted via sendmsg (maybe with with + * MSG_SPLICE_PAGES). Filling up the TX SGL does not cause a crypto operation + * -- the data will only be tracked by the kernel. Upon receipt of one recvmsg + * call, the caller must provide a buffer which is tracked with the RX SGL. * * During the processing of the recvmsg operation, the cipher request is * allocated and prepared. As part of the recvmsg operation, the processed @@ -368,7 +368,6 @@ static struct proto_ops algif_aead_ops = { .release = af_alg_release, .sendmsg = aead_sendmsg, - .sendpage = af_alg_sendpage, .recvmsg = aead_recvmsg, .poll = af_alg_poll, }; @@ -420,18 +419,6 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg, return aead_sendmsg(sock, msg, size); } -static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - int err; - - err = aead_check_key(sock); - if (err) - return err; - - return af_alg_sendpage(sock, page, offset, size, flags); -} - static int aead_recvmsg_nokey(struct socket *sock, struct msghdr *msg, size_t ignored, int flags) { @@ -459,7 +446,6 @@ static struct proto_ops algif_aead_ops_nokey = { .release = af_alg_release, .sendmsg = aead_sendmsg_nokey, - .sendpage = aead_sendpage_nokey, .recvmsg = aead_recvmsg_nokey, .poll = af_alg_poll, }; diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c index 407408c43730..10c41adac3b1 100644 --- a/crypto/algif_rng.c +++ b/crypto/algif_rng.c @@ -174,7 +174,6 @@ static struct proto_ops algif_rng_ops = { .bind = sock_no_bind, .accept = sock_no_accept, .sendmsg = sock_no_sendmsg, - .sendpage = sock_no_sendpage, .release = af_alg_release, .recvmsg = rng_recvmsg, @@ -192,7 +191,6 @@ static struct proto_ops __maybe_unused algif_rng_test_ops = { .mmap = sock_no_mmap, .bind = sock_no_bind, .accept = sock_no_accept, - .sendpage = sock_no_sendpage, .release = af_alg_release, .recvmsg = rng_test_recvmsg, diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c index b1f321b9f846..9ada9b741af8 100644 --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -194,7 +194,6 @@ static struct proto_ops algif_skcipher_ops = { .release = af_alg_release, .sendmsg = skcipher_sendmsg, - .sendpage = af_alg_sendpage, .recvmsg = skcipher_recvmsg, .poll = af_alg_poll, }; @@ -246,18 +245,6 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg, return skcipher_sendmsg(sock, msg, size); } -static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - int err; - - err = skcipher_check_key(sock); - if (err) - return err; - - return af_alg_sendpage(sock, page, offset, size, flags); -} - static int skcipher_recvmsg_nokey(struct socket *sock, struct msghdr *msg, size_t ignored, int flags) { @@ -285,7 +272,6 @@ static struct proto_ops algif_skcipher_ops_nokey = { .release = af_alg_release, .sendmsg = skcipher_sendmsg_nokey, - .sendpage = skcipher_sendpage_nokey, .recvmsg = skcipher_recvmsg_nokey, .poll = af_alg_poll, }; diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h index 41714203ace8..94760a681566 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h @@ -568,8 +568,6 @@ void chtls_destroy_sock(struct sock *sk); int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size); int chtls_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len); -int chtls_sendpage(struct sock *sk, struct page *page, - int offset, size_t size, int flags); int send_tx_flowc_wr(struct sock *sk, int compl, u32 snd_nxt, u32 rcv_nxt); void chtls_tcp_push(struct sock *sk, int flags); diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c index 5c397cb57300..fb44333efa3e 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c @@ -1285,20 +1285,6 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) goto done; } -int chtls_sendpage(struct sock *sk, struct page *page, - int offset, size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - bvec_set_page(&bvec, page, offset, size); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return chtls_sendmsg(sk, &msg, size); -} - static void chtls_select_window(struct sock *sk) { struct chtls_sock *csk = rcu_dereference_sk_user_data(sk); diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c index 1e55b12fee51..1b8e6994e8fe 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c @@ -606,7 +606,6 @@ static void __init chtls_init_ulp_ops(void) chtls_cpl_prot.destroy = chtls_destroy_sock; chtls_cpl_prot.shutdown = chtls_shutdown; chtls_cpl_prot.sendmsg = chtls_sendmsg; - chtls_cpl_prot.sendpage = chtls_sendpage; chtls_cpl_prot.recvmsg = chtls_recvmsg; chtls_cpl_prot.setsockopt = chtls_setsockopt; chtls_cpl_prot.getsockopt = chtls_getsockopt; diff --git a/include/linux/net.h b/include/linux/net.h index b73ad8e3c212..e5794968ac9f 100644 --- a/include/linux/net.h +++ b/include/linux/net.h @@ -206,8 +206,6 @@ struct proto_ops { size_t total_len, int flags); int (*mmap) (struct file *file, struct socket *sock, struct vm_area_struct * vma); - ssize_t (*sendpage) (struct socket *sock, struct page *page, - int offset, size_t size, int flags); ssize_t (*splice_read)(struct socket *sock, loff_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags); int (*set_peek_off)(struct sock *sk, int val); @@ -220,8 +218,6 @@ struct proto_ops { sk_read_actor_t recv_actor); /* This is different from read_sock(), it reads an entire skb at a time. */ int (*read_skb)(struct sock *sk, skb_read_actor_t recv_actor); - int (*sendpage_locked)(struct sock *sk, struct page *page, - int offset, size_t size, int flags); int (*sendmsg_locked)(struct sock *sk, struct msghdr *msg, size_t size); int (*set_rcvlowat)(struct sock *sk, int val); @@ -339,10 +335,6 @@ int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen, int flags); int kernel_getsockname(struct socket *sock, struct sockaddr *addr); int kernel_getpeername(struct socket *sock, struct sockaddr *addr); -int kernel_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags); -int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags); int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how); /* Routine returns the IP overhead imposed by a (caller-protected) socket. */ diff --git a/include/net/inet_common.h b/include/net/inet_common.h index cec453c18f1d..054c3388fa51 100644 --- a/include/net/inet_common.h +++ b/include/net/inet_common.h @@ -33,8 +33,6 @@ int inet_accept(struct socket *sock, struct socket *newsock, int flags, bool kern); int inet_send_prepare(struct sock *sk); int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size); -ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags); int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, int flags); int inet_shutdown(struct socket *sock, int how); diff --git a/include/net/sock.h b/include/net/sock.h index 573f2bf7e0de..4618cd21e16b 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1265,8 +1265,6 @@ struct proto { size_t len); int (*recvmsg)(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len); - int (*sendpage)(struct sock *sk, struct page *page, - int offset, size_t size, int flags); int (*bind)(struct sock *sk, struct sockaddr *addr, int addr_len); int (*bind_add)(struct sock *sk, @@ -1906,10 +1904,6 @@ int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len); int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int); int sock_no_mmap(struct file *file, struct socket *sock, struct vm_area_struct *vma); -ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags); -ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page, - int offset, size_t size, int flags); /* * Functions to fill in entries in struct proto_ops when a protocol diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c index a06f4d4a6f47..8978fb6212ff 100644 --- a/net/appletalk/ddp.c +++ b/net/appletalk/ddp.c @@ -1929,7 +1929,6 @@ static const struct proto_ops atalk_dgram_ops = { .sendmsg = atalk_sendmsg, .recvmsg = atalk_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct notifier_block ddp_notifier = { diff --git a/net/atm/pvc.c b/net/atm/pvc.c index 53e7d3f39e26..66d9a9bd5896 100644 --- a/net/atm/pvc.c +++ b/net/atm/pvc.c @@ -126,7 +126,6 @@ static const struct proto_ops pvc_proto_ops = { .sendmsg = vcc_sendmsg, .recvmsg = vcc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; diff --git a/net/atm/svc.c b/net/atm/svc.c index 4a02bcaad279..289240fe234e 100644 --- a/net/atm/svc.c +++ b/net/atm/svc.c @@ -649,7 +649,6 @@ static const struct proto_ops svc_proto_ops = { .sendmsg = vcc_sendmsg, .recvmsg = vcc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index d8da400cb4de..5db805d5f74d 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -2022,7 +2022,6 @@ static const struct proto_ops ax25_proto_ops = { .sendmsg = ax25_sendmsg, .recvmsg = ax25_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c index 4eebcc66c19a..9c82698da4f5 100644 --- a/net/caif/caif_socket.c +++ b/net/caif/caif_socket.c @@ -976,7 +976,6 @@ static const struct proto_ops caif_seqpacket_ops = { .sendmsg = caif_seqpkt_sendmsg, .recvmsg = caif_seqpkt_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const struct proto_ops caif_stream_ops = { @@ -996,7 +995,6 @@ static const struct proto_ops caif_stream_ops = { .sendmsg = caif_stream_sendmsg, .recvmsg = caif_stream_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* This function is called when a socket is finally destroyed. */ diff --git a/net/can/bcm.c b/net/can/bcm.c index 27706f6ace34..65a946a36d92 100644 --- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -1699,7 +1699,6 @@ static const struct proto_ops bcm_ops = { .sendmsg = bcm_sendmsg, .recvmsg = bcm_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto bcm_proto __read_mostly = { diff --git a/net/can/isotp.c b/net/can/isotp.c index 9bc344851704..0c3d11c29a2b 100644 --- a/net/can/isotp.c +++ b/net/can/isotp.c @@ -1633,7 +1633,6 @@ static const struct proto_ops isotp_ops = { .sendmsg = isotp_sendmsg, .recvmsg = isotp_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto isotp_proto __read_mostly = { diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c index 7e90f9e61d9b..2bfe4f79bb67 100644 --- a/net/can/j1939/socket.c +++ b/net/can/j1939/socket.c @@ -1301,7 +1301,6 @@ static const struct proto_ops j1939_ops = { .sendmsg = j1939_sk_sendmsg, .recvmsg = j1939_sk_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto j1939_proto __read_mostly = { diff --git a/net/can/raw.c b/net/can/raw.c index f64469b98260..15c79b079184 100644 --- a/net/can/raw.c +++ b/net/can/raw.c @@ -962,7 +962,6 @@ static const struct proto_ops raw_ops = { .sendmsg = raw_sendmsg, .recvmsg = raw_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto raw_proto __read_mostly = { diff --git a/net/core/sock.c b/net/core/sock.c index 341c565dbc26..c2ae77bb2075 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -3223,36 +3223,6 @@ void __receive_sock(struct file *file) } } -ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) -{ - ssize_t res; - struct msghdr msg = {.msg_flags = flags}; - struct kvec iov; - char *kaddr = kmap(page); - iov.iov_base = kaddr + offset; - iov.iov_len = size; - res = kernel_sendmsg(sock, &msg, &iov, 1, size); - kunmap(page); - return res; -} -EXPORT_SYMBOL(sock_no_sendpage); - -ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page, - int offset, size_t size, int flags) -{ - ssize_t res; - struct msghdr msg = {.msg_flags = flags}; - struct kvec iov; - char *kaddr = kmap(page); - - iov.iov_base = kaddr + offset; - iov.iov_len = size; - res = kernel_sendmsg_locked(sk, &msg, &iov, 1, size); - kunmap(page); - return res; -} -EXPORT_SYMBOL(sock_no_sendpage_locked); - /* * Default Socket Callbacks */ @@ -4008,7 +3978,7 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto) { seq_printf(seq, "%-9s %4u %6d %6ld %-3s %6u %-3s %-10s " - "%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n", + "%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n", proto->name, proto->obj_size, sock_prot_inuse_get(seq_file_net(seq), proto), @@ -4029,7 +3999,6 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto) proto_method_implemented(proto->getsockopt), proto_method_implemented(proto->sendmsg), proto_method_implemented(proto->recvmsg), - proto_method_implemented(proto->sendpage), proto_method_implemented(proto->bind), proto_method_implemented(proto->backlog_rcv), proto_method_implemented(proto->hash), @@ -4050,7 +4019,7 @@ static int proto_seq_show(struct seq_file *seq, void *v) "maxhdr", "slab", "module", - "cl co di ac io in de sh ss gs se re sp bi br ha uh gp em\n"); + "cl co di ac io in de sh ss gs se re bi br ha uh gp em\n"); else proto_seq_printf(seq, list_entry(v, struct proto, node)); return 0; diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index b780827f5e0a..ea808de374ea 100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -1008,7 +1008,6 @@ static const struct proto_ops inet_dccp_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct inet_protosw dccp_v4_protosw = { diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index b9d7c3dd1cb3..23eb8159e3cd 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -1085,7 +1085,6 @@ static const struct proto_ops inet6_dccp_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c index 1fa2fe041ec0..1238f036117f 100644 --- a/net/ieee802154/socket.c +++ b/net/ieee802154/socket.c @@ -426,7 +426,6 @@ static const struct proto_ops ieee802154_raw_ops = { .sendmsg = ieee802154_sock_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* DGRAM Sockets (802.15.4 dataframes) */ @@ -990,7 +989,6 @@ static const struct proto_ops ieee802154_dgram_ops = { .sendmsg = ieee802154_sock_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static void ieee802154_sock_destruct(struct sock *sk) diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 8db6747f892f..869b49933f15 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -827,23 +827,6 @@ int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) } EXPORT_SYMBOL(inet_sendmsg); -ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags) -{ - struct sock *sk = sock->sk; - const struct proto *prot; - - if (unlikely(inet_send_prepare(sk))) - return -EAGAIN; - - /* IPV6_ADDRFORM can change sk->sk_prot under us. */ - prot = READ_ONCE(sk->sk_prot); - if (prot->sendpage) - return prot->sendpage(sk, page, offset, size, flags); - return sock_no_sendpage(sock, page, offset, size, flags); -} -EXPORT_SYMBOL(inet_sendpage); - INDIRECT_CALLABLE_DECLARE(int udp_recvmsg(struct sock *, struct msghdr *, size_t, int, int *)); int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, @@ -1046,12 +1029,10 @@ const struct proto_ops inet_stream_ops = { #ifdef CONFIG_MMU .mmap = tcp_mmap, #endif - .sendpage = inet_sendpage, .splice_read = tcp_splice_read, .read_sock = tcp_read_sock, .read_skb = tcp_read_skb, .sendmsg_locked = tcp_sendmsg_locked, - .sendpage_locked = tcp_sendpage_locked, .peek_len = tcp_peek_len, #ifdef CONFIG_COMPAT .compat_ioctl = inet_compat_ioctl, @@ -1080,7 +1061,6 @@ const struct proto_ops inet_dgram_ops = { .read_skb = udp_read_skb, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, .set_peek_off = sk_set_peek_off, #ifdef CONFIG_COMPAT .compat_ioctl = inet_compat_ioctl, @@ -1111,7 +1091,6 @@ static const struct proto_ops inet_sockraw_ops = { .sendmsg = inet_sendmsg, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet_compat_ioctl, #endif diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index a8f8ccaed10e..bd01a1b23c7b 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -971,40 +971,6 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } -int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - - if (!(sk->sk_route_caps & NETIF_F_SG)) - return sock_no_sendpage_locked(sk, page, offset, size, flags); - - tcp_rate_check_app_limited(sk); /* is sending application-limited? */ - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - return tcp_sendmsg_locked(sk, &msg, size); -} -EXPORT_SYMBOL_GPL(tcp_sendpage_locked); - -int tcp_sendpage(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - int ret; - - lock_sock(sk); - ret = tcp_sendpage_locked(sk, page, offset, size, flags); - release_sock(sk); - - return ret; -} -EXPORT_SYMBOL(tcp_sendpage); - void tcp_free_fastopen_req(struct tcp_sock *tp) { if (tp->fastopen_req) { diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index de37a4372437..ab83cfb9de22 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -482,23 +482,6 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) return copied ? copied : err; } -static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = flags | MSG_SPLICE_PAGES, - }; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - return tcp_bpf_sendmsg(sk, &msg, size); -} - enum { TCP_BPF_IPV4, TCP_BPF_IPV6, @@ -528,7 +511,6 @@ static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS], prot[TCP_BPF_TX] = prot[TCP_BPF_BASE]; prot[TCP_BPF_TX].sendmsg = tcp_bpf_sendmsg; - prot[TCP_BPF_TX].sendpage = tcp_bpf_sendpage; prot[TCP_BPF_RX] = prot[TCP_BPF_BASE]; prot[TCP_BPF_RX].recvmsg = tcp_bpf_recvmsg_parser; @@ -563,8 +545,7 @@ static int tcp_bpf_assert_proto_ops(struct proto *ops) * indeed valid assumptions. */ return ops->recvmsg == tcp_recvmsg && - ops->sendmsg == tcp_sendmsg && - ops->sendpage == tcp_sendpage ? 0 : -ENOTSUPP; + ops->sendmsg == tcp_sendmsg ? 0 : -ENOTSUPP; } int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index ea370afa70ed..5c2e1c1ca329 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -3112,7 +3112,6 @@ struct proto tcp_prot = { .keepalive = tcp_set_keepalive, .recvmsg = tcp_recvmsg, .sendmsg = tcp_sendmsg, - .sendpage = tcp_sendpage, .backlog_rcv = tcp_v4_do_rcv, .release_cb = tcp_release_cb, .hash = inet_hash, diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 097feb92e215..85bd5960f7ef 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1329,27 +1329,6 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) } EXPORT_SYMBOL(udp_sendmsg); -int udp_sendpage(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE - }; - int ret; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - lock_sock(sk); - ret = udp_sendmsg(sk, &msg, size); - release_sock(sk); - return ret; -} - #define UDP_SKB_IS_STATELESS 0x80000000 /* all head states (dst, sk, nf conntrack) except skb extensions are @@ -2926,7 +2905,6 @@ struct proto udp_prot = { .getsockopt = udp_getsockopt, .sendmsg = udp_sendmsg, .recvmsg = udp_recvmsg, - .sendpage = udp_sendpage, .release_cb = ip4_datagram_release_cb, .hash = udp_lib_hash, .unhash = udp_lib_unhash, diff --git a/net/ipv4/udp_impl.h b/net/ipv4/udp_impl.h index 4ba7a88a1b1d..e1ff3a375996 100644 --- a/net/ipv4/udp_impl.h +++ b/net/ipv4/udp_impl.h @@ -19,8 +19,6 @@ int udp_getsockopt(struct sock *sk, int level, int optname, int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len); -int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, - int flags); void udp_destroy_sock(struct sock *sk); #ifdef CONFIG_PROC_FS diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c index e0c9cc39b81e..69870f0afc6c 100644 --- a/net/ipv4/udplite.c +++ b/net/ipv4/udplite.c @@ -54,7 +54,6 @@ struct proto udplite_prot = { .getsockopt = udp_getsockopt, .sendmsg = udp_sendmsg, .recvmsg = udp_recvmsg, - .sendpage = udp_sendpage, .hash = udp_lib_hash, .unhash = udp_lib_unhash, .rehash = udp_v4_rehash, diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 38689bedfce7..769c76d59053 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -695,9 +695,7 @@ const struct proto_ops inet6_stream_ops = { #ifdef CONFIG_MMU .mmap = tcp_mmap, #endif - .sendpage = inet_sendpage, .sendmsg_locked = tcp_sendmsg_locked, - .sendpage_locked = tcp_sendpage_locked, .splice_read = tcp_splice_read, .read_sock = tcp_read_sock, .read_skb = tcp_read_skb, @@ -728,7 +726,6 @@ const struct proto_ops inet6_dgram_ops = { .recvmsg = inet6_recvmsg, /* retpoline's sake */ .read_skb = udp_read_skb, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .set_peek_off = sk_set_peek_off, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index bac9ba747bde..c6c062678c0e 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -1298,7 +1298,6 @@ const struct proto_ops inet6_sockraw_ops = { .sendmsg = inet_sendmsg, /* ok */ .recvmsg = sock_common_recvmsg, /* ok */ .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 1bf93b61aa06..03ba1e389901 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -2151,7 +2151,6 @@ struct proto tcpv6_prot = { .keepalive = tcp_set_keepalive, .recvmsg = tcp_recvmsg, .sendmsg = tcp_sendmsg, - .sendpage = tcp_sendpage, .backlog_rcv = tcp_v6_do_rcv, .release_cb = tcp_release_cb, .hash = inet6_hash, diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index 9c9d379aafb1..94442e359fe2 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -1005,24 +1005,6 @@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) return err; } -static ssize_t kcm_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int flags) - -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - if (flags & MSG_OOB) - return -EOPNOTSUPP; - - bvec_set_page(&bvec, page, offset, size); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return kcm_sendmsg(sock, &msg, size); -} - static int kcm_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int flags) { @@ -1810,7 +1792,6 @@ static const struct proto_ops kcm_dgram_ops = { .sendmsg = kcm_sendmsg, .recvmsg = kcm_recvmsg, .mmap = sock_no_mmap, - .sendpage = kcm_sendpage, }; static const struct proto_ops kcm_seqpacket_ops = { @@ -1831,7 +1812,6 @@ static const struct proto_ops kcm_seqpacket_ops = { .sendmsg = kcm_sendmsg, .recvmsg = kcm_recvmsg, .mmap = sock_no_mmap, - .sendpage = kcm_sendpage, .splice_read = kcm_splice_read, }; diff --git a/net/key/af_key.c b/net/key/af_key.c index a815f5ab4c49..bf59d42dc697 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -3757,7 +3757,6 @@ static const struct proto_ops pfkey_ops = { .listen = sock_no_listen, .shutdown = sock_no_shutdown, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, /* Now the operations that really occur. */ .release = pfkey_release, diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c index 4db5a554bdbd..d0dcbe3a4cd7 100644 --- a/net/l2tp/l2tp_ip.c +++ b/net/l2tp/l2tp_ip.c @@ -625,7 +625,6 @@ static const struct proto_ops l2tp_ip_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct inet_protosw l2tp_ip_protosw = { diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c index 2478aa60145f..49296ce14a90 100644 --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -751,7 +751,6 @@ static const struct proto_ops l2tp_ip6_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c index da7fe94bea2e..addd94da2a81 100644 --- a/net/llc/af_llc.c +++ b/net/llc/af_llc.c @@ -1230,7 +1230,6 @@ static const struct proto_ops llc_ui_ops = { .sendmsg = llc_ui_sendmsg, .recvmsg = llc_ui_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const char llc_proc_err_msg[] __initconst = diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c index 3150f3f0c872..c6fe2e6b85dd 100644 --- a/net/mctp/af_mctp.c +++ b/net/mctp/af_mctp.c @@ -485,7 +485,6 @@ static const struct proto_ops mctp_dgram_ops = { .sendmsg = mctp_sendmsg, .recvmsg = mctp_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = mctp_compat_ioctl, #endif diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 3ad9c46202fc..ade89b8d0082 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3816,7 +3816,6 @@ static const struct proto_ops mptcp_stream_ops = { .sendmsg = inet_sendmsg, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, }; static struct inet_protosw mptcp_protosw = { @@ -3911,7 +3910,6 @@ static const struct proto_ops mptcp_v6_stream_ops = { .sendmsg = inet6_sendmsg, .recvmsg = inet6_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index c64277659753..f70073a3bb49 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -2841,7 +2841,6 @@ static const struct proto_ops netlink_ops = { .sendmsg = netlink_sendmsg, .recvmsg = netlink_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const struct net_proto_family netlink_family_ops = { diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c index 5a4cb796150f..eb8ccbd58df7 100644 --- a/net/netrom/af_netrom.c +++ b/net/netrom/af_netrom.c @@ -1364,7 +1364,6 @@ static const struct proto_ops nr_proto_ops = { .sendmsg = nr_sendmsg, .recvmsg = nr_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct notifier_block nr_dev_notifier = { diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index d4e76e2ae153..385bd4982b80 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -4604,7 +4604,6 @@ static const struct proto_ops packet_ops_spkt = { .sendmsg = packet_sendmsg_spkt, .recvmsg = packet_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const struct proto_ops packet_ops = { @@ -4626,7 +4625,6 @@ static const struct proto_ops packet_ops = { .sendmsg = packet_sendmsg, .recvmsg = packet_recvmsg, .mmap = packet_mmap, - .sendpage = sock_no_sendpage, }; static const struct net_proto_family packet_family_ops = { diff --git a/net/phonet/socket.c b/net/phonet/socket.c index 71e2caf6ab85..a246f7d0a817 100644 --- a/net/phonet/socket.c +++ b/net/phonet/socket.c @@ -441,7 +441,6 @@ const struct proto_ops phonet_dgram_ops = { .sendmsg = pn_socket_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; const struct proto_ops phonet_stream_ops = { @@ -462,7 +461,6 @@ const struct proto_ops phonet_stream_ops = { .sendmsg = pn_socket_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; EXPORT_SYMBOL(phonet_stream_ops); diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c index 5c2fb992803b..5bb7d680bd5f 100644 --- a/net/qrtr/af_qrtr.c +++ b/net/qrtr/af_qrtr.c @@ -1240,7 +1240,6 @@ static const struct proto_ops qrtr_proto_ops = { .shutdown = sock_no_shutdown, .release = qrtr_release, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto qrtr_proto = { diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c index 3ff6995244e5..01c4cdfef45d 100644 --- a/net/rds/af_rds.c +++ b/net/rds/af_rds.c @@ -653,7 +653,6 @@ static const struct proto_ops rds_proto_ops = { .sendmsg = rds_sendmsg, .recvmsg = rds_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static void rds_sock_destruct(struct sock *sk) diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c index ca2b17f32670..49dafe9ac72f 100644 --- a/net/rose/af_rose.c +++ b/net/rose/af_rose.c @@ -1496,7 +1496,6 @@ static const struct proto_ops rose_proto_ops = { .sendmsg = rose_sendmsg, .recvmsg = rose_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct notifier_block rose_dev_notifier = { diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c index 102f5cbff91a..182495804f8f 100644 --- a/net/rxrpc/af_rxrpc.c +++ b/net/rxrpc/af_rxrpc.c @@ -938,7 +938,6 @@ static const struct proto_ops rxrpc_rpc_ops = { .sendmsg = rxrpc_sendmsg, .recvmsg = rxrpc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto rxrpc_proto = { diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index c365df24ad33..acb2d2a69268 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -1135,7 +1135,6 @@ static const struct proto_ops inet_seqpacket_ops = { .sendmsg = inet_sendmsg, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* Registration with AF_INET family. */ diff --git a/net/socket.c b/net/socket.c index 3e9bd8261357..8c7437c983da 100644 --- a/net/socket.c +++ b/net/socket.c @@ -3543,54 +3543,6 @@ int kernel_getpeername(struct socket *sock, struct sockaddr *addr) } EXPORT_SYMBOL(kernel_getpeername); -/** - * kernel_sendpage - send a &page through a socket (kernel space) - * @sock: socket - * @page: page - * @offset: page offset - * @size: total size in bytes - * @flags: flags (MSG_DONTWAIT, ...) - * - * Returns the total amount sent in bytes or an error. - */ - -int kernel_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags) -{ - if (sock->ops->sendpage) { - /* Warn in case the improper page to zero-copy send */ - WARN_ONCE(!sendpage_ok(page), "improper page for zero-copy send"); - return sock->ops->sendpage(sock, page, offset, size, flags); - } - return sock_no_sendpage(sock, page, offset, size, flags); -} -EXPORT_SYMBOL(kernel_sendpage); - -/** - * kernel_sendpage_locked - send a &page through the locked sock (kernel space) - * @sk: sock - * @page: page - * @offset: page offset - * @size: total size in bytes - * @flags: flags (MSG_DONTWAIT, ...) - * - * Returns the total amount sent in bytes or an error. - * Caller must hold @sk. - */ - -int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct socket *sock = sk->sk_socket; - - if (sock->ops->sendpage_locked) - return sock->ops->sendpage_locked(sk, page, offset, size, - flags); - - return sock_no_sendpage_locked(sk, page, offset, size, flags); -} -EXPORT_SYMBOL(kernel_sendpage_locked); - /** * kernel_sock_shutdown - shut down part of a full-duplex connection (kernel space) * @sock: socket diff --git a/net/tipc/socket.c b/net/tipc/socket.c index 37edfe10f8c6..d2072fbf3272 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -3375,7 +3375,6 @@ static const struct proto_ops msg_ops = { .sendmsg = tipc_sendmsg, .recvmsg = tipc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage }; static const struct proto_ops packet_ops = { @@ -3396,7 +3395,6 @@ static const struct proto_ops packet_ops = { .sendmsg = tipc_send_packet, .recvmsg = tipc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage }; static const struct proto_ops stream_ops = { @@ -3417,7 +3415,6 @@ static const struct proto_ops stream_ops = { .sendmsg = tipc_sendstream, .recvmsg = tipc_recvstream, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage }; static const struct net_proto_family tipc_family_ops = { diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index 35b2f7ee2fa3..ff02697f484b 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -936,7 +936,6 @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG] ops[TLS_BASE][TLS_BASE] = *base; ops[TLS_SW ][TLS_BASE] = ops[TLS_BASE][TLS_BASE]; - ops[TLS_SW ][TLS_BASE].sendpage_locked = tls_sw_sendpage_locked; ops[TLS_BASE][TLS_SW ] = ops[TLS_BASE][TLS_BASE]; ops[TLS_BASE][TLS_SW ].splice_read = tls_sw_splice_read; @@ -946,17 +945,14 @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG] #ifdef CONFIG_TLS_DEVICE ops[TLS_HW ][TLS_BASE] = ops[TLS_BASE][TLS_BASE]; - ops[TLS_HW ][TLS_BASE].sendpage_locked = NULL; ops[TLS_HW ][TLS_SW ] = ops[TLS_BASE][TLS_SW ]; - ops[TLS_HW ][TLS_SW ].sendpage_locked = NULL; ops[TLS_BASE][TLS_HW ] = ops[TLS_BASE][TLS_SW ]; ops[TLS_SW ][TLS_HW ] = ops[TLS_SW ][TLS_SW ]; ops[TLS_HW ][TLS_HW ] = ops[TLS_HW ][TLS_SW ]; - ops[TLS_HW ][TLS_HW ].sendpage_locked = NULL; #endif #ifdef CONFIG_TLS_TOE ops[TLS_HW_RECORD][TLS_HW_RECORD] = *base; @@ -1004,7 +1000,6 @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG], prot[TLS_SW][TLS_BASE] = prot[TLS_BASE][TLS_BASE]; prot[TLS_SW][TLS_BASE].sendmsg = tls_sw_sendmsg; - prot[TLS_SW][TLS_BASE].sendpage = tls_sw_sendpage; prot[TLS_BASE][TLS_SW] = prot[TLS_BASE][TLS_BASE]; prot[TLS_BASE][TLS_SW].recvmsg = tls_sw_recvmsg; @@ -1019,11 +1014,9 @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG], #ifdef CONFIG_TLS_DEVICE prot[TLS_HW][TLS_BASE] = prot[TLS_BASE][TLS_BASE]; prot[TLS_HW][TLS_BASE].sendmsg = tls_device_sendmsg; - prot[TLS_HW][TLS_BASE].sendpage = tls_device_sendpage; prot[TLS_HW][TLS_SW] = prot[TLS_BASE][TLS_SW]; prot[TLS_HW][TLS_SW].sendmsg = tls_device_sendmsg; - prot[TLS_HW][TLS_SW].sendpage = tls_device_sendpage; prot[TLS_BASE][TLS_HW] = prot[TLS_BASE][TLS_SW]; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 88b91005567e..751715ade2ae 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -758,8 +758,6 @@ static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned lon static int unix_shutdown(struct socket *, int); static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t); static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int); -static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset, - size_t size, int flags); static ssize_t unix_stream_splice_read(struct socket *, loff_t *ppos, struct pipe_inode_info *, size_t size, unsigned int flags); @@ -852,7 +850,6 @@ static const struct proto_ops unix_stream_ops = { .recvmsg = unix_stream_recvmsg, .read_skb = unix_stream_read_skb, .mmap = sock_no_mmap, - .sendpage = unix_stream_sendpage, .splice_read = unix_stream_splice_read, .set_peek_off = unix_set_peek_off, .show_fdinfo = unix_show_fdinfo, @@ -878,7 +875,6 @@ static const struct proto_ops unix_dgram_ops = { .read_skb = unix_read_skb, .recvmsg = unix_dgram_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .set_peek_off = unix_set_peek_off, .show_fdinfo = unix_show_fdinfo, }; @@ -902,7 +898,6 @@ static const struct proto_ops unix_seqpacket_ops = { .sendmsg = unix_seqpacket_sendmsg, .recvmsg = unix_seqpacket_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .set_peek_off = unix_set_peek_off, .show_fdinfo = unix_show_fdinfo, }; @@ -1839,24 +1834,6 @@ static void maybe_add_creds(struct sk_buff *skb, const struct socket *sock, } } -static int maybe_init_creds(struct scm_cookie *scm, - struct socket *socket, - const struct sock *other) -{ - int err; - struct msghdr msg = { .msg_controllen = 0 }; - - err = scm_send(socket, &msg, scm, false); - if (err) - return err; - - if (unix_passcred_enabled(socket, other)) { - scm->pid = get_pid(task_tgid(current)); - current_uid_gid(&scm->creds.uid, &scm->creds.gid); - } - return err; -} - static bool unix_skb_scm_eq(struct sk_buff *skb, struct scm_cookie *scm) { @@ -2349,122 +2326,6 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, return sent ? : err; } -static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page, - int offset, size_t size, int flags) -{ - int err; - bool send_sigpipe = false; - bool init_scm = true; - struct scm_cookie scm; - struct sock *other, *sk = socket->sk; - struct sk_buff *skb, *newskb = NULL, *tail = NULL; - - if (flags & MSG_OOB) - return -EOPNOTSUPP; - - other = unix_peer(sk); - if (!other || sk->sk_state != TCP_ESTABLISHED) - return -ENOTCONN; - - if (false) { -alloc_skb: - unix_state_unlock(other); - mutex_unlock(&unix_sk(other)->iolock); - newskb = sock_alloc_send_pskb(sk, 0, 0, flags & MSG_DONTWAIT, - &err, 0); - if (!newskb) - goto err; - } - - /* we must acquire iolock as we modify already present - * skbs in the sk_receive_queue and mess with skb->len - */ - err = mutex_lock_interruptible(&unix_sk(other)->iolock); - if (err) { - err = flags & MSG_DONTWAIT ? -EAGAIN : -ERESTARTSYS; - goto err; - } - - if (sk->sk_shutdown & SEND_SHUTDOWN) { - err = -EPIPE; - send_sigpipe = true; - goto err_unlock; - } - - unix_state_lock(other); - - if (sock_flag(other, SOCK_DEAD) || - other->sk_shutdown & RCV_SHUTDOWN) { - err = -EPIPE; - send_sigpipe = true; - goto err_state_unlock; - } - - if (init_scm) { - err = maybe_init_creds(&scm, socket, other); - if (err) - goto err_state_unlock; - init_scm = false; - } - - skb = skb_peek_tail(&other->sk_receive_queue); - if (tail && tail == skb) { - skb = newskb; - } else if (!skb || !unix_skb_scm_eq(skb, &scm)) { - if (newskb) { - skb = newskb; - } else { - tail = skb; - goto alloc_skb; - } - } else if (newskb) { - /* this is fast path, we don't necessarily need to - * call to kfree_skb even though with newskb == NULL - * this - does no harm - */ - consume_skb(newskb); - newskb = NULL; - } - - if (skb_append_pagefrags(skb, page, offset, size)) { - tail = skb; - goto alloc_skb; - } - - skb->len += size; - skb->data_len += size; - skb->truesize += size; - refcount_add(size, &sk->sk_wmem_alloc); - - if (newskb) { - err = unix_scm_to_skb(&scm, skb, false); - if (err) - goto err_state_unlock; - spin_lock(&other->sk_receive_queue.lock); - __skb_queue_tail(&other->sk_receive_queue, newskb); - spin_unlock(&other->sk_receive_queue.lock); - } - - unix_state_unlock(other); - mutex_unlock(&unix_sk(other)->iolock); - - other->sk_data_ready(other); - scm_destroy(&scm); - return size; - -err_state_unlock: - unix_state_unlock(other); -err_unlock: - mutex_unlock(&unix_sk(other)->iolock); -err: - kfree_skb(newskb); - if (send_sigpipe && !(flags & MSG_NOSIGNAL)) - send_sig(SIGPIPE, current, 0); - if (!init_scm) - scm_destroy(&scm); - return err; -} - static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 19aea7cba26e..d0e476755cdc 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1271,7 +1271,6 @@ static const struct proto_ops vsock_dgram_ops = { .sendmsg = vsock_dgram_sendmsg, .recvmsg = vsock_dgram_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static int vsock_transport_cancel_pkt(struct vsock_sock *vsk) @@ -2186,7 +2185,6 @@ static const struct proto_ops vsock_stream_ops = { .sendmsg = vsock_connectible_sendmsg, .recvmsg = vsock_connectible_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .set_rcvlowat = vsock_set_rcvlowat, }; @@ -2208,7 +2206,6 @@ static const struct proto_ops vsock_seqpacket_ops = { .sendmsg = vsock_connectible_sendmsg, .recvmsg = vsock_connectible_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static int vsock_create(struct net *net, struct socket *sock, diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c index 5c7ad301d742..0fb5143bec7a 100644 --- a/net/x25/af_x25.c +++ b/net/x25/af_x25.c @@ -1757,7 +1757,6 @@ static const struct proto_ops x25_proto_ops = { .sendmsg = x25_sendmsg, .recvmsg = x25_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct packet_type x25_packet_type __read_mostly = { diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 2ac58b282b5e..eff1f0aaa4b5 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -1386,7 +1386,6 @@ static const struct proto_ops xsk_proto_ops = { .sendmsg = xsk_sendmsg, .recvmsg = xsk_recvmsg, .mmap = xsk_mmap, - .sendpage = sock_no_sendpage, }; static void xsk_destruct(struct sock *sk)