From patchwork Thu Mar 16 15:25:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70840
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH 01/28] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag
Date: Thu, 16 Mar 2023 15:25:51 +0000
Message-Id: <20230316152618.711970-2-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Declare MSG_SPLICE_PAGES, an internal sendmsg() flag, that hints to a
network protocol that it should splice pages from the source iterator
rather than copying the data
if it can. This is intended as a replacement for the ->sendpage() op,
allowing a way to splice in several multipage folios in one go.

Signed-off-by: David Howells
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: netdev@vger.kernel.org
---
 include/linux/socket.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 13c3a237b9c9..a67d02da3c54 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -327,6 +327,7 @@ struct ucred {
 					  */
 
 #define MSG_ZEROCOPY	0x4000000	/* Use user data in kernel path */
+#define MSG_SPLICE_PAGES 0x8000000	/* Splice the pages from the iterator in sendmsg() */
 #define MSG_FASTOPEN	0x20000000	/* Send data in TCP SYN */
 #define MSG_CMSG_CLOEXEC 0x40000000	/* Set close_on_exec for file
 					   descriptor received through

From patchwork Thu Mar 16 15:25:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70867
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Bernard Metzler, Tom Talpey, linux-rdma@vger.kernel.org
Subject: [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES
Date: Thu, 16 Mar 2023 15:25:52 +0000
Message-Id: <20230316152618.711970-3-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

If a network protocol sendmsg() sees MSG_SPLICE_PAGES, it expects that the
iterator is of ITER_BVEC type and that all the pages can have refs taken on
them with get_page() and discarded with put_page(). Bits of network
filesystem protocol data, however, are typically contained in slab memory
for which the cleanup method is kfree(), not put_page(), so this doesn't
work.

Provide a simple allocator, zcopy_alloc(), that allocates a page at a time
per-cpu and sequentially breaks off pieces and hands them out with a ref as
it's asked for them. The caller disposes of the memory it was given by
calling put_page(). When a page is all parcelled out, it is abandoned by
the allocator and another page is obtained.
The page will get cleaned up when the last skbuff fragment is destroyed.

A helper function, zcopy_memdup(), is provided to call zcopy_alloc() and
copy the data it is given into it.

[!] I'm not sure this is the best way to do things. A better way might be
to make the network protocol look at the page and copy it if it's a slab
object rather than taking a ref on it.

Signed-off-by: David Howells
cc: Bernard Metzler
cc: Tom Talpey
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: linux-rdma@vger.kernel.org
cc: netdev@vger.kernel.org
---
 include/linux/zcopy_alloc.h |  16 +++++
 mm/Makefile                 |   2 +-
 mm/zcopy_alloc.c            | 129 ++++++++++++++++++++++++++++++++++++
 3 files changed, 146 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/zcopy_alloc.h
 create mode 100644 mm/zcopy_alloc.c

diff --git a/include/linux/zcopy_alloc.h b/include/linux/zcopy_alloc.h
new file mode 100644
index 000000000000..8eb205678073
--- /dev/null
+++ b/include/linux/zcopy_alloc.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Defs for zerocopy filler fragment allocator.
+ *
+ * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#ifndef _LINUX_ZCOPY_ALLOC_H
+#define _LINUX_ZCOPY_ALLOC_H
+
+struct bio_vec;
+
+int zcopy_alloc(size_t size, struct bio_vec *bvec, gfp_t gfp);
+int zcopy_memdup(size_t size, const void *p, struct bio_vec *bvec, gfp_t gfp);
+
+#endif /* _LINUX_ZCOPY_ALLOC_H */
diff --git a/mm/Makefile b/mm/Makefile
index 8e105e5b3e29..3848f43751ee 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -52,7 +52,7 @@ obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
 			   readahead.o swap.o truncate.o vmscan.o shmem.o \
 			   util.o mmzone.o vmstat.o backing-dev.o \
 			   mm_init.o percpu.o slab_common.o \
-			   compaction.o \
+			   compaction.o zcopy_alloc.o \
 			   interval_tree.o list_lru.o workingset.o \
 			   debug.o gup.o mmap_lock.o $(mmu-y)
 
diff --git a/mm/zcopy_alloc.c b/mm/zcopy_alloc.c
new file mode 100644
index 000000000000..7b219392e829
--- /dev/null
+++ b/mm/zcopy_alloc.c
@@ -0,0 +1,129 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Allocator for zerocopy filler fragments
+ *
+ * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * Provide a facility whereby pieces of bufferage can be allocated for
+ * insertion into bio_vec arrays intended for zerocopying, allowing protocol
+ * stuff to be mixed in with data.
+ *
+ * Unlike objects allocated from the slab, the lifetime of these pieces of
+ * buffer is governed purely by the refcount of the page in which they reside.
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+struct zcopy_alloc_info {
+	struct folio	*folio;		/* Page currently being allocated from */
+	struct folio	*spare;		/* Spare page */
+	unsigned int	used;		/* Amount of folio used */
+	spinlock_t	lock;		/* Allocation lock (needs bh-disable) */
+};
+
+static struct zcopy_alloc_info __percpu *zcopy_alloc_info;
+
+static int __init zcopy_alloc_init(void)
+{
+	zcopy_alloc_info = alloc_percpu(struct zcopy_alloc_info);
+	if (!zcopy_alloc_info)
+		panic("Unable to set up zcopy_alloc allocator\n");
+	return 0;
+}
+subsys_initcall(zcopy_alloc_init);
+
+/**
+ * zcopy_alloc - Allocate some memory for use in zerocopy
+ * @size: The amount of memory (maximum 1/2 page).
+ * @bvec: Where to store the details of the memory
+ * @gfp: Allocation flags under which to make an allocation
+ *
+ * Allocate some memory for use with zerocopy where protocol bits have to be
+ * mixed in with spliced/zerocopied data. Unlike memory allocated from the
+ * slab, this memory's lifetime is purely dependent on the folio's refcount.
+ *
+ * The way it works is that a folio is allocated and pieces are broken off
+ * sequentially and given to the allocators with a ref until it no longer has
+ * enough spare space, at which point the allocator's ref is dropped and a new
+ * folio is allocated. The folio remains in existence until the last ref held
+ * by, say, a sk_buff is discarded and then the page is returned to the
+ * allocator.
+ *
+ * Returns 0 on success and -ENOMEM on allocation failure. If successful, the
+ * details of the allocated memory are placed in *%bvec.
+ *
+ * The allocated memory should be disposed of with folio_put().
+ */
+int zcopy_alloc(size_t size, struct bio_vec *bvec, gfp_t gfp)
+{
+	struct zcopy_alloc_info *info;
+	struct folio *folio, *spare = NULL;
+	size_t full_size = round_up(size, 8);
+
+	if (WARN_ON_ONCE(full_size > PAGE_SIZE / 2))
+		return -ENOMEM; /* Allocate pages */
+
+try_again:
+	info = get_cpu_ptr(zcopy_alloc_info);
+
+	folio = info->folio;
+	if (folio && folio_size(folio) - info->used < full_size) {
+		folio_put(folio);
+		folio = info->folio = NULL;
+	}
+	if (spare && !info->spare) {
+		info->spare = spare;
+		spare = NULL;
+	}
+	if (!folio && info->spare) {
+		folio = info->folio = info->spare;
+		info->spare = NULL;
+		info->used = 0;
+	}
+	if (folio) {
+		bvec_set_folio(bvec, folio, size, info->used);
+		info->used += full_size;
+		if (info->used < folio_size(folio))
+			folio_get(folio);
+		else
+			info->folio = NULL;
+	}
+
+	put_cpu_ptr(zcopy_alloc_info);
+	if (folio) {
+		if (spare)
+			folio_put(spare);
+		return 0;
+	}
+
+	spare = folio_alloc(gfp, 0);
+	if (!spare)
+		return -ENOMEM;
+	goto try_again;
+}
+EXPORT_SYMBOL(zcopy_alloc);
+
+/**
+ * zcopy_memdup - Allocate some memory for use in zerocopy and fill it
+ * @size: The amount of memory to copy (maximum 1/2 page).
+ * @p: The source data to copy
+ * @bvec: Where to store the details of the memory
+ * @gfp: Allocation flags under which to make an allocation
+ */
+int zcopy_memdup(size_t size, const void *p, struct bio_vec *bvec, gfp_t gfp)
+{
+	void *q;
+
+	if (zcopy_alloc(size, bvec, gfp) < 0)
+		return -ENOMEM;
+
+	q = kmap_local_folio(page_folio(bvec->bv_page), bvec->bv_offset);
+	memcpy(q, p, size);
+	kunmap_local(q);
+	return 0;
+}
+EXPORT_SYMBOL(zcopy_memdup);

From patchwork Thu Mar 16 15:25:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70844
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES
Date: Thu, 16 Mar 2023 15:25:53 +0000
Message-Id: <20230316152618.711970-4-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Make TCP's sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
spliced from the source iterator if possible (the iterator must be
ITER_BVEC and the pages must be spliceable).

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells
cc: Eric Dumazet
cc: "David S. Miller"
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: netdev@vger.kernel.org
---
 net/ipv4/tcp.c | 59 +++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 288693981b00..77c0c69208a5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1220,7 +1220,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 	int flags, err, copied = 0;
 	int mss_now = 0, size_goal, copied_syn = 0;
 	int process_backlog = 0;
-	bool zc = false;
+	int zc = 0;
 	long timeo;
 
 	flags = msg->msg_flags;
@@ -1231,17 +1231,24 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 		if (msg->msg_ubuf) {
 			uarg = msg->msg_ubuf;
 			net_zcopy_get(uarg);
-			zc = sk->sk_route_caps & NETIF_F_SG;
+			if (sk->sk_route_caps & NETIF_F_SG)
+				zc = 1;
 		} else if (sock_flag(sk, SOCK_ZEROCOPY)) {
 			uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
 			if (!uarg) {
 				err = -ENOBUFS;
 				goto out_err;
 			}
-			zc = sk->sk_route_caps & NETIF_F_SG;
-			if (!zc)
+			if (sk->sk_route_caps & NETIF_F_SG)
+				zc = 1;
+			else
 				uarg_to_msgzc(uarg)->zerocopy = 0;
 		}
+	} else if (unlikely(flags & MSG_SPLICE_PAGES) && size) {
+		if (!iov_iter_is_bvec(&msg->msg_iter))
+			return -EINVAL;
+		if (sk->sk_route_caps & NETIF_F_SG)
+			zc = 2;
 	}
 
 	if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) &&
@@ -1345,7 +1352,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 		if (copy > msg_data_left(msg))
 			copy = msg_data_left(msg);
 
-		if (!zc) {
+		if (zc == 0) {
 			bool merge = true;
 			int i = skb_shinfo(skb)->nr_frags;
 			struct page_frag *pfrag = sk_page_frag(sk);
@@ -1390,7 +1397,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 				page_ref_inc(pfrag->page);
 			}
 			pfrag->offset += copy;
-		} else {
+		} else if (zc == 1) {
 			/* First append to a fragless skb builds initial
 			 * pure zerocopy skb
 			 */
@@ -1411,6 +1418,46 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr
*msg, size_t size)
 			if (err < 0)
 				goto do_error;
 			copy = err;
+		} else if (zc == 2) {
+			/* Splice in data. */
+			const struct bio_vec *bv = msg->msg_iter.bvec;
+			size_t seg = iov_iter_single_seg_count(&msg->msg_iter);
+			size_t off = bv->bv_offset + msg->msg_iter.iov_offset;
+			bool can_coalesce;
+			int i = skb_shinfo(skb)->nr_frags;
+
+			if (copy > seg)
+				copy = seg;
+
+			can_coalesce = skb_can_coalesce(skb, i, bv->bv_page, off);
+			if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
+				tcp_mark_push(tp, skb);
+				goto new_segment;
+			}
+			if (tcp_downgrade_zcopy_pure(sk, skb))
+				goto wait_for_space;
+
+			copy = tcp_wmem_schedule(sk, copy);
+			if (!copy)
+				goto wait_for_space;
+
+			if (can_coalesce) {
+				skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
+			} else {
+				get_page(bv->bv_page);
+				skb_fill_page_desc_noacc(skb, i, bv->bv_page, off, copy);
+			}
+			iov_iter_advance(&msg->msg_iter, copy);
+
+			if (!(flags & MSG_NO_SHARED_FRAGS))
+				skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
+
+			skb->len += copy;
+			skb->data_len += copy;
+			skb->truesize += copy;
+			sk_wmem_queued_add(sk, copy);
+			sk_mem_charge(sk, copy);
+
 		}
 
 		if (!copied)

From patchwork Thu Mar 16 15:25:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70855
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH 04/28] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES
Date: Thu, 16 Mar 2023 15:25:54 +0000
Message-Id: <20230316152618.711970-5-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Convert do_tcp_sendpages() to use sendmsg() with MSG_SPLICE_PAGES rather
than directly splicing in the pages itself. do_tcp_sendpages() can then be
inlined in subsequent patches into its callers.
This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Eric Dumazet cc: "David S. Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/tcp.c | 160 +++---------------------------------------------- 1 file changed, 9 insertions(+), 151 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 77c0c69208a5..7c3acc5673e9 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -971,163 +971,21 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } -static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags, - struct page *page, int offset, size_t *size) -{ - struct sk_buff *skb = tcp_write_queue_tail(sk); - struct tcp_sock *tp = tcp_sk(sk); - bool can_coalesce; - int copy, i; - - if (!skb || (copy = size_goal - skb->len) <= 0 || - !tcp_skb_can_collapse_to(skb)) { -new_segment: - if (!sk_stream_memory_free(sk)) - return NULL; - - skb = tcp_stream_alloc_skb(sk, 0, sk->sk_allocation, - tcp_rtx_and_write_queues_empty(sk)); - if (!skb) - return NULL; - -#ifdef CONFIG_TLS_DEVICE - skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED); -#endif - tcp_skb_entail(sk, skb); - copy = size_goal; - } - - if (copy > *size) - copy = *size; - - i = skb_shinfo(skb)->nr_frags; - can_coalesce = skb_can_coalesce(skb, i, page, offset); - if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) { - tcp_mark_push(tp, skb); - goto new_segment; - } - if (tcp_downgrade_zcopy_pure(sk, skb)) - return NULL; - - copy = tcp_wmem_schedule(sk, copy); - if (!copy) - return NULL; - - if (can_coalesce) { - skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); - } else { - get_page(page); - skb_fill_page_desc_noacc(skb, i, page, offset, copy); - } - - if (!(flags & MSG_NO_SHARED_FRAGS)) - skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; - - skb->len += copy; - skb->data_len += 
copy; - skb->truesize += copy; - sk_wmem_queued_add(sk, copy); - sk_mem_charge(sk, copy); - WRITE_ONCE(tp->write_seq, tp->write_seq + copy); - TCP_SKB_CB(skb)->end_seq += copy; - tcp_skb_pcount_set(skb, 0); - - *size = copy; - return skb; -} - ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct tcp_sock *tp = tcp_sk(sk); - int mss_now, size_goal; - int err; - ssize_t copied; - long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); - - if (IS_ENABLED(CONFIG_DEBUG_VM) && - WARN_ONCE(!sendpage_ok(page), - "page must not be a Slab one and have page_count > 0")) - return -EINVAL; - - /* Wait for a connection to finish. One exception is TCP Fast Open - * (passive side) where data is allowed to be sent before a connection - * is fully established. - */ - if (((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) && - !tcp_passive_fastopen(sk)) { - err = sk_stream_wait_connect(sk, &timeo); - if (err != 0) - goto out_err; - } - - sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk); - - mss_now = tcp_send_mss(sk, &size_goal, flags); - copied = 0; - - err = -EPIPE; - if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) - goto out_err; - - while (size > 0) { - struct sk_buff *skb; - size_t copy = size; - - skb = tcp_build_frag(sk, size_goal, flags, page, offset, &copy); - if (!skb) - goto wait_for_space; - - if (!copied) - TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH; - - copied += copy; - offset += copy; - size -= copy; - if (!size) - goto out; - - if (skb->len < size_goal || (flags & MSG_OOB)) - continue; - - if (forced_push(tp)) { - tcp_mark_push(tp, skb); - __tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_PUSH); - } else if (skb == tcp_send_head(sk)) - tcp_push_one(sk, mss_now); - continue; - -wait_for_space: - set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); - tcp_push(sk, flags & ~MSG_MORE, mss_now, - TCP_NAGLE_PUSH, size_goal); - - err = sk_stream_wait_memory(sk, &timeo); - if (err != 0) - goto do_error; + struct bio_vec bvec; + 
struct msghdr msg = { + .msg_flags = flags | MSG_SPLICE_PAGES, + }; - mss_now = tcp_send_mss(sk, &size_goal, flags); - } + bvec_set_page(&bvec, page, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); -out: - if (copied) { - tcp_tx_timestamp(sk, sk->sk_tsflags); - if (!(flags & MSG_SENDPAGE_NOTLAST)) - tcp_push(sk, flags, mss_now, tp->nonagle, size_goal); - } - return copied; + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; -do_error: - tcp_remove_empty_skb(sk); - if (copied) - goto out; -out_err: - /* make sure we wake any epoll edge trigger waiter */ - if (unlikely(tcp_rtx_and_write_queues_empty(sk) && err == -EAGAIN)) { - sk->sk_write_space(sk); - tcp_chrono_stop(sk, TCP_CHRONO_SNDBUF_LIMITED); - } - return sk_stream_error(sk, flags, err); + return tcp_sendmsg_locked(sk, &msg, size); } EXPORT_SYMBOL_GPL(do_tcp_sendpages); From patchwork Thu Mar 16 15:25:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70850 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp553958wrt; Thu, 16 Mar 2023 08:41:58 -0700 (PDT) X-Google-Smtp-Source: AK7set9qaGa8rDfYtIFLQ3LrBCWu3NAw1yIuQyDwBJlP0F2zX8CkckANX8h9jrxMfzaxjMEGqwA7 X-Received: by 2002:a17:903:22c8:b0:19e:8075:5545 with SMTP id y8-20020a17090322c800b0019e80755545mr4519202plg.54.1678981318625; Thu, 16 Mar 2023 08:41:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678981318; cv=none; d=google.com; s=arc-20160816; b=yE7JAJ8jwbYtvDYpmmHw90AQ57BDOmUX2V8VyuLfzyGUM/27hQR5NqLtRy1HQ67Chh NmYVACXwQRMbzPmjg7d6nc6UQGRWEelW+QQ/kYMlUaw6HiQdQYYCvNiIkgqQ1Vw3mawY +fWXmbXmg3QjHP6Aty+6958HgmyKxGcnTRxkKpwfOVbWDpLOcQxJ93GGiMNjlBNSZZZ8 ZOm2ZomEIt333XWf8arT5TadWbs13N9Vyy+Il/AfN0wEDfqK7ZFJSmoN86fZ/lgimDx/ ud4Z4jhjXfDiTMDNDWvPNFXH+J+eRJYfcS0ro4p00ptDqYmfDM09rYIFH2TfGiqlwiKd +mcg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; 
s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=V4A9+ThbnxeOTamldoFRxBPwB7/WQbFmJXJUgDasL0c=; b=W5Z3QjhGoCUxy7zrzb0IQ1Zm0ueVNPlCFDlKAoZAOZVJ6+qTdl9FVnPT4Iuuvo18nk zucHU2vBoRMAjCiHU8rp/QUhJsZ4x5KrUQuzlPXIcIX3Am59Q2bS4XsO+rg1NlW2foSY 6K+2B7PfAbLh7bZC4F5LOxiSJONpL23lgR30Sows9qcAFwnNvsK/hXnS+kOlR/sX8dIv YSbdPCSXO33GhKD2fsBopYZguhg8vllRtrOQC7r19JpOXx/ahA92Jnpnj3EFiptXDg18 8LPliMyOifUP/DZtUJeQyg9sMV9/7APJZ13jLbk/6bG+2w2pQoUtifPVIQchVXj7BO7m 0LRg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XDoeGy+w; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id c3-20020a170902d48300b001a19536c6b2si1832472plg.103.2023.03.16.08.41.43; Thu, 16 Mar 2023 08:41:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XDoeGy+w; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231407AbjCPP2A (ORCPT + 99 others); Thu, 16 Mar 2023 11:28:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58262 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229530AbjCPP16 (ORCPT ); Thu, 16 Mar 2023 11:27:58 -0400 Received: from 
us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 839F5C154 for ; Thu, 16 Mar 2023 08:26:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678980399; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=V4A9+ThbnxeOTamldoFRxBPwB7/WQbFmJXJUgDasL0c=; b=XDoeGy+wBqXj7kfaKbpwycMT3pRBq4SXc55ose9sxxGmSlDiiBOFBx6UwEqu0K/xLlxzqg 0ktDbKlmI6o8760ftGakNL9Io7juJ69pkgZ+A+8nWe6A9ZvIOwSXMlpfchygAkrb4BI8sF 5yWfnb6h7bicr2xU5JyCUjGgEfufjaE= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-592-b1SC9PDcPYSYvZwVDt6M2w-1; Thu, 16 Mar 2023 11:26:37 -0400 X-MC-Unique: b1SC9PDcPYSYvZwVDt6M2w-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 31C5638149B9; Thu, 16 Mar 2023 15:26:36 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1CF1E1121315; Thu, 16 Mar 2023 15:26:34 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Fastabend , Jakub Sitnicki , bpf@vger.kernel.org Subject: [RFC PATCH 05/28] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg Date: Thu, 16 Mar 2023 15:25:55 +0000 Message-Id: <20230316152618.711970-6-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539514831018782?= X-GMAIL-MSGID: =?utf-8?q?1760539514831018782?= do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: John Fastabend cc: Jakub Sitnicki cc: "David S. 
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org cc: bpf@vger.kernel.org --- net/ipv4/tcp_bpf.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index cf26d65ca389..7f17134637eb 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -72,11 +72,13 @@ static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes, { bool apply = apply_bytes; struct scatterlist *sge; + struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, }; struct page *page; int size, ret = 0; u32 off; while (1) { + struct bio_vec bvec; bool has_tx_ulp; sge = sk_msg_elem(msg, msg->sg.start); @@ -88,16 +90,18 @@ static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes, tcp_rate_check_app_limited(sk); retry: has_tx_ulp = tls_sw_has_ctx_tx(sk); - if (has_tx_ulp) { - flags |= MSG_SENDPAGE_NOPOLICY; - ret = kernel_sendpage_locked(sk, - page, off, size, flags); - } else { - ret = do_tcp_sendpages(sk, page, off, size, flags); - } + if (has_tx_ulp) + msghdr.msg_flags |= MSG_SENDPAGE_NOPOLICY; + if (flags & MSG_SENDPAGE_NOTLAST) + msghdr.msg_flags |= MSG_MORE; + + bvec_set_page(&bvec, page, size, off); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size); + ret = tcp_sendmsg_locked(sk, &msghdr, size); if (ret <= 0) return ret; + if (apply) apply_bytes -= ret; msg->sg.size -= ret; @@ -398,7 +402,7 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) long timeo; int flags; - /* Don't let internal do_tcp_sendpages() flags through */ + /* Don't let internal sendpage flags through */ flags = (msg->msg_flags & ~MSG_SENDPAGE_DECRYPTED); flags |= MSG_NO_SHARED_FRAGS; From patchwork Thu Mar 16 15:25:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70862 Return-Path: Delivered-To: 
ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp554697wrt; Thu, 16 Mar 2023 08:43:28 -0700 (PDT) X-Google-Smtp-Source: AK7set92AcRjfBiGnHpruSkY9sGz93Ak9xV+fG9bO6Y4OUXPK5t3TBtC2ByTC21B//XQrR0wAdzk X-Received: by 2002:a05:6a20:b298:b0:cf:71ee:6326 with SMTP id ei24-20020a056a20b29800b000cf71ee6326mr4335565pzb.5.1678981408186; Thu, 16 Mar 2023 08:43:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678981408; cv=none; d=google.com; s=arc-20160816; b=CnsX2R/MJFHyEUrJlERjkH1BBGF5NgRfAQ4Dy7m6e8UUWU01aIzTcCLUv/7ONi+92u KmCJ4Wqk8BvHEJQiZVfTbGAITa4eM31o0O0JWueyQO378LHa8O77dfAyYri9cyt8D7M1 YTqrkCHbUPeu/gCY67s+ZuOGGKR9DSzVqVYD5LoKVRRSjFlQEbusBn6RoGEYCHMYJalo D631q9BmgiJsVLiT9jTkeCLSe8nfLzcsaCFAPqoXlG2OWmPK1zsxh4y8WHEL6d+UyFCn Oqk8wKesYCeuOIFiBkW9PDh2Nr6fDLIinT7G7VkkC6fGZyGoWCEgh1CYZx/9LDX+XrIL YOpA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=DjALLcCTAkJ9GFGzQm0mxszTSA2UBcCYteYf8nsJBJs=; b=uTjDCNASvXcY+BopRPsgm/a5PL8NZz+ezO7bbQo70FvZ/RbPUKxjnAAUA/1kd72vpz FzE6EgO3s9itm5ROEspZWtD6rn8keRlXLSyTMH4lIeRB81TJDaXFb/DBi/W8R6bGemXE r/H+4wKStlM+wiiTO9lpRC0BiiGsXexOMaWr2ACilBDfKSeRY4CU8K3lrgPX8dnsjk2F 594mEENcnNQ2uM0TKYnoXyvrOEuGhpW2Vy6fFUmZR3hL+/8YyBj77IJh5HzzwxhgV98N YrxCmQfzuEnoRuE3K8dJJ7JBfoR92IOZjaF9z57XDhv5UlllzLgnStJcZmdukCBKhXRN yDvQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=YibKidY1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id j13-20020a63230d000000b004fba03ee686si8198270pgj.202.2023.03.16.08.43.13; Thu, 16 Mar 2023 08:43:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=YibKidY1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231504AbjCPP22 (ORCPT + 99 others); Thu, 16 Mar 2023 11:28:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58904 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231449AbjCPP2R (ORCPT ); Thu, 16 Mar 2023 11:28:17 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1992EA3B72 for ; Thu, 16 Mar 2023 08:26:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678980403; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DjALLcCTAkJ9GFGzQm0mxszTSA2UBcCYteYf8nsJBJs=; b=YibKidY1Dmv7PHLzBMqh1fCh7tjC8dHSN/r4N98+33o/2stv/b20iVLWpKBCjDF1WgsHVY n2zJ0Z0Q7/p557yb7y5L4872pJh8+XTB5cc7FJl0/8pDIIS5rbOhD/4EFPwIdxewE4UM6O EJ56ufjvGNhoIwcUyilsByyvE91grtU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
us-mta-403-UpLocroTPSOLnFgApGw-Ag-1; Thu, 16 Mar 2023 11:26:40 -0400 X-MC-Unique: UpLocroTPSOLnFgApGw-Ag-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id ECF4C858289; Thu, 16 Mar 2023 15:26:38 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id D7E9940D1C8; Thu, 16 Mar 2023 15:26:36 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Steffen Klassert , Herbert Xu Subject: [RFC PATCH 06/28] espintcp: Inline do_tcp_sendpages() Date: Thu, 16 Mar 2023 15:25:56 +0000 Message-Id: <20230316152618.711970-7-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539609140482133?= X-GMAIL-MSGID: =?utf-8?q?1760539609140482133?= do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. 
This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: Steffen Klassert cc: Herbert Xu cc: Eric Dumazet cc: "David S. Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/xfrm/espintcp.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c index 872b80188e83..3504925babdb 100644 --- a/net/xfrm/espintcp.c +++ b/net/xfrm/espintcp.c @@ -205,14 +205,16 @@ static int espintcp_sendskb_locked(struct sock *sk, struct espintcp_msg *emsg, static int espintcp_sendskmsg_locked(struct sock *sk, struct espintcp_msg *emsg, int flags) { + struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, }; struct sk_msg *skmsg = &emsg->skmsg; struct scatterlist *sg; int done = 0; int ret; - flags |= MSG_SENDPAGE_NOTLAST; + msghdr.msg_flags |= MSG_SENDPAGE_NOTLAST; sg = &skmsg->sg.data[skmsg->sg.start]; do { + struct bio_vec bvec; size_t size = sg->length - emsg->offset; int offset = sg->offset + emsg->offset; struct page *p; @@ -220,11 +222,13 @@ static int espintcp_sendskmsg_locked(struct sock *sk, emsg->offset = 0; if (sg_is_last(sg)) - flags &= ~MSG_SENDPAGE_NOTLAST; + msghdr.msg_flags &= ~MSG_SENDPAGE_NOTLAST; p = sg_page(sg); retry: - ret = do_tcp_sendpages(sk, p, offset, size, flags); + bvec_set_page(&bvec, p, size, offset); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size); + ret = tcp_sendmsg_locked(sk, &msghdr, size); if (ret < 0) { emsg->offset = offset - sg->offset; skmsg->sg.start += done; From patchwork Thu Mar 16 15:25:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70858 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp554563wrt; Thu, 16 Mar 2023 08:43:14 -0700 (PDT) X-Google-Smtp-Source: 
AK7set8zciCDwtT0/VvDLCe86yli1wKEi1bRrZFA8j2/gZaY+5EwIAu2KAetrC1mhaHXLDvUMg9m X-Received: by 2002:a17:90b:4c4d:b0:237:d2b0:dac6 with SMTP id np13-20020a17090b4c4d00b00237d2b0dac6mr4221070pjb.33.1678981393972; Thu, 16 Mar 2023 08:43:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678981393; cv=none; d=google.com; s=arc-20160816; b=vm7Rb7REvJcT+ORnJUAjdsaEqyJMEYAN9Z6/9Q9zpqosqY2XIDhuSkSWGfamQz1GdC vPXAOVboEYodGt857YBOv74N4i4ce93BshYrrIJgr3lsqZj12Sb67SHQ+PNOkxzg7kBR 9r5T8PbyJYYn/pokWRDn7fs1Hc5KNhGGyMAHQg61Hiw24FiW7WG/pb1kWiLq7fYXXGoc kWFtuaVgYhov4N8zdj/m0Z8ftkPqFFyqoaFVxQst6aiu2j0Hfkkxm8uBbWabH7ul1H58 AfuXSRZs5svdybvWDya9lgi4KH5tOQPRiekn0dL2csSF/u7FLjr18b9EqLgTidjF4dTr uSjg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=OLS5n4UNczR7Lntud/yq4c19A8akH99w6SN5hz2uxtc=; b=wat/zKBR7VzF1iVa+W+n210eV4cRpvigXyuiU+xxJqCOzsQm3uJnD5V+u2lHUm5th4 OUazVdg4NfzR4MC3ugTA++NJg5WfJO19EkdJ7eOTRxW6X6Q6ZbIp2gylpizRcU15mpoT knCqDS9YC3fDZHPn5CJ083GGjjTp+VkBEsrICbLW/daz7okUYpIhPKL/+Lfm6fBRNs9A h0Qozap0ch1v93+S/MFnxRKAeG58Xfmj2A8Q/60pFV706/q4lGyk+aJVSlu+XaVdJnMc aZ8RlR6HlyhBUeEyazqBkwtqlaJ50OL/Uj8CCxEzdmkl22B0h3SHCLdtSwt3hCCapOS5 wLQA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Yy23REOz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id o16-20020a17090ad25000b0023747b030e7si4939537pjw.105.2023.03.16.08.42.58; Thu, 16 Mar 2023 08:43:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Yy23REOz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231501AbjCPP2r (ORCPT + 99 others); Thu, 16 Mar 2023 11:28:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231469AbjCPP2Y (ORCPT ); Thu, 16 Mar 2023 11:28:24 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85517A908B for ; Thu, 16 Mar 2023 08:26:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678980405; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OLS5n4UNczR7Lntud/yq4c19A8akH99w6SN5hz2uxtc=; b=Yy23REOzk/zUtnhH9r9gghJd3JzS57YD0D9/CztHsqFJ863LgIQggSppo6aZ2/3xugSuVb ZHSCoWlIf+h2ZF8F2vvwctRgIHYKG3IJlzcEL8oe5qvJ13NCH/4SeNmMrywoRAcSe+uKPN OFxA8bkNbHmSgnVsk/T1L6mOLDqlCPs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
us-mta-391-UMelVK58POioH5w_UJM1sw-1; Thu, 16 Mar 2023 11:26:42 -0400 X-MC-Unique: UMelVK58POioH5w_UJM1sw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7F1BE811E7E; Thu, 16 Mar 2023 15:26:41 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 871AD40C6E67; Thu, 16 Mar 2023 15:26:39 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Boris Pismenny , John Fastabend Subject: [RFC PATCH 07/28] tls: Inline do_tcp_sendpages() Date: Thu, 16 Mar 2023 15:25:57 +0000 Message-Id: <20230316152618.711970-8-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539594217742855?= X-GMAIL-MSGID: =?utf-8?q?1760539594217742855?= do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. 
This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: Boris Pismenny cc: John Fastabend cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/net/tls.h | 2 +- net/tls/tls_main.c | 24 +++++++++++++++--------- 2 files changed, 16 insertions(+), 10 deletions(-) diff --git a/include/net/tls.h b/include/net/tls.h index 154949c7b0c8..d31521c36a84 100644 --- a/include/net/tls.h +++ b/include/net/tls.h @@ -256,7 +256,7 @@ struct tls_context { struct scatterlist *partially_sent_record; u16 partially_sent_offset; - bool in_tcp_sendpages; + bool splicing_pages; bool pending_open_record_frags; struct mutex tx_lock; /* protects partially_sent_* fields and diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index 3735cb00905d..8802b4f8b652 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -124,7 +124,10 @@ int tls_push_sg(struct sock *sk, u16 first_offset, int flags) { - int sendpage_flags = flags | MSG_SENDPAGE_NOTLAST; + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = flags | MSG_SPLICE_PAGES | MSG_SENDPAGE_NOTLAST, + }; int ret = 0; struct page *p; size_t size; @@ -133,16 +136,19 @@ int tls_push_sg(struct sock *sk, size = sg->length - offset; offset += sg->offset; - ctx->in_tcp_sendpages = true; + ctx->splicing_pages = true; while (1) { if (sg_is_last(sg)) - sendpage_flags = flags; + msg.msg_flags = flags | MSG_SPLICE_PAGES; /* is sending application-limited? 
*/ tcp_rate_check_app_limited(sk); p = sg_page(sg); retry: - ret = do_tcp_sendpages(sk, p, offset, size, sendpage_flags); + bvec_set_page(&bvec, p, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + + ret = tcp_sendmsg_locked(sk, &msg, size); if (ret != size) { if (ret > 0) { @@ -154,7 +160,7 @@ int tls_push_sg(struct sock *sk, offset -= sg->offset; ctx->partially_sent_offset = offset; ctx->partially_sent_record = (void *)sg; - ctx->in_tcp_sendpages = false; + ctx->splicing_pages = false; return ret; } @@ -168,7 +174,7 @@ int tls_push_sg(struct sock *sk, size = sg->length; } - ctx->in_tcp_sendpages = false; + ctx->splicing_pages = false; return 0; } @@ -246,11 +252,11 @@ static void tls_write_space(struct sock *sk) { struct tls_context *ctx = tls_get_ctx(sk); - /* If in_tcp_sendpages call lower protocol write space handler + /* If splicing_pages call lower protocol write space handler * to ensure we wake up any waiting operations there. For example - * if do_tcp_sendpages where to call sk_wait_event. + * if splicing pages were to call sk_wait_event. 
*/ - if (ctx->in_tcp_sendpages) { + if (ctx->splicing_pages) { ctx->sk_write_space(sk); return; } From patchwork Thu Mar 16 15:25:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70866 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp554976wrt; Thu, 16 Mar 2023 08:43:58 -0700 (PDT) X-Google-Smtp-Source: AK7set93OXiaLxHZ0bcEl/WUCzYcptSsZG3DMTIwHOL7MsSZPNXxrQXB6rFb7iXmtUsMpZzrbAMH X-Received: by 2002:a17:902:e541:b0:19d:1ffd:148d with SMTP id n1-20020a170902e54100b0019d1ffd148dmr4847492plf.46.1678981437793; Thu, 16 Mar 2023 08:43:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678981437; cv=none; d=google.com; s=arc-20160816; b=CCVVjAJPS2AXA4jIZk8WJFSjivh2ddzkPzjM9FNjMdMux4ZDUp9T2wMHtBqFX7nzyV zSeCQw9uNK0vuQorTlU7PC4PnTLYZFIzaX3MPKI/YmvlP0noztsuQLadIq6lbQoxFwfJ 3+jHg3jzIQwExYEKwkrIGx2a2q8Za7CrCbXFWcyfNFpcTW8+wJ867FkiYH5H0Dq9HNYd iYP3CxF40ufFLxRG5k059qFr435uSA82c/8ty5KsnS2qv2wzWthrGY6D5zFak6SERJ3U fjmph1fJYgXF3/C5uxPOZZuCBPiqC8sw2IE8sLeQPjBTmKXh6pDRkww5zS94QO375bSN tDbQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=2QKge+op/Hl6BMrdVZzEnNbF4IaGzItNcXZtJUNSCtc=; b=ILxnpGZ3bwtHvVWjKV1VdxjdDym+26QC3QN2cQmfGeNavzdTROfRi6BTEvy0oleB2D b+OraMTfiYIi4MtCTCdApr/ms0tW8AhlN3wKkiEbvWoxNcResMShjxqHjBPR5xIZvpQR Ciz+zi0J77CAzxFBx2ny5zxEVBGZxf/6o4LT3EfVNBzoed/KPp+qhGueZAGKsmuii8fN aXlYMSGTBiuX/f3qJX90nkGxtimBMsousuzQ6NgCcW8M4juHfy+PFMXx8ABlJTJOSx0v lDFoRZQ3PGsEB2P7SU5xvXEotMBaJ2B3EaMT+bvXbCVBGoXqiZSpfTPqIyCL1Rc7rjo4 GYIQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=gCpB4h4a; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as 
permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id lk15-20020a17090308cf00b0019cf1bde932si7823370plb.35.2023.03.16.08.43.42; Thu, 16 Mar 2023 08:43:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=gCpB4h4a; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231495AbjCPP2o (ORCPT + 99 others); Thu, 16 Mar 2023 11:28:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58370 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231475AbjCPP2Y (ORCPT ); Thu, 16 Mar 2023 11:28:24 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2464EAD00A for ; Thu, 16 Mar 2023 08:26:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678980410; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2QKge+op/Hl6BMrdVZzEnNbF4IaGzItNcXZtJUNSCtc=; b=gCpB4h4a6oL6mfboDOqIbv/FxmM/tkL5cP/ln9iFa5/mlXHAxkGDzfywrOE5qgiSglp7Xj JIfRvxs81Y+oWUEEnvrIlmgtOgTo3Nug4dnXCVSR46H1FlkDv1PtvfQj6oW05JJHCj9+aN HA69xtuTk9NBC/RfH0MfQIXjE38iwko= Received: from mimecast-mx02.redhat.com 
(mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-632-75WtpKLgNty4TtjE6A487g-1; Thu, 16 Mar 2023 11:26:45 -0400 X-MC-Unique: 75WtpKLgNty4TtjE6A487g-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4412F858F09; Thu, 16 Mar 2023 15:26:44 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 319582166B26; Thu, 16 Mar 2023 15:26:42 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Bernard Metzler , Tom Talpey , linux-rdma@vger.kernel.org Subject: [RFC PATCH 08/28] siw: Inline do_tcp_sendpages() Date: Thu, 16 Mar 2023 15:25:58 +0000 Message-Id: <20230316152618.711970-9-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539640170636250?= X-GMAIL-MSGID: =?utf-8?q?1760539640170636250?= do_tcp_sendpages() is now 
just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: Bernard Metzler cc: Tom Talpey cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-rdma@vger.kernel.org cc: netdev@vger.kernel.org --- drivers/infiniband/sw/siw/siw_qp_tx.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c index 05052b49107f..8fc179321e2b 100644 --- a/drivers/infiniband/sw/siw/siw_qp_tx.c +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c @@ -313,7 +313,7 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s, } /* - * 0copy TCP transmit interface: Use do_tcp_sendpages. + * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES. * * Using sendpage to push page by page appears to be less efficient * than using sendmsg, even if data are copied. 
@@ -324,20 +324,27 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s, static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset, size_t size) { + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = (MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT | + MSG_SENDPAGE_NOTLAST), + }; struct sock *sk = s->sk; - int i = 0, rv = 0, sent = 0, - flags = MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST; + int i = 0, rv = 0, sent = 0; while (size) { size_t bytes = min_t(size_t, PAGE_SIZE - offset, size); if (size + offset <= PAGE_SIZE) - flags = MSG_MORE | MSG_DONTWAIT; + msg.msg_flags = MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT; tcp_rate_check_app_limited(sk); + bvec_set_page(&bvec, page[i], bytes, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + try_page_again: lock_sock(sk); - rv = do_tcp_sendpages(sk, page[i], offset, bytes, flags); + rv = tcp_sendmsg_locked(sk, &msg, size); release_sock(sk); if (rv > 0) { From patchwork Thu Mar 16 15:25:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70846 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp553459wrt; Thu, 16 Mar 2023 08:41:05 -0700 (PDT) X-Google-Smtp-Source: AK7set/pDmUOiQfo7qDXtXtoCSaLaXni4+sn90gNnWSn8PHuT/c9l1MviZpuI93QOuuTsHgGJhPm X-Received: by 2002:a17:902:ea12:b0:1a1:936d:84bd with SMTP id s18-20020a170902ea1200b001a1936d84bdmr2939505plg.24.1678981265370; Thu, 16 Mar 2023 08:41:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678981265; cv=none; d=google.com; s=arc-20160816; b=MSVwKuedxnRq/dqZrm+cNwubnLD9R+Pzzxcb2nBbTW5DGZYlonhKUvws4DtPysN5WU 96AG0yitk8KCLJ/spDclt37+mrNeRIpLFVR9yx0AJuuVBO9NTUJDMUo6dS58BHGdrjZ6 FUiG2FTahHbxe84RYqYa+Wdioyt9hbSyAkE0PcmKRdZ0tqTsBe2YVT1B6aII0nIur7Fd RYOuaaevHEgbx8bJSbYRPW2qjmEaOBV5Ht72rA5jHqtuwpaf7NwYQi38BHYHxKgHesql 
/Ve8SveQWUKNU2r7+VmIPZtJ0Z1UNmjgmKBRT4aEVuuQ/kJKWuKWqwtLrPOcsJmS5iqF kp+w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=QhIOxGsJKwUbwgq8gdEm1+0LSvzfey76e2D0sxuOMBQ=; b=PoExnxjLUSImNOAXlt/iKyZlnDupgkpXFmlxx3vW84NvFbbg5Jd2ROBPsqEsAO3ptw yFGOF0001ML12Ryf3Ds0mc7uk7i11ehEOtEoNJHOzTBECFyOkwa/n/bP3CR9oKD/QrMV 2+/LCAZnQOLFmwup63sg4D6JNlKl8M1ZDLCrxbQRBlJlHnEKHXcYHSCVTONN1fyEMAlI Nik+dzjH+sHIDRqobrKuesS0jkmaDP7Uhfh/x+YS3Kq0hzG3zU6Bc+/jHhGPHr0FIGwh xW5WcEM5hJTGhjCmP5+7E0KKAMCoEHDqgNpXJs9NIPRzKx6Vw8YH/xsCZvZjq4p5Zwqv flXg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=br8hbZZz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id l16-20020a170902f69000b0019d19fb5606si8844832plg.551.2023.03.16.08.40.49; Thu, 16 Mar 2023 08:41:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=br8hbZZz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229530AbjCPP2l (ORCPT + 99 others); Thu, 16 Mar 2023 11:28:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58344 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229638AbjCPP2Z (ORCPT ); Thu, 16 Mar 2023 11:28:25 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D21F9AE136 for ; Thu, 16 Mar 2023 08:26:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678980410; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QhIOxGsJKwUbwgq8gdEm1+0LSvzfey76e2D0sxuOMBQ=; b=br8hbZZzzY82kH2/RouOE16J6j4vV21yHHzkYv3DE+TqQdkvYAoQxZ/xiAHigukLJYeHoZ 3Oj6ZlaKLZbx9JjIKEBzlbTjz4RtG7hvdw9FtJgJZDFwKYIPlLpZqUnkc37DShE6I7tnMm +RqGEJlAb76ESv2h3KGS+XWTKq+uMGk= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
us-mta-393-LNOwTe9INm-jM2IkB3Pl_A-1; Thu, 16 Mar 2023 11:26:47 -0400 X-MC-Unique: LNOwTe9INm-jM2IkB3Pl_A-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A44A8811E7D; Thu, 16 Mar 2023 15:26:46 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id DA02540D1C7; Thu, 16 Mar 2023 15:26:44 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [RFC PATCH 09/28] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() Date: Thu, 16 Mar 2023 15:25:59 +0000 Message-Id: <20230316152618.711970-10-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539459268399714?= X-GMAIL-MSGID: =?utf-8?q?1760539459268399714?= Fold do_tcp_sendpages() into its last remaining caller, tcp_sendpage_locked(). Signed-off-by: David Howells cc: Eric Dumazet cc: "David S. 
Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/net/tcp.h | 2 -- net/ipv4/tcp.c | 21 +++++++-------------- 2 files changed, 7 insertions(+), 16 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index db9f828e9d1e..844bc8e6a714 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -333,8 +333,6 @@ int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags); int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, size_t size, int flags); -ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, - size_t size, int flags); int tcp_send_mss(struct sock *sk, int *size_goal, int flags); void tcp_push(struct sock *sk, int flags, int mss_now, int nonagle, int size_goal); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 7c3acc5673e9..f1454e4497df 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -971,14 +971,19 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } -ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, - size_t size, int flags) +int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, + size_t size, int flags) { struct bio_vec bvec; struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; + if (!(sk->sk_route_caps & NETIF_F_SG)) + return sock_no_sendpage_locked(sk, page, offset, size, flags); + + tcp_rate_check_app_limited(sk); /* is sending application-limited? 
*/ + bvec_set_page(&bvec, page, size, offset); iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); @@ -987,18 +992,6 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, return tcp_sendmsg_locked(sk, &msg, size); } -EXPORT_SYMBOL_GPL(do_tcp_sendpages); - -int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - if (!(sk->sk_route_caps & NETIF_F_SG)) - return sock_no_sendpage_locked(sk, page, offset, size, flags); - - tcp_rate_check_app_limited(sk); /* is sending application-limited? */ - - return do_tcp_sendpages(sk, page, offset, size, flags); -} EXPORT_SYMBOL_GPL(tcp_sendpage_locked); int tcp_sendpage(struct sock *sk, struct page *page, int offset, From patchwork Thu Mar 16 15:26:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70835 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp547447wrt; Thu, 16 Mar 2023 08:30:20 -0700 (PDT) X-Google-Smtp-Source: AK7set/x4wLeV3IJxIISvo7H5/hPra84FCCy29V7jiX8OFcwSVmoZfPmdHm0boLyq9IdgDFPyPvU X-Received: by 2002:a17:902:d4c8:b0:19d:137c:2ad2 with SMTP id o8-20020a170902d4c800b0019d137c2ad2mr4325906plg.52.1678980619866; Thu, 16 Mar 2023 08:30:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678980619; cv=none; d=google.com; s=arc-20160816; b=pojwVs8MuHsgoRyyitZACHWo/9GOInHPifrCGPzPmK0lDO3A7mPXzEVQmkf5HcCtnJ XhSVh7q86N4pzIuWV6ZJJn5mTmEix9Cpm1pywReSZ95qbuEfMXgmjO4ZVkdOPPPPi5Cz svVcbHdR9N2ffcJtrf9YNYFXSapybLc8Jm9+rMe46lDPK4SFxxI3G32IMUdxIx2Ilf5z 0QVbJ8qKQ8bByRzVe6NhHy/Y+7Q2lJbq4p02QQDF6xbYmuj7FeJgZ6nPtkgDYW9esArv fWXvJ0EqGfTuJDuZj2EBUmRz7JGZJ0mKxJx07WSR85zS47txtlUQDFkCB0WHSJYcQYOF SFQw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from 
:dkim-signature; bh=GYA122zSJc6JqJKbnjvsKVO49kEubsqnJ80XNajw+5k=; b=xcu4om2hmc2DYXwUA5HphM8AaBz1EbTVt4CuVUbm6A9A31En0ae+z2tlTYTeKTykVy IoNxgWE0o9TtnJArMxANf3lJjur5J1f67OrJuFuRkLmyAY5BBaOWOKuJRbMniK8RE2uI W5QETRekRSK8FboBQwjYAyVGVQshf782bPlBYmTim/NUpP9q5qXmCxzrAlGM4LDi9+zd 3gBpVoSx4O/55ooCcb5IUt5wDFdbQEn2tR7/V6ZZuRDOWtKK07h8N6UhArvE2UDIbasC +fkVnG7gnIIFgI5NnpvH0JfJvOcSuIELUgxAIIMhUb5aihKRBHggfNFYa0gdtxettf/U IQBQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=WhIuDVOR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l1-20020a170903244100b001990aae7572si8631411pls.294.2023.03.16.08.30.04; Thu, 16 Mar 2023 08:30:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=WhIuDVOR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231548AbjCPP2f (ORCPT + 99 others); Thu, 16 Mar 2023 11:28:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59516 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231434AbjCPP2W (ORCPT ); Thu, 16 Mar 2023 11:28:22 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS 
id A1F1DB1B31 for ; Thu, 16 Mar 2023 08:26:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678980416; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GYA122zSJc6JqJKbnjvsKVO49kEubsqnJ80XNajw+5k=; b=WhIuDVORFme/LHZmCDoSGrk5RXfzkpdt3biKZK6sxLDrdJHYeSzH7wDW4yKdv8qPKVyJcb GYjfhC7DSWFFgqLUqCb29LWZlpw+JYiUrWyMdnYjIaPJw8bMVIQ7cUERnPlyxcNKZ8TcUA evdgerRAYGcMoeRzlppSPOXY7mATRKA= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-489-LkkusC7mPii-DXgerJyhEQ-1; Thu, 16 Mar 2023 11:26:52 -0400 X-MC-Unique: LkkusC7mPii-DXgerJyhEQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 431AA101A553; Thu, 16 Mar 2023 15:26:49 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5D14C1121315; Thu, 16 Mar 2023 15:26:47 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Willem de Bruijn Subject: [RFC PATCH 10/28] ip, udp: Support MSG_SPLICE_PAGES Date: Thu, 16 Mar 2023 15:26:00 +0000 Message-Id: <20230316152618.711970-11-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760538782773667660?= X-GMAIL-MSGID: =?utf-8?q?1760538782773667660?= Make IP/UDP sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible (the iterator must be ITER_BVEC and the pages must be spliceable). This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. 
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/ip_output.c | 89 ++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 86 insertions(+), 3 deletions(-) diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 4e4e308c3230..721d7e4343ed 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -977,7 +977,7 @@ static int __ip_append_data(struct sock *sk, int err; int offset = 0; bool zc = false; - unsigned int maxfraglen, fragheaderlen, maxnonfragsize; + unsigned int maxfraglen, fragheaderlen, maxnonfragsize, xlength; int csummode = CHECKSUM_NONE; struct rtable *rt = (struct rtable *)cork->dst; unsigned int wmem_alloc_delta = 0; @@ -1017,6 +1017,7 @@ static int __ip_append_data(struct sock *sk, (!exthdrlen || (rt->dst.dev->features & NETIF_F_HW_ESP_TX_CSUM))) csummode = CHECKSUM_PARTIAL; + xlength = length; if ((flags & MSG_ZEROCOPY) && length) { struct msghdr *msg = from; @@ -1047,6 +1048,16 @@ static int __ip_append_data(struct sock *sk, skb_zcopy_set(skb, uarg, &extra_uref); } } + } else if ((flags & MSG_SPLICE_PAGES) && length) { + struct msghdr *msg = from; + + if (!iov_iter_is_bvec(&msg->msg_iter)) + return -EINVAL; + if (inet->hdrincl) + return -EPERM; + if (!(rt->dst.dev->features & NETIF_F_SG)) + return -EOPNOTSUPP; + xlength = transhdrlen; /* We need an empty buffer to attach stuff to */ } cork->length += length; @@ -1074,6 +1085,50 @@ static int __ip_append_data(struct sock *sk, unsigned int alloclen, alloc_extra; unsigned int pagedlen; struct sk_buff *skb_prev; + + if (unlikely(flags & MSG_SPLICE_PAGES)) { + skb_prev = skb; + fraggap = skb_prev->len - maxfraglen; + + alloclen = fragheaderlen + hh_len + fraggap + 15; + skb = sock_wmalloc(sk, alloclen, 1, sk->sk_allocation); + if (unlikely(!skb)) { + err = -ENOBUFS; + goto error; + } + + /* + * Fill in the control structures + */ + skb->ip_summed = CHECKSUM_NONE; + skb->csum = 0; + skb_reserve(skb, 
hh_len); + + /* + * Find where to start putting bytes. + */ + skb_put(skb, fragheaderlen + fraggap); + skb_reset_network_header(skb); + skb->transport_header = (skb->network_header + + fragheaderlen); + if (fraggap) { + skb->csum = skb_copy_and_csum_bits( + skb_prev, maxfraglen, + skb_transport_header(skb), + fraggap); + skb_prev->csum = csum_sub(skb_prev->csum, + skb->csum); + pskb_trim_unique(skb_prev, maxfraglen); + } + + /* + * Put the packet on the pending queue. + */ + __skb_queue_tail(&sk->sk_write_queue, skb); + continue; + } + xlength = length; + alloc_new_skb: skb_prev = skb; if (skb_prev) @@ -1085,7 +1140,7 @@ static int __ip_append_data(struct sock *sk, * If remaining data exceeds the mtu, * we know we need more fragment(s). */ - datalen = length + fraggap; + datalen = xlength + fraggap; if (datalen > mtu - fragheaderlen) datalen = maxfraglen - fragheaderlen; fraglen = datalen + fragheaderlen; @@ -1099,7 +1154,7 @@ static int __ip_append_data(struct sock *sk, * because we have no idea what fragment will be * the last. 
*/ - if (datalen == length + fraggap) + if (datalen == xlength + fraggap) alloc_extra += rt->dst.trailer_len; if ((flags & MSG_MORE) && @@ -1206,6 +1261,34 @@ static int __ip_append_data(struct sock *sk, err = -EFAULT; goto error; } + } else if (flags & MSG_SPLICE_PAGES) { + struct msghdr *msg = from; + struct iov_iter *iter = &msg->msg_iter; + const struct bio_vec *bv = iter->bvec; + + if (iov_iter_count(iter) <= 0) { + err = -EIO; + goto error; + } + + copy = iov_iter_single_seg_count(&msg->msg_iter); + + err = skb_append_pagefrags(skb, bv->bv_page, + bv->bv_offset + iter->iov_offset, + copy); + if (err < 0) + goto error; + + if (skb->ip_summed == CHECKSUM_NONE) { + __wsum csum; + csum = csum_page(bv->bv_page, + bv->bv_offset + iter->iov_offset, copy); + skb->csum = csum_block_add(skb->csum, csum, skb->len); + } + + iov_iter_advance(iter, copy); + skb_len_add(skb, copy); + refcount_add(copy, &sk->sk_wmem_alloc); } else if (!zc) { int i = skb_shinfo(skb)->nr_frags; From patchwork Thu Mar 16 15:26:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70842 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp552610wrt; Thu, 16 Mar 2023 08:39:14 -0700 (PDT) X-Google-Smtp-Source: AK7set9EWnJD9i9CBhJRLnL4lkcVsLnJ7tknPDPSK82V30ENuNLup1l5yyfHGfxaPhxy6i0dWkfy X-Received: by 2002:a05:6a20:bf19:b0:d3:d236:f5b7 with SMTP id gc25-20020a056a20bf1900b000d3d236f5b7mr3924477pzb.26.1678981154487; Thu, 16 Mar 2023 08:39:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678981154; cv=none; d=google.com; s=arc-20160816; b=cNLzpEbSwmxG4UANY8jSGT0Ca0F9vSoyBBEfvbVwIkV6KvqqVGreP2lMCoEyziiF0Q 0ptx2l9PHm6YaRpgPv7ZLvyCtxoS3bJ0HGrAphl/2VRRmtkWbX1zoAPMvLC9ccLc95ig ZLc9Es8Eo0U7wriImKjo9Wcc3WzPFcQ/xaa+yGhjJKmSfHB5iEQ8GKrTX5c6K5s1tFNw /Cr7NpZ9Mu4ynCoXPnVyjMQQ4ICIcCYPqVA5GEdhXbfWZKRaDIRWZyJJWOErKK5MIlKY 
z29+ttBmDu/ldW8wvp4FI3ipJAOcPzNqfUljUgcyPDdoM/T+qejmEUAS7unWmkSkF/63 9GDw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=5bgiBObt8qBauZcUWYsV1T0v3n3OqeoexntQtIJIREA=; b=MJIgTrr5yzQukuFxk2e6I8w8rpdPzp4HqJ17A9IUmfDYPI3KzXklyZ5GlLRtGh1vIT k3LM+0FBIup47A/POKcN1OMMFIp8sq3XKHjLD9mBykkLEZo4xLL7bcc9mOVZZJpVoFTa ZrptY7Mzpsl5UdaC0YVDwsSG0SU9lPGrnUXeYuWARbw/2J8KUmuH/vLv3rjgdSh/av2J nkZ2Ot1N/EdmgZ1p8ewbeo5cemzfkZBQX/0XhIV78QD5GSIGzhN2sewcvIiFdml+lRSH RUNxHfDc9WGkb1Ns5bcKtIGg6879RnqajRxhLBSSUg/wQZNSNx0tIzD/kDZEWwZgyIfy 2bjA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="YKsn3kU/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id f11-20020aa7968b000000b00572f3be59f6si8310683pfk.136.2023.03.16.08.39.02; Thu, 16 Mar 2023 08:39:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="YKsn3kU/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231489AbjCPP2i (ORCPT + 99 others); Thu, 16 Mar 2023 11:28:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59518 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231474AbjCPP2Y (ORCPT ); Thu, 16 Mar 2023 11:28:24 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2E6D5B1ED1 for ; Thu, 16 Mar 2023 08:26:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678980417; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5bgiBObt8qBauZcUWYsV1T0v3n3OqeoexntQtIJIREA=; b=YKsn3kU/5PAlFOh9yEwDxs3M4b8ejujkcrxpyXBn3SKGESmwhe6klFskkLAuOqzr+0SumB gPv2Dz2y6zYjRFYEnu7iMILDf3qLM/M79idct75Rwiw+Ke4TRNDmcbcHSiYsq+68eHMFHa do4VQIpP9cZkbwp5P912PX9Oay7wV5Q= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
us-mta-383-a51AlkUUOb-NXl1w0lE3CA-1; Thu, 16 Mar 2023 11:26:53 -0400 X-MC-Unique: a51AlkUUOb-NXl1w0lE3CA-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 568E3185A7A7; Thu, 16 Mar 2023 15:26:52 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id D49BC492B02; Thu, 16 Mar 2023 15:26:49 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Willem de Bruijn Subject: [RFC PATCH 11/28] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES Date: Thu, 16 Mar 2023 15:26:01 +0000 Message-Id: <20230316152618.711970-12-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539343190855661?= X-GMAIL-MSGID: =?utf-8?q?1760539343190855661?= Convert udp_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. 
This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/udp.c | 50 +++++++++----------------------------------------- 1 file changed, 9 insertions(+), 41 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index c605d171eb2d..097feb92e215 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1332,52 +1332,20 @@ EXPORT_SYMBOL(udp_sendmsg); int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct inet_sock *inet = inet_sk(sk); - struct udp_sock *up = udp_sk(sk); + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE + }; int ret; - if (flags & MSG_SENDPAGE_NOTLAST) - flags |= MSG_MORE; + bvec_set_page(&bvec, page, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - if (!up->pending) { - struct msghdr msg = { .msg_flags = flags|MSG_MORE }; - - /* Call udp_sendmsg to specify destination address which - * sendpage interface can't pass. - * This will succeed only when the socket is connected. 
- */ - ret = udp_sendmsg(sk, &msg, 0); - if (ret < 0) - return ret; - } + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; lock_sock(sk); - - if (unlikely(!up->pending)) { - release_sock(sk); - - net_dbg_ratelimited("cork failed\n"); - return -EINVAL; - } - - ret = ip_append_page(sk, &inet->cork.fl.u.ip4, - page, offset, size, flags); - if (ret == -EOPNOTSUPP) { - release_sock(sk); - return sock_no_sendpage(sk->sk_socket, page, offset, - size, flags); - } - if (ret < 0) { - udp_flush_pending_frames(sk); - goto out; - } - - up->len += size; - if (!(READ_ONCE(up->corkflag) || (flags&MSG_MORE))) - ret = udp_push_pending_frames(sk); - if (!ret) - ret = size; -out: + ret = udp_sendmsg(sk, &msg, size); release_sock(sk); return ret; }
From patchwork Thu Mar 16 15:26:02 2023
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70865
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH 12/28] af_unix: Support MSG_SPLICE_PAGES
Date: Thu, 16 Mar 2023 15:26:02 +0000
Message-Id: <20230316152618.711970-13-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
Make AF_UNIX sendmsg() support MSG_SPLICE_PAGES, splicing in pages from the source iterator if given and if ITER_BVEC and copying the data in otherwise. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: "David S. 
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/unix/af_unix.c | 84 +++++++++++++++++++++++++++++++++++++--------- 1 file changed, 68 insertions(+), 16 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 347122c3575e..6f3454db9c53 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2151,6 +2151,44 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other } #endif +/* + * Extract pages from a BVEC-type iterator and add them to the socket buffer. + */ +static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb, + struct iov_iter *iter, ssize_t maxsize) +{ + const struct bio_vec *bv = iter->bvec; + unsigned long start = iter->iov_offset; + unsigned int i; + ssize_t ret = 0; + + for (i = 0; i < iter->nr_segs; i++) { + size_t off, len; + + len = bv[i].bv_len; + if (start >= len) { + start -= len; + continue; + } + + len = min_t(size_t, maxsize, len - start); + off = bv[i].bv_offset + start; + + if (skb_append_pagefrags(skb, bv[i].bv_page, off, len) < 0) + break; + + ret += len; + maxsize -= len; + if (maxsize <= 0) + break; + start = 0; + } + + if (ret > 0) + iov_iter_advance(iter, ret); + return ret; +} + static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { @@ -2194,19 +2232,25 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, while (sent < len) { size = len - sent; - /* Keep two messages in the pipe so it schedules better */ - size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64); + if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) { + skb = sock_alloc_send_pskb(sk, 0, 0, + msg->msg_flags & MSG_DONTWAIT, + &err, 0); + } else { + /* Keep two messages in the pipe so it schedules better */ + size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64); - /* allow fallback to order-0 allocations */ - size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ); + /* allow fallback to order-0 
allocations */ + size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ); - data_len = max_t(int, 0, size - SKB_MAX_HEAD(0)); + data_len = max_t(int, 0, size - SKB_MAX_HEAD(0)); - data_len = min_t(size_t, size, PAGE_ALIGN(data_len)); + data_len = min_t(size_t, size, PAGE_ALIGN(data_len)); - skb = sock_alloc_send_pskb(sk, size - data_len, data_len, - msg->msg_flags & MSG_DONTWAIT, &err, - get_order(UNIX_SKB_FRAGS_SZ)); + skb = sock_alloc_send_pskb(sk, size - data_len, data_len, + msg->msg_flags & MSG_DONTWAIT, &err, + get_order(UNIX_SKB_FRAGS_SZ)); + } if (!skb) goto out_err; @@ -2218,13 +2262,21 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, } fds_sent = true; - skb_put(skb, size - data_len); - skb->data_len = data_len; - skb->len = size; - err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size); - if (err) { - kfree_skb(skb); - goto out_err; + if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) { + size = unix_extract_bvec_to_skb(skb, &msg->msg_iter, size); + skb->data_len += size; + skb->len += size; + skb->truesize += size; + refcount_add(size, &sk->sk_wmem_alloc); + } else { + skb_put(skb, size - data_len); + skb->data_len = data_len; + skb->len = size; + err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size); + if (err) { + kfree_skb(skb); + goto out_err; + } } unix_state_lock(other);
From patchwork Thu Mar 16 15:26:03 2023
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70856
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu, linux-crypto@vger.kernel.org
Subject: [RFC PATCH 13/28] crypto: af_alg: Indent the loop in af_alg_sendmsg()
Date: Thu, 16 Mar 2023 15:26:03 +0000
Message-Id: <20230316152618.711970-14-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
Put the loop in af_alg_sendmsg() into an if-statement to indent it to make the next patch easier to review as that will add another branch to handle MSG_SPLICE_PAGES to the 
if-statement. Signed-off-by: David Howells cc: Herbert Xu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-crypto@vger.kernel.org cc: netdev@vger.kernel.org --- crypto/af_alg.c | 50 +++++++++++++++++++++++++------------------------ 1 file changed, 26 insertions(+), 24 deletions(-) diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 5f7252a5b7b4..feb989b32606 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -1060,35 +1060,37 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, if (sgl->cur) sg_unmark_end(sg + sgl->cur - 1); - do { - struct page *pg; - unsigned int i = sgl->cur; + if (1 /* TODO check MSG_SPLICE_PAGES */) { + do { + struct page *pg; + unsigned int i = sgl->cur; - plen = min_t(size_t, len, PAGE_SIZE); + plen = min_t(size_t, len, PAGE_SIZE); - pg = alloc_page(GFP_KERNEL); - if (!pg) { - err = -ENOMEM; - goto unlock; - } + pg = alloc_page(GFP_KERNEL); + if (!pg) { + err = -ENOMEM; + goto unlock; + } - sg_assign_page(sg + i, pg); + sg_assign_page(sg + i, pg); - err = memcpy_from_msg(page_address(sg_page(sg + i)), - msg, plen); - if (err) { - __free_page(sg_page(sg + i)); - sg_assign_page(sg + i, NULL); - goto unlock; - } + err = memcpy_from_msg(page_address(sg_page(sg + i)), + msg, plen); + if (err) { + __free_page(sg_page(sg + i)); + sg_assign_page(sg + i, NULL); + goto unlock; + } - sg[i].length = plen; - len -= plen; - ctx->used += plen; - copied += plen; - size -= plen; - sgl->cur++; - } while (len && sgl->cur < MAX_SGL_ENTS); + sg[i].length = plen; + len -= plen; + ctx->used += plen; + copied += plen; + size -= plen; + sgl->cur++; + } while (len && sgl->cur < MAX_SGL_ENTS); + } if (!size) sg_mark_end(sg + sgl->cur - 1);
From patchwork Thu Mar 16 15:26:04 2023
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70847
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu, linux-crypto@vger.kernel.org
Subject: [RFC PATCH 14/28] crypto: af_alg: Support MSG_SPLICE_PAGES
Date: Thu, 16 Mar 2023 15:26:04 +0000
Message-Id: <20230316152618.711970-15-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
Make AF_ALG sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible (the iterator must be ITER_BVEC and the pages must be spliceable). 
This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. [!] Note that this makes use of netfs_extract_iter_to_sg() from netfslib. This probably needs moving to core code somewhere. Signed-off-by: David Howells cc: Herbert Xu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-crypto@vger.kernel.org cc: netdev@vger.kernel.org --- crypto/Kconfig | 1 + crypto/af_alg.c | 29 +++++++++++++++++++++++++++-- crypto/algif_aead.c | 22 +++++++++++----------- crypto/algif_skcipher.c | 8 ++++---- 4 files changed, 43 insertions(+), 17 deletions(-) diff --git a/crypto/Kconfig b/crypto/Kconfig index 9c86f7045157..8c04ecbb4395 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -1297,6 +1297,7 @@ menu "Userspace interface" config CRYPTO_USER_API tristate + select NETFS_SUPPORT # for netfs_extract_iter_to_sg() config CRYPTO_USER_API_HASH tristate "Hash algorithms" diff --git a/crypto/af_alg.c b/crypto/af_alg.c index feb989b32606..80ab4f6e018c 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -970,6 +971,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, bool init = false; int err = 0; + if ((msg->msg_flags & MSG_SPLICE_PAGES) && + !iov_iter_is_bvec(&msg->msg_iter)) + return -EINVAL; + if (msg->msg_controllen) { err = af_alg_cmsg_send(msg, &con); if (err) @@ -1015,7 +1020,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, while (size) { struct scatterlist *sg; size_t len = size; - size_t plen; + ssize_t plen; /* use the existing memory in an allocated page */ if (ctx->merge) { @@ -1060,7 +1065,27 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, if (sgl->cur) sg_unmark_end(sg + sgl->cur - 1); - if (1 /* TODO check MSG_SPLICE_PAGES */) { + if (msg->msg_flags & MSG_SPLICE_PAGES) { + struct 
sg_table sgtable = { + .sgl = sg, + .nents = sgl->cur, + .orig_nents = sgl->cur, + }; + + plen = netfs_extract_iter_to_sg(&msg->msg_iter, len, + &sgtable, MAX_SGL_ENTS, 0); + if (plen < 0) { + err = plen; + goto unlock; + } + + for (; sgl->cur < sgtable.nents; sgl->cur++) + get_page(sg_page(&sg[sgl->cur])); + len -= plen; + ctx->used += plen; + copied += plen; + size -= plen; + } else { do { struct page *pg; unsigned int i = sgl->cur; diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c index 42493b4d8ce4..279eb17a1dfc 100644 --- a/crypto/algif_aead.c +++ b/crypto/algif_aead.c @@ -9,8 +9,8 @@ * The following concept of the memory management is used: * * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is - * filled by user space with the data submitted via sendpage/sendmsg. Filling - * up the TX SGL does not cause a crypto operation -- the data will only be + * filled by user space with the data submitted via sendmsg. Filling up + * the TX SGL does not cause a crypto operation -- the data will only be * tracked by the kernel. Upon receipt of one recvmsg call, the caller must * provide a buffer which is tracked with the RX SGL. * @@ -113,19 +113,19 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, } /* - * Data length provided by caller via sendmsg/sendpage that has not - * yet been processed. + * Data length provided by caller via sendmsg that has not yet been + * processed. */ used = ctx->used; /* - * Make sure sufficient data is present -- note, the same check is - * also present in sendmsg/sendpage. The checks in sendpage/sendmsg - * shall provide an information to the data sender that something is - * wrong, but they are irrelevant to maintain the kernel integrity. - * We need this check here too in case user space decides to not honor - * the error message in sendmsg/sendpage and still call recvmsg. This - * check here protects the kernel integrity. 
+ * Make sure sufficient data is present -- note, the same check is also + * present in sendmsg. The checks in sendmsg shall provide an + * information to the data sender that something is wrong, but they are + * irrelevant to maintain the kernel integrity. We need this check + * here too in case user space decides to not honor the error message + * in sendmsg and still call recvmsg. This check here protects the + * kernel integrity. */ if (!aead_sufficient_data(sk)) return -EINVAL; diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c index ee8890ee8f33..021f9ce7e87c 100644 --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -9,10 +9,10 @@ * The following concept of the memory management is used: * * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is - * filled by user space with the data submitted via sendpage/sendmsg. Filling - * up the TX SGL does not cause a crypto operation -- the data will only be - * tracked by the kernel. Upon receipt of one recvmsg call, the caller must - * provide a buffer which is tracked with the RX SGL. + * filled by user space with the data submitted via sendmsg. Filling up the TX + * SGL does not cause a crypto operation -- the data will only be tracked by + * the kernel. Upon receipt of one recvmsg call, the caller must provide a + * buffer which is tracked with the RX SGL. * * During the processing of the recvmsg operation, the cipher request is * allocated and prepared. 
As part of the recvmsg operation, the processed
From patchwork Thu Mar 16 15:26:05 2023
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70863
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Herbert Xu, linux-crypto@vger.kernel.org
Subject: [RFC PATCH 15/28] crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES
Date: Thu, 16 Mar 2023 15:26:05 +0000
Message-Id: <20230316152618.711970-16-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
Convert af_alg_sendpage() to 
use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the
pages itself. This allows ->sendpage() to be replaced by something that
can handle multiple multipage folios in a single transaction.

[!] Note that this makes use of netfs_extract_iter_to_sg() from netfslib.
This probably needs moving to core code somewhere.

Signed-off-by: David Howells
cc: Herbert Xu
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
---
 crypto/af_alg.c | 53 +++++++++--------------------------------------------
 1 file changed, 9 insertions(+), 44 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 80ab4f6e018c..0e77fce60876 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1148,53 +1148,18 @@ EXPORT_SYMBOL_GPL(af_alg_sendmsg);
 ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
 			int offset, size_t size, int flags)
 {
-	struct sock *sk = sock->sk;
-	struct alg_sock *ask = alg_sk(sk);
-	struct af_alg_ctx *ctx = ask->private;
-	struct af_alg_tsgl *sgl;
-	int err = -EINVAL;
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		flags |= MSG_MORE;
-
-	lock_sock(sk);
-	if (!ctx->more && ctx->used)
-		goto unlock;
-
-	if (!size)
-		goto done;
-
-	if (!af_alg_writable(sk)) {
-		err = af_alg_wait_for_wmem(sk, flags);
-		if (err)
-			goto unlock;
-	}
-
-	err = af_alg_alloc_tsgl(sk);
-	if (err)
-		goto unlock;
-
-	ctx->merge = 0;
-	sgl = list_entry(ctx->tsgl_list.prev, struct af_alg_tsgl, list);
-
-	if (sgl->cur)
-		sg_unmark_end(sgl->sg + sgl->cur - 1);
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = flags | MSG_SPLICE_PAGES,
+	};
 
-	sg_mark_end(sgl->sg + sgl->cur);
+	bvec_set_page(&bvec, page, size, offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
-	get_page(page);
-	sg_set_page(sgl->sg + sgl->cur, page, size, offset);
-	sgl->cur++;
-	ctx->used += size;
-
-done:
-	ctx->more = flags & MSG_MORE;
-
-unlock:
-	af_alg_data_wakeup(sk);
-	release_sock(sk);
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		msg.msg_flags |= MSG_MORE;
 
-	return err ?: size;
+	return sock_sendmsg(sock, &msg);
 }
 EXPORT_SYMBOL_GPL(af_alg_sendpage);
 

From patchwork Thu Mar 16 15:26:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70852
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
    Christian Brauner, Linus Torvalds, netdev@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [RFC PATCH 16/28] splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage()
Date: Thu, 16 Mar 2023 15:26:06 +0000
Message-Id: <20230316152618.711970-17-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() to splice data from
a pipe to a socket. This paves the way for passing in multiple pages at
once from a pipe and the handling of multipage folios.

Signed-off-by: David Howells
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: netdev@vger.kernel.org
---
 fs/splice.c            | 42 +++++++++++++++++++++++-------------------
 include/linux/fs.h     |  2 --
 include/linux/splice.h |  2 ++
 net/socket.c           | 26 ++------------------------
 4 files changed, 27 insertions(+), 45 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index f46dd1fb367b..23ead122d631 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -32,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -410,29 +411,32 @@ const struct pipe_buf_operations nosteal_pipe_buf_ops = {
 };
 EXPORT_SYMBOL(nosteal_pipe_buf_ops);
 
+#ifdef CONFIG_NET
 /*
  * Send 'sd->len' bytes to socket from 'sd->file' at position 'sd->pos'
  * using sendpage(). Return the number of bytes sent.
  */
-static int pipe_to_sendpage(struct pipe_inode_info *pipe,
-			    struct pipe_buffer *buf, struct splice_desc *sd)
+static int pipe_to_sendmsg(struct pipe_inode_info *pipe,
+			   struct pipe_buffer *buf, struct splice_desc *sd)
 {
-	struct file *file = sd->u.file;
-	loff_t pos = sd->pos;
-	int more;
-
-	if (!likely(file->f_op->sendpage))
-		return -EINVAL;
+	struct socket *sock = sock_from_file(sd->u.file);
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = MSG_SPLICE_PAGES,
+	};
 
-	more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0;
+	if (sd->flags & SPLICE_F_MORE)
+		msg.msg_flags |= MSG_MORE;
 
 	if (sd->len < sd->total_len &&
 	    pipe_occupancy(pipe->head, pipe->tail) > 1)
-		more |= MSG_SENDPAGE_NOTLAST;
+		msg.msg_flags |= MSG_MORE;
 
-	return file->f_op->sendpage(file, buf->page, buf->offset,
-				    sd->len, &pos, more);
+	bvec_set_page(&bvec, buf->page, sd->len, buf->offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, sd->len);
+	return sock_sendmsg(sock, &msg);
 }
+#endif
 
 static void wakeup_pipe_writers(struct pipe_inode_info *pipe)
 {
@@ -614,7 +618,7 @@ static void splice_from_pipe_end(struct pipe_inode_info *pipe, struct splice_des
  * Description:
  *    This function does little more than loop over the pipe and call
  *    @actor to do the actual moving of a single struct pipe_buffer to
- *    the desired destination. See pipe_to_file, pipe_to_sendpage, or
+ *    the desired destination. See pipe_to_file, pipe_to_sendmsg, or
  *    pipe_to_user.
  *
  */
@@ -795,8 +799,9 @@ iter_file_splice_write(struct pipe_inode_info *pipe, struct file *out,
 
 EXPORT_SYMBOL(iter_file_splice_write);
 
+#ifdef CONFIG_NET
 /**
- * generic_splice_sendpage - splice data from a pipe to a socket
+ * splice_to_socket - splice data from a pipe to a socket
  * @pipe:	pipe to splice from
  * @out:	socket to write to
  * @ppos:	position in @out
@@ -808,13 +813,12 @@ EXPORT_SYMBOL(iter_file_splice_write);
  *    is involved.
  *
  */
-ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe, struct file *out,
-				loff_t *ppos, size_t len, unsigned int flags)
+ssize_t splice_to_socket(struct pipe_inode_info *pipe, struct file *out,
+			 loff_t *ppos, size_t len, unsigned int flags)
 {
-	return splice_from_pipe(pipe, out, ppos, len, flags, pipe_to_sendpage);
+	return splice_from_pipe(pipe, out, ppos, len, flags, pipe_to_sendmsg);
 }
-
-EXPORT_SYMBOL(generic_splice_sendpage);
+#endif
 
 static int warn_unsupported(struct file *file, const char *op)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c85916e9f7db..f3ccc243851e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2740,8 +2740,6 @@ extern ssize_t generic_file_splice_read(struct file *, loff_t *,
 		struct pipe_inode_info *, size_t, unsigned int);
 extern ssize_t iter_file_splice_write(struct pipe_inode_info *,
 		struct file *, loff_t *, size_t, unsigned int);
-extern ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe,
-		struct file *out, loff_t *, size_t len, unsigned int flags);
 extern long do_splice_direct(struct file *in, loff_t *ppos, struct file *out,
 		loff_t *opos, size_t len, unsigned int flags);
 
diff --git a/include/linux/splice.h b/include/linux/splice.h
index 8f052c3dae95..e6153feda86c 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -87,6 +87,8 @@ extern long do_splice(struct file *in, loff_t *off_in,
 
 extern long do_tee(struct file *in, struct file *out, size_t len,
 		   unsigned int flags);
+extern ssize_t splice_to_socket(struct pipe_inode_info *pipe, struct file *out,
+				loff_t *ppos, size_t len, unsigned int flags);
 
 /*
  * for dynamic pipe sizing
diff --git a/net/socket.c b/net/socket.c
index 6bae8ce7059e..1b48a976b8cc 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -57,6 +57,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -126,8 +127,6 @@ static long compat_sock_ioctl(struct file *file,
 			      unsigned int cmd, unsigned long arg);
 #endif
 static int sock_fasync(int fd, struct file *filp, int on);
-static ssize_t sock_sendpage(struct file *file, struct page *page,
-			     int offset, size_t size, loff_t *ppos, int more);
 static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
 				struct pipe_inode_info *pipe, size_t len,
 				unsigned int flags);
@@ -162,8 +161,7 @@ static const struct file_operations socket_file_ops = {
 	.mmap =		sock_mmap,
 	.release =	sock_close,
 	.fasync =	sock_fasync,
-	.sendpage =	sock_sendpage,
-	.splice_write = generic_splice_sendpage,
+	.splice_write = splice_to_socket,
 	.splice_read =	sock_splice_read,
 	.show_fdinfo =	sock_show_fdinfo,
 };
@@ -1062,26 +1060,6 @@ int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
 }
 EXPORT_SYMBOL(kernel_recvmsg);
 
-static ssize_t sock_sendpage(struct file *file, struct page *page,
-			     int offset, size_t size, loff_t *ppos, int more)
-{
-	struct socket *sock;
-	int flags;
-	int ret;
-
-	sock = file->private_data;
-
-	flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0;
-	/* more is a combination of MSG_MORE and MSG_SENDPAGE_NOTLAST */
-	flags |= more;
-
-	ret = kernel_sendpage(sock, page, offset, size, flags);
-
-	if (trace_sock_send_length_enabled())
-		call_trace_sock_send_length(sock->sk, ret, 0);
-	return ret;
-}
-
 static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
 				struct pipe_inode_info *pipe, size_t len,
 				unsigned int flags)

From patchwork Thu Mar 16 15:26:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70854
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
    Christian Brauner, Linus Torvalds, netdev@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [RFC PATCH 17/28] Remove file->f_op->sendpage
Date: Thu, 16 Mar 2023 15:26:07 +0000
Message-Id: <20230316152618.711970-18-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Remove file->f_op->sendpage as splicing to a socket now calls sendmsg
rather than sendpage.

Signed-off-by: David Howells
cc: "David S.
Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: netdev@vger.kernel.org
---
 include/linux/fs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index f3ccc243851e..a9f1b2543d2c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1773,7 +1773,6 @@ struct file_operations {
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
 	int (*lock) (struct file *, int, struct file_lock *);
-	ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
 	unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
 	int (*check_flags)(int);
 	int (*flock) (struct file *, int, struct file_lock *);

From patchwork Thu Mar 16 15:26:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70853
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
    Christian Brauner, Linus Torvalds, netdev@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Bernard Metzler, Tom Talpey,
    linux-rdma@vger.kernel.org
Subject: [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit
Date: Thu, 16 Mar 2023 15:26:08 +0000
Message-Id: <20230316152618.711970-19-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than
performing several sendmsg and sendpage calls to transmit header, data
pages and trailer.

To make this work, the data is assembled in a bio_vec array and attached to
a BVEC-type iterator. The header and trailer (if present) are copied into
memory acquired from zcopy_alloc() which just breaks a page up into small
pieces that can be freed with put_page().

Signed-off-by: David Howells
cc: Bernard Metzler
cc: Tom Talpey
cc: "David S.
Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: linux-rdma@vger.kernel.org
cc: netdev@vger.kernel.org
---
 drivers/infiniband/sw/siw/siw_qp_tx.c | 231 +++++--------------------
 1 file changed, 46 insertions(+), 185 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 8fc179321e2b..ec4f0ac324ce 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -312,114 +313,8 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s,
 	return rv;
 }
 
-/*
- * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES.
- *
- * Using sendpage to push page by page appears to be less efficient
- * than using sendmsg, even if data are copied.
- *
- * A general performance limitation might be the extra four bytes
- * trailer checksum segment to be pushed after user data.
- */
-static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
-			     size_t size)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = (MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT |
-			      MSG_SENDPAGE_NOTLAST),
-	};
-	struct sock *sk = s->sk;
-	int i = 0, rv = 0, sent = 0;
-
-	while (size) {
-		size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
-
-		if (size + offset <= PAGE_SIZE)
-			msg.msg_flags = MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT;
-
-		tcp_rate_check_app_limited(sk);
-		bvec_set_page(&bvec, page[i], bytes, offset);
-		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-try_page_again:
-		lock_sock(sk);
-		rv = tcp_sendmsg_locked(sk, &msg, size);
-		release_sock(sk);
-
-		if (rv > 0) {
-			size -= rv;
-			sent += rv;
-			if (rv != bytes) {
-				offset += rv;
-				bytes -= rv;
-				goto try_page_again;
-			}
-			offset = 0;
-		} else {
-			if (rv == -EAGAIN || rv == 0)
-				break;
-			return rv;
-		}
-		i++;
-	}
-	return sent;
-}
-
-/*
- * siw_0copy_tx()
- *
- * Pushes list of pages to TCP socket. If pages from multiple
- * SGE's, all referenced pages of each SGE are pushed in one
- * shot.
- */
-static int siw_0copy_tx(struct socket *s, struct page **page,
-			struct siw_sge *sge, unsigned int offset,
-			unsigned int size)
-{
-	int i = 0, sent = 0, rv;
-	int sge_bytes = min(sge->length - offset, size);
-
-	offset = (sge->laddr + offset) & ~PAGE_MASK;
-
-	while (sent != size) {
-		rv = siw_tcp_sendpages(s, &page[i], offset, sge_bytes);
-		if (rv >= 0) {
-			sent += rv;
-			if (size == sent || sge_bytes > rv)
-				break;
-
-			i += PAGE_ALIGN(sge_bytes + offset) >> PAGE_SHIFT;
-			sge++;
-			sge_bytes = min(sge->length, size - sent);
-			offset = sge->laddr & ~PAGE_MASK;
-		} else {
-			sent = rv;
-			break;
-		}
-	}
-	return sent;
-}
-
 #define MAX_TRAILER (MPA_CRC_SIZE + 4)
 
-static void siw_unmap_pages(struct kvec *iov, unsigned long kmap_mask, int len)
-{
-	int i;
-
-	/*
-	 * Work backwards through the array to honor the kmap_local_page()
-	 * ordering requirements.
-	 */
-	for (i = (len-1); i >= 0; i--) {
-		if (kmap_mask & BIT(i)) {
-			unsigned long addr = (unsigned long)iov[i].iov_base;
-
-			kunmap_local((void *)(addr & PAGE_MASK));
-		}
-	}
-}
-
 /*
  * siw_tx_hdt() tries to push a complete packet to TCP where all
  * packet fragments are referenced by the elements of one iovec.
@@ -439,15 +334,13 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 {
 	struct siw_wqe *wqe = &c_tx->wqe_active;
 	struct siw_sge *sge = &wqe->sqe.sge[c_tx->sge_idx];
-	struct kvec iov[MAX_ARRAY];
-	struct page *page_array[MAX_ARRAY];
+	struct bio_vec bvec[MAX_ARRAY];
 	struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_EOR };
 
 	int seg = 0, do_crc = c_tx->do_crc, is_kva = 0, rv;
 	unsigned int data_len = c_tx->bytes_unsent, hdr_len = 0, trl_len = 0,
 		     sge_off = c_tx->sge_off, sge_idx = c_tx->sge_idx,
 		     pbl_idx = c_tx->pbl_idx;
-	unsigned long kmap_mask = 0L;
 
 	if (c_tx->state == SIW_SEND_HDR) {
 		if (c_tx->use_sendpage) {
@@ -457,10 +350,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 
 			c_tx->state = SIW_SEND_DATA;
 		} else {
-			iov[0].iov_base =
-				(char *)&c_tx->pkt.ctrl + c_tx->ctrl_sent;
-			iov[0].iov_len = hdr_len =
-				c_tx->ctrl_len - c_tx->ctrl_sent;
+			const void *hdr = &c_tx->pkt.ctrl + c_tx->ctrl_sent;
+
+			hdr_len = c_tx->ctrl_len - c_tx->ctrl_sent;
+			rv = zcopy_memdup(hdr_len, hdr, &bvec[0], GFP_NOFS);
+			if (rv < 0)
+				goto done;
 			seg = 1;
 		}
 	}
@@ -478,28 +373,9 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 		} else {
 			is_kva = 1;
 		}
-		if (is_kva && !c_tx->use_sendpage) {
-			/*
-			 * tx from kernel virtual address: either inline data
-			 * or memory region with assigned kernel buffer
-			 */
-			iov[seg].iov_base =
-				(void *)(uintptr_t)(sge->laddr + sge_off);
-			iov[seg].iov_len = sge_len;
-
-			if (do_crc)
-				crypto_shash_update(c_tx->mpa_crc_hd,
-						    iov[seg].iov_base,
-						    sge_len);
-			sge_off += sge_len;
-			data_len -= sge_len;
-			seg++;
-			goto sge_done;
-		}
 
 		while (sge_len) {
 			size_t plen = min((int)PAGE_SIZE - fp_off, sge_len);
-			void *kaddr;
 
 			if (!is_kva) {
 				struct page *p;
@@ -512,33 +388,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 					p = siw_get_upage(mem->umem,
 							  sge->laddr + sge_off);
 				if (unlikely(!p)) {
-					siw_unmap_pages(iov, kmap_mask, seg);
 					wqe->processed -= c_tx->bytes_unsent;
 					rv = -EFAULT;
 					goto done_crc;
 				}
-				page_array[seg] = p;
-
-				if (!c_tx->use_sendpage) {
-					void *kaddr = kmap_local_page(p);
-
-					/* Remember for later kunmap() */
-					kmap_mask |= BIT(seg);
-					iov[seg].iov_base = kaddr + fp_off;
-					iov[seg].iov_len = plen;
-
-					if (do_crc)
-						crypto_shash_update(
-							c_tx->mpa_crc_hd,
-							iov[seg].iov_base,
-							plen);
-				} else if (do_crc) {
-					kaddr = kmap_local_page(p);
-					crypto_shash_update(c_tx->mpa_crc_hd,
-							    kaddr + fp_off,
-							    plen);
-					kunmap_local(kaddr);
-				}
+
+				bvec_set_page(&bvec[seg], p, plen, fp_off);
 			} else {
 				/*
 				 * Cast to an uintptr_t to preserve all 64 bits
@@ -552,12 +407,15 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 				 * bits on a 64 bit platform and 32 bits on a
 				 * 32 bit platform.
 				 */
-				page_array[seg] = virt_to_page((void *)(va & PAGE_MASK));
-				if (do_crc)
-					crypto_shash_update(
-						c_tx->mpa_crc_hd,
-						(void *)va,
-						plen);
+				bvec_set_virt(&bvec[seg], (void *)va, plen);
+			}
+
+			if (do_crc) {
+				void *kaddr = kmap_local_page(bvec[seg].bv_page);
+				crypto_shash_update(c_tx->mpa_crc_hd,
+						    kaddr + bvec[seg].bv_offset,
+						    bvec[seg].bv_len);
+				kunmap_local(kaddr);
 			}
 
 			sge_len -= plen;
@@ -567,13 +425,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 
 			if (++seg > (int)MAX_ARRAY) {
 				siw_dbg_qp(tx_qp(c_tx), "to many fragments\n");
-				siw_unmap_pages(iov, kmap_mask, seg-1);
 				wqe->processed -= c_tx->bytes_unsent;
 				rv = -EMSGSIZE;
 				goto done_crc;
 			}
 		}
-sge_done:
+
 		/* Update SGE variables at end of SGE */
 		if (sge_off == sge->length &&
 		    (data_len != 0 || wqe->processed < wqe->bytes)) {
@@ -582,15 +439,8 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 			sge_off = 0;
 		}
 	}
-	/* trailer */
-	if (likely(c_tx->state != SIW_SEND_TRAILER)) {
-		iov[seg].iov_base = &c_tx->trailer.pad[4 - c_tx->pad];
-		iov[seg].iov_len = trl_len = MAX_TRAILER - (4 - c_tx->pad);
-	} else {
-		iov[seg].iov_base = &c_tx->trailer.pad[c_tx->ctrl_sent];
-		iov[seg].iov_len = trl_len = MAX_TRAILER - c_tx->ctrl_sent;
-	}
 
+	/* Set the CRC in the trailer */
if (c_tx->pad) { *(u32 *)c_tx->trailer.pad = 0; if (do_crc) @@ -603,23 +453,31 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) else if (do_crc) crypto_shash_final(c_tx->mpa_crc_hd, (u8 *)&c_tx->trailer.crc); - data_len = c_tx->bytes_unsent; + /* Copy the trailer and add it to the output list */ + if (likely(c_tx->state != SIW_SEND_TRAILER)) { + void *trl = &c_tx->trailer.pad[4 - c_tx->pad]; - if (c_tx->use_sendpage) { - rv = siw_0copy_tx(s, page_array, &wqe->sqe.sge[c_tx->sge_idx], - c_tx->sge_off, data_len); - if (rv == data_len) { - rv = kernel_sendmsg(s, &msg, &iov[seg], 1, trl_len); - if (rv > 0) - rv += data_len; - else - rv = data_len; - } + trl_len = MAX_TRAILER - (4 - c_tx->pad); + rv = zcopy_memdup(trl_len, trl, &bvec[seg], GFP_NOFS); + if (rv < 0) + goto done_crc; } else { - rv = kernel_sendmsg(s, &msg, iov, seg + 1, - hdr_len + data_len + trl_len); - siw_unmap_pages(iov, kmap_mask, seg); + void *trl = &c_tx->trailer.pad[c_tx->ctrl_sent]; + + trl_len = MAX_TRAILER - c_tx->ctrl_sent; + rv = zcopy_memdup(trl_len, trl, &bvec[seg], GFP_NOFS); + if (rv < 0) + goto done_crc; } + + data_len = c_tx->bytes_unsent; + + if (c_tx->use_sendpage) + msg.msg_flags |= MSG_SPLICE_PAGES; + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, seg + 1, + hdr_len + data_len + trl_len); + rv = sock_sendmsg(s, &msg); + if (rv < (int)hdr_len) { /* Not even complete hdr pushed or negative rv */ wqe->processed -= data_len; @@ -680,6 +538,9 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) } done_crc: c_tx->do_crc = 0; + if (c_tx->state == SIW_SEND_HDR) + folio_put(page_folio(bvec[0].bv_page)); + folio_put(page_folio(bvec[seg].bv_page)); done: return rv; } From patchwork Thu Mar 16 15:26:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70848 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id 
us-mta-558-q3lDEYBpM3ippJJ8OA4ctw-1; Thu, 16 Mar 2023 11:27:13 -0400 X-MC-Unique: q3lDEYBpM3ippJJ8OA4ctw-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E04E7185A792; Thu, 16 Mar 2023 15:27:12 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id CADBAC15BA0; Thu, 16 Mar 2023 15:27:10 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ilya Dryomov , Xiubo Li , ceph-devel@vger.kernel.org Subject: [RFC PATCH 19/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Thu, 16 Mar 2023 15:26:09 +0000 Message-Id: <20230316152618.711970-20-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539480172333944?= X-GMAIL-MSGID: =?utf-8?q?1760539480172333944?= Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when transmitting data. 
For the moment, this can only transmit one page at a time because of the architecture of net/ceph/, but if write_partial_message_data() can be given a bvec[] at a time by the iteration code, this would allow pages to be sent in a batch. Signed-off-by: David Howells cc: Ilya Dryomov cc: Xiubo Li cc: Jeff Layton cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: ceph-devel@vger.kernel.org cc: netdev@vger.kernel.org --- net/ceph/messenger_v1.c | 58 ++++++++++++++--------------------------- 1 file changed, 19 insertions(+), 39 deletions(-) diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c index d664cb1593a7..b2d801a49122 100644 --- a/net/ceph/messenger_v1.c +++ b/net/ceph/messenger_v1.c @@ -74,37 +74,6 @@ static int ceph_tcp_sendmsg(struct socket *sock, struct kvec *iov, return r; } -/* - * @more: either or both of MSG_MORE and MSG_SENDPAGE_NOTLAST - */ -static int ceph_tcp_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int more) -{ - ssize_t (*sendpage)(struct socket *sock, struct page *page, - int offset, size_t size, int flags); - int flags = MSG_DONTWAIT | MSG_NOSIGNAL | more; - int ret; - - /* - * sendpage cannot properly handle pages with page_count == 0, - * we need to fall back to sendmsg if that's the case. - * - * Same goes for slab pages: skb_can_coalesce() allows - * coalescing neighboring slab objects into a single frag which - * triggers one of hardened usercopy checks. 
- */ - if (sendpage_ok(page)) - sendpage = sock->ops->sendpage; - else - sendpage = sock_no_sendpage; - - ret = sendpage(sock, page, offset, size, flags); - if (ret == -EAGAIN) - ret = 0; - - return ret; -} - static void con_out_kvec_reset(struct ceph_connection *con) { BUG_ON(con->v1.out_skip); @@ -464,7 +433,6 @@ static int write_partial_message_data(struct ceph_connection *con) struct ceph_msg *msg = con->out_msg; struct ceph_msg_data_cursor *cursor = &msg->cursor; bool do_datacrc = !ceph_test_opt(from_msgr(con->msgr), NOCRC); - int more = MSG_MORE | MSG_SENDPAGE_NOTLAST; u32 crc; dout("%s %p msg %p\n", __func__, con, msg); @@ -482,6 +450,10 @@ static int write_partial_message_data(struct ceph_connection *con) */ crc = do_datacrc ? le32_to_cpu(msg->footer.data_crc) : 0; while (cursor->total_resid) { + struct bio_vec bvec; + struct msghdr msghdr = { + .msg_flags = MSG_SPLICE_PAGES | MSG_SENDPAGE_NOTLAST, + }; struct page *page; size_t page_offset; size_t length; @@ -494,9 +466,12 @@ static int write_partial_message_data(struct ceph_connection *con) page = ceph_msg_data_next(cursor, &page_offset, &length); if (length == cursor->total_resid) - more = MSG_MORE; - ret = ceph_tcp_sendpage(con->sock, page, page_offset, length, - more); + msghdr.msg_flags |= MSG_MORE; + + bvec_set_page(&bvec, page, length, page_offset); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, length); + + ret = sock_sendmsg(con->sock, &msghdr); if (ret <= 0) { if (do_datacrc) msg->footer.data_crc = cpu_to_le32(crc); @@ -526,7 +501,10 @@ static int write_partial_message_data(struct ceph_connection *con) */ static int write_partial_skip(struct ceph_connection *con) { - int more = MSG_MORE | MSG_SENDPAGE_NOTLAST; + struct bio_vec bvec; + struct msghdr msghdr = { + .msg_flags = MSG_SPLICE_PAGES | MSG_SENDPAGE_NOTLAST | MSG_MORE, + }; int ret; dout("%s %p %d left\n", __func__, con, con->v1.out_skip); @@ -534,9 +512,11 @@ static int write_partial_skip(struct ceph_connection *con) size_t size 
= min(con->v1.out_skip, (int)PAGE_SIZE); if (size == con->v1.out_skip) - more = MSG_MORE; - ret = ceph_tcp_sendpage(con->sock, ceph_zero_page, 0, size, - more); + msghdr.msg_flags &= ~MSG_SENDPAGE_NOTLAST; + bvec_set_page(&bvec, ZERO_PAGE(0), size, 0); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size); + + ret = sock_sendmsg(con->sock, &msghdr); if (ret <= 0) goto out; con->v1.out_skip -= ret; From patchwork Thu Mar 16 15:26:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70837 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp549361wrt; Thu, 16 Mar 2023 08:33:18 -0700 (PDT) X-Google-Smtp-Source: AK7set9CGCtG2fvU40IZ+VnXZeV+iAYZ+lRVmng1iGtDsHvbcDKRx2rt+sbvaK80D1IjN/Ob+6xU X-Received: by 2002:a05:6a20:4a02:b0:d5:f6b7:e1dc with SMTP id fr2-20020a056a204a0200b000d5f6b7e1dcmr3224994pzb.37.1678980798662; Thu, 16 Mar 2023 08:33:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678980798; cv=none; d=google.com; s=arc-20160816; b=OOcQoGMXdaXcQOKDSfWDN3lNz8CRucliLjWTwalb3dRMfJDb+nOomsPRqZMqWTcaoa pFgKu6BKVZCGtokBGK5hrs7O9lmUo9wfphdmCkTdY+xelE5ezwWViUlghSosfI/Qa91F TiLhuRudnJIO2gZr46dTFAhBu8yHiDl3gJlQntlKfAU2Q9hmPnO86O/0JBbuB/4k3weA AMKUYSZuTvv3z/O4vZsMN9wpRgOZHWylYECxdF1VTfRmlH0PokwIqO1h91FmkHaOsxM1 TXPWZ1sqGo9F6dqUXalZ+1ffG/W576h+bm+yaE4Fgm3TJK6rzN91YvrE3K34xm+M+iS/ tSAA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=cqdl8fmtEvVwpgPJdtUvP824my58WFWl8Wp/Vt/z1XI=; b=rtAP+jNgbLeelJTi11rY7RGUJvoM1X6LfUggWmj9JNfu1SOZHVNlXyO9JPS4tW0mb0 j5c5k/DlHaTcc/6H/ChZ3OZiv/NXHck5h7bShAm5XKhR+SPjbI+2WjgBUcC66HcH5uUf 7B/7gCFQlVLhddzbvrEMFTxiXdvsfcu5rBZFr3aRP/6efszEp2kyM9Rhy8oRIg94p62m 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cqdl8fmtEvVwpgPJdtUvP824my58WFWl8Wp/Vt/z1XI=; b=TRF7MMwTiDO80z2usmJKNSU2Idn35btxLN3ES/3jiltRT4g7t6gfQlieLPBD8dovxdgRfo jpkTg13jobACwoOsPOFr6dwhSGuIDMIH4jelA/HguTjSlCIhSVk7Hxoni8c8snu+gspTYW f2cmGg01NKvPK7Iji9E8msKrYmIttTo= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-644-1WZJi73lMMOMNIGU82y0ZA-1; Thu, 16 Mar 2023 11:27:22 -0400 X-MC-Unique: 1WZJi73lMMOMNIGU82y0ZA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B5D4E185A78F; Thu, 16 Mar 2023 15:27:15 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9C1A1140EBF4; Thu, 16 Mar 2023 15:27:13 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Martin K. 
Petersen" , linux-scsi@vger.kernel.org, target-devel@vger.kernel.org Subject: [RFC PATCH 20/28] iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Thu, 16 Mar 2023 15:26:10 +0000 Message-Id: <20230316152618.711970-21-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760538969892581298?= X-GMAIL-MSGID: =?utf-8?q?1760538969892581298?= Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage. This allows multiple pages and multipage folios to be passed through. TODO: iscsit_fe_sendpage_sg() should perhaps set up a bio_vec array for the entire set of pages it's going to transfer plus two for the header and trailer and use zcopy_alloc() to allocate the header and trailer - and then call sendmsg once for the entire message. Signed-off-by: David Howells cc: "Martin K. Petersen" cc: "David S. 
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-scsi@vger.kernel.org cc: target-devel@vger.kernel.org cc: netdev@vger.kernel.org --- drivers/target/iscsi/iscsi_target_util.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c index 26dc8ed3045b..c7d58e41ac3b 100644 --- a/drivers/target/iscsi/iscsi_target_util.c +++ b/drivers/target/iscsi/iscsi_target_util.c @@ -1078,6 +1078,8 @@ int iscsit_fe_sendpage_sg( struct iscsit_conn *conn) { struct scatterlist *sg = cmd->first_data_sg; + struct bio_vec bvec; + struct msghdr msghdr = { .msg_flags = MSG_SPLICE_PAGES, }; struct kvec iov; u32 tx_hdr_size, data_len; u32 offset = cmd->first_data_sg_off; @@ -1121,17 +1123,17 @@ int iscsit_fe_sendpage_sg( u32 space = (sg->length - offset); u32 sub_len = min_t(u32, data_len, space); send_pg: - tx_sent = conn->sock->ops->sendpage(conn->sock, - sg_page(sg), sg->offset + offset, sub_len, 0); + bvec_set_page(&bvec, sg_page(sg), sub_len, sg->offset + offset); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, sub_len); + + tx_sent = conn->sock->ops->sendmsg(conn->sock, &msghdr, sub_len); if (tx_sent != sub_len) { if (tx_sent == -EAGAIN) { - pr_err("tcp_sendpage() returned" - " -EAGAIN\n"); + pr_err("sendmsg/splice returned -EAGAIN\n"); goto send_pg; } - pr_err("tcp_sendpage() failure: %d\n", - tx_sent); + pr_err("sendmsg/splice failure: %d\n", tx_sent); return -1; } From patchwork Thu Mar 16 15:26:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70861 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp554677wrt; Thu, 16 Mar 2023 08:43:26 -0700 (PDT) X-Google-Smtp-Source: AK7set9hWHJEKHGk+W0yNpFan/ow1q/QEOWmwOJAVEF1Fw72Z6NOtc+z6zyI7eJC3CC81L611dYo X-Received: by 
us-mta-531-HxkbzJbbNIOwbn5zC8ZlYg-1; Thu, 16 Mar 2023 11:27:24 -0400 X-MC-Unique: HxkbzJbbNIOwbn5zC8ZlYg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 684F938149BF; Thu, 16 Mar 2023 15:27:18 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 521EB1121315; Thu, 16 Mar 2023 15:27:16 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Fastabend , Jakub Sitnicki , bpf@vger.kernel.org Subject: [RFC PATCH 21/28] tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES) Date: Thu, 16 Mar 2023 15:26:11 +0000 Message-Id: <20230316152618.711970-22-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539607502721857?= X-GMAIL-MSGID: =?utf-8?q?1760539607502721857?= Translate tcp_bpf_sendpage() calls to tcp_bpf_sendmsg(MSG_SPLICE_PAGES). 
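The translation described above can be sketched as a small userspace model. This is only an illustration, not kernel code: the `model_*` structures are simplified stand-ins for struct bio_vec and struct msghdr, and the `MODEL_MSG_*` values are placeholders chosen for the example rather than the real definitions from the kernel's socket headers. It shows the page being wrapped in a single-element vector, MSG_SPLICE_PAGES being added unconditionally, and MSG_SENDPAGE_NOTLAST being mapped onto MSG_MORE.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Placeholder flag values for this userspace model only (assumption);
 * the real MSG_* definitions live in the kernel's socket headers.
 */
#define MODEL_MSG_MORE             0x1u
#define MODEL_MSG_SENDPAGE_NOTLAST 0x2u
#define MODEL_MSG_SPLICE_PAGES     0x4u

/* Simplified stand-in for struct bio_vec (hypothetical). */
struct model_bvec {
	const void *page;
	size_t len;
	size_t offset;
};

/* Simplified stand-in for struct msghdr plus its iterator (hypothetical). */
struct model_msghdr {
	unsigned int flags;
	const struct model_bvec *bvec;
	size_t nr_segs;
	size_t count;
};

/*
 * Model of the sendpage-to-sendmsg translation: describe the page as a
 * one-segment vector, request splicing, and turn "not the last page"
 * into "more data follows".
 */
static void model_sendpage_to_msg(struct model_msghdr *msg,
				  struct model_bvec *bvec,
				  const void *page, size_t offset,
				  size_t size, unsigned int flags)
{
	assert(page != NULL);

	/* Equivalent of bvec_set_page(): page, length, offset. */
	bvec->page = page;
	bvec->len = size;
	bvec->offset = offset;

	/* Caller flags are kept and the splice request is always added. */
	msg->flags = flags | MODEL_MSG_SPLICE_PAGES;
	if (flags & MODEL_MSG_SENDPAGE_NOTLAST)
		msg->flags |= MODEL_MSG_MORE;

	/* Equivalent of iov_iter_bvec() over a single segment. */
	msg->bvec = bvec;
	msg->nr_segs = 1;
	msg->count = size;
}
```

With this shape the wrapper no longer needs its own psock and cork handling; whatever the sendmsg path already does for a message is applied to the spliced page as well.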
Signed-off-by: David Howells cc: John Fastabend cc: Jakub Sitnicki cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: bpf@vger.kernel.org cc: netdev@vger.kernel.org --- net/ipv4/tcp_bpf.c | 49 +++++++++------------------------------------- 1 file changed, 9 insertions(+), 40 deletions(-) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 7f17134637eb..de37a4372437 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -485,49 +485,18 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct sk_msg tmp, *msg = NULL; - int err = 0, copied = 0; - struct sk_psock *psock; - bool enospc = false; - - psock = sk_psock_get(sk); - if (unlikely(!psock)) - return tcp_sendpage(sk, page, offset, size, flags); + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = flags | MSG_SPLICE_PAGES, + }; - lock_sock(sk); - if (psock->cork) { - msg = psock->cork; - } else { - msg = &tmp; - sk_msg_init(msg); - } + bvec_set_page(&bvec, page, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - /* Catch case where ring is full and sendpage is stalled. */ - if (unlikely(sk_msg_full(msg))) - goto out_err; - - sk_msg_page_add(msg, page, size, offset); - sk_mem_charge(sk, size); - copied = size; - if (sk_msg_full(msg)) - enospc = true; - if (psock->cork_bytes) { - if (size > psock->cork_bytes) - psock->cork_bytes = 0; - else - psock->cork_bytes -= size; - if (psock->cork_bytes && !enospc) - goto out_err; - /* All cork bytes are accounted, rerun the prog. */ - psock->eval = __SK_NONE; - psock->cork_bytes = 0; - } + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; - err = tcp_bpf_send_verdict(sk, psock, msg, &copied, flags); -out_err: - release_sock(sk); - sk_psock_put(sk, psock); - return copied ? 
copied : err; + return tcp_bpf_sendmsg(sk, &msg, size); } enum { From patchwork Thu Mar 16 15:26:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70838 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp551864wrt; Thu, 16 Mar 2023 08:37:44 -0700 (PDT) X-Google-Smtp-Source: AK7set+XVL4kehWZt9rSu7r9wMJIA4Tnukb1MoLL1i/rWqNq60AYSQUB6J8kBes+u9POmB+vwYgK X-Received: by 2002:a17:90b:4a04:b0:23b:4439:4179 with SMTP id kk4-20020a17090b4a0400b0023b44394179mr4404502pjb.28.1678981063908; Thu, 16 Mar 2023 08:37:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678981063; cv=none; d=google.com; s=arc-20160816; b=QWQHz0hlF1UbbuKKJ9U8vSrn6U55GuW67ptYIXkTRpiKcm17tS1YN39oxPhwcwvJT0 DwQeF3UX95Ai0ozyeje95cJrxXgRyEg6R5R1+c4OOSkzjoqKpVe+vvaByqC8ZHR2HY0F hGCBGISNPxmSlf+Bm5hCHSC4TdQpVIbJ8c1wgLmprehypkrbw1CR9Uyh1ErXFgmASdbZ LmvnhBxJ6H2XYKxuuk/IvFFZ43/cMa+mwnYdB6iogorhla5eB/EpGDjoQ71nPxc0paa7 /c9waPqcnhlFguJCnTj++MXZx9+fR9710RHT936KARzrsZFvr6vVyu64+BOpjrW1XkhM Pi6g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=+QyEBFQ78rmkbhEM2WtWHIrO32XG2a7frdMcXNjpZkk=; b=GqfB+O2zyNlnzD7IgoNHAafVpZYGeWihemOD+nkTv7fI0bh1ojxmwThqDcRAZ4I4sL pogo5zs2fuXczh8KOsqMB/NWoNbMfIQSzGmiSZ46exm+Q3VWB5MGkqOJ0EFmcZRvYQW6 mOLKGDuhiH82Y4ljCjvNHuKC2nr9mP0FJUTEVN9tbaaEnlLoTjRBpFDYzKfL8jD8whz7 HtKsbOcqsPbb49loiV3b8RxeVijvzL0P7nzKRkWLoDT3RuOp/utDS34nLEB5QzFgLAnp OUZo0dXacpoEDbrKSlYAK6YIu0rW5TbD87MArGujnro6d76qMzRK3qhxHCRsuUwgm0xS 7ZWQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=RbJh5GtF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) 
[66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-166-k012TxuLM5-J9DwPD-MUfg-1; Thu, 16 Mar 2023 11:27:21 -0400 X-MC-Unique: k012TxuLM5-J9DwPD-MUfg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DE2EC858F0E; Thu, 16 Mar 2023 15:27:20 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1C465202701E; Thu, 16 Mar 2023 15:27:19 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [RFC PATCH 22/28] net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock() Date: Thu, 16 Mar 2023 15:26:12 +0000 Message-Id: <20230316152618.711970-23-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539248157722600?= X-GMAIL-MSGID: =?utf-8?q?1760539248157722600?= Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage in skb_send_sock(). 
This causes pages to be spliced from the source iterator if possible (the
iterator must be ITER_BVEC and the pages must be spliceable).

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Note that this could perhaps be improved to fill out a bvec array with all
the frags and then make a single sendmsg call, possibly sticking the header
on the front also.

Signed-off-by: David Howells
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: netdev@vger.kernel.org
---
 net/core/skbuff.c | 49 ++++++++++++++++++++++++++---------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index eb7d33b41e71..9fa333e26b7d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2927,32 +2927,32 @@ int skb_splice_bits(struct sk_buff *skb, struct sock *sk, unsigned int offset,
 }
 EXPORT_SYMBOL_GPL(skb_splice_bits);

-static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg,
-			    struct kvec *vec, size_t num, size_t size)
+static int sendmsg_locked(struct sock *sk, struct msghdr *msg)
 {
 	struct socket *sock = sk->sk_socket;
+	size_t size = msg_data_left(msg);

 	if (!sock)
 		return -EINVAL;
-	return kernel_sendmsg(sock, msg, vec, num, size);
+
+	if (!sock->ops->sendmsg_locked)
+		return sock_no_sendmsg_locked(sk, msg, size);
+
+	return sock->ops->sendmsg_locked(sk, msg, size);
 }

-static int sendpage_unlocked(struct sock *sk, struct page *page, int offset,
-			     size_t size, int flags)
+static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg)
 {
 	struct socket *sock = sk->sk_socket;

 	if (!sock)
 		return -EINVAL;
-	return kernel_sendpage(sock, page, offset, size, flags);
+	return sock_sendmsg(sock, msg);
 }

-typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg,
-			    struct kvec *vec, size_t num, size_t size);
-typedef int (*sendpage_func)(struct sock *sk, struct page *page, int offset,
-			     size_t size, int flags);
+typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg);
 static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
-			   int len, sendmsg_func sendmsg, sendpage_func sendpage)
+			   int len, sendmsg_func sendmsg)
 {
 	unsigned int orig_len = len;
 	struct sk_buff *head = skb;
@@ -2972,8 +2972,9 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
 		memset(&msg, 0, sizeof(msg));
 		msg.msg_flags = MSG_DONTWAIT;

-		ret = INDIRECT_CALL_2(sendmsg, kernel_sendmsg_locked,
-				      sendmsg_unlocked, sk, &msg, &kv, 1, slen);
+		iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, &kv, 1, slen);
+		ret = INDIRECT_CALL_2(sendmsg, sendmsg_locked,
+				      sendmsg_unlocked, sk, &msg);
 		if (ret <= 0)
 			goto error;

@@ -3004,11 +3005,17 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
 		slen = min_t(size_t, len, skb_frag_size(frag) - offset);

 		while (slen) {
-			ret = INDIRECT_CALL_2(sendpage, kernel_sendpage_locked,
-					      sendpage_unlocked, sk,
-					      skb_frag_page(frag),
-					      skb_frag_off(frag) + offset,
-					      slen, MSG_DONTWAIT);
+			struct bio_vec bvec;
+			struct msghdr msg = {
+				.msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT,
+			};
+
+			bvec_set_page(&bvec, skb_frag_page(frag), slen,
+				      skb_frag_off(frag) + offset);
+			iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, slen);
+
+			ret = INDIRECT_CALL_2(sendmsg, sendmsg_locked,
+					      sendmsg_unlocked, sk, &msg);
 			if (ret <= 0)
 				goto error;

@@ -3045,16 +3052,14 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
 int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset,
 			 int len)
 {
-	return __skb_send_sock(sk, skb, offset, len, kernel_sendmsg_locked,
-			       kernel_sendpage_locked);
+	return __skb_send_sock(sk, skb, offset, len, sendmsg_locked);
 }
 EXPORT_SYMBOL_GPL(skb_send_sock_locked);

 /* Send skb data on a socket. Socket must be unlocked. */
 int skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, int len)
 {
-	return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked,
-			       sendpage_unlocked);
+	return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked);
 }

 /**

From patchwork Thu Mar 16 15:26:13 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70859
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
    Christian Brauner, Linus Torvalds, netdev@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Herbert Xu, linux-crypto@vger.kernel.org
Subject: [RFC PATCH 23/28] algif: Remove hash_sendpage*()
Date: Thu, 16 Mar 2023 15:26:13 +0000
Message-Id: <20230316152618.711970-24-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>

Remove hash_sendpage*() and use hash_sendmsg() as the latter seems to just
use the source pages directly anyway.

Signed-off-by: David Howells
cc: Herbert Xu
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
---
 crypto/algif_hash.c | 66 ---------------------------------------------
 1 file changed, 66 deletions(-)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 1d017ec5c63c..52f5828a054a 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -129,58 +129,6 @@ static int hash_sendmsg(struct socket *sock, struct msghdr *msg,
 	return err ?: copied;
 }

-static ssize_t hash_sendpage(struct socket *sock, struct page *page,
-			     int offset, size_t size, int flags)
-{
-	struct sock *sk = sock->sk;
-	struct alg_sock *ask = alg_sk(sk);
-	struct hash_ctx *ctx = ask->private;
-	int err;
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		flags |= MSG_MORE;
-
-	lock_sock(sk);
-	sg_init_table(ctx->sgl.sg, 1);
-	sg_set_page(ctx->sgl.sg, page, size, offset);
-
-	if (!(flags & MSG_MORE)) {
-		err = hash_alloc_result(sk, ctx);
-		if (err)
-			goto unlock;
-	} else if (!ctx->more)
-		hash_free_result(sk, ctx);
-
-	ahash_request_set_crypt(&ctx->req, ctx->sgl.sg, ctx->result, size);
-
-	if (!(flags & MSG_MORE)) {
-		if (ctx->more)
-			err = crypto_ahash_finup(&ctx->req);
-		else
-			err = crypto_ahash_digest(&ctx->req);
-	} else {
-		if (!ctx->more) {
-			err = crypto_ahash_init(&ctx->req);
-			err = crypto_wait_req(err, &ctx->wait);
-			if (err)
-				goto unlock;
-		}
-
-		err = crypto_ahash_update(&ctx->req);
-	}
-
-	err = crypto_wait_req(err, &ctx->wait);
-	if (err)
-		goto unlock;
-
-	ctx->more = flags & MSG_MORE;
-
-unlock:
-	release_sock(sk);
-
-	return err ?: size;
-}
-
 static int hash_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 			int flags)
 {
@@ -285,7 +233,6 @@ static struct proto_ops algif_hash_ops = {

 	.release	=	af_alg_release,
 	.sendmsg	=	hash_sendmsg,
-	.sendpage	=	hash_sendpage,
 	.recvmsg	=	hash_recvmsg,
 	.accept		=	hash_accept,
 };
@@ -337,18 +284,6 @@ static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return hash_sendmsg(sock, msg, size);
 }

-static ssize_t hash_sendpage_nokey(struct socket *sock, struct page *page,
-				   int offset, size_t size, int flags)
-{
-	int err;
-
-	err = hash_check_key(sock);
-	if (err)
-		return err;
-
-	return hash_sendpage(sock, page, offset, size, flags);
-}
-
 static int hash_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 			      size_t ignored, int flags)
 {
@@ -387,7 +322,6 @@ static struct proto_ops algif_hash_ops_nokey = {

 	.release	=	af_alg_release,
 	.sendmsg	=	hash_sendmsg_nokey,
-	.sendpage	=	hash_sendpage_nokey,
 	.recvmsg	=	hash_recvmsg_nokey,
 	.accept		=	hash_accept_nokey,
 };

From patchwork Thu Mar 16 15:26:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70851
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
    Christian Brauner, Linus Torvalds, netdev@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Ilya Dryomov, Xiubo Li, ceph-devel@vger.kernel.org
Subject: [RFC PATCH 24/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than
    sendpage()
Date: Thu, 16 Mar 2023 15:26:14 +0000
Message-Id: <20230316152618.711970-25-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>

Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when
transmitting data. For the moment, this can only transmit one page at a
time because of the architecture of net/ceph/, but if
write_partial_message_data() can be given a bvec[] at a time by the
iteration code, this would allow pages to be sent in a batch.

Signed-off-by: David Howells
cc: Ilya Dryomov
cc: Xiubo Li
cc: Jeff Layton
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: ceph-devel@vger.kernel.org
cc: netdev@vger.kernel.org
---
 net/ceph/messenger_v2.c | 89 +++++++++--------------------------------
 1 file changed, 18 insertions(+), 71 deletions(-)

diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
index 301a991dc6a6..1637a0c21126 100644
--- a/net/ceph/messenger_v2.c
+++ b/net/ceph/messenger_v2.c
@@ -117,91 +117,38 @@ static int ceph_tcp_recv(struct ceph_connection *con)
 	return ret;
 }

-static int do_sendmsg(struct socket *sock, struct iov_iter *it)
-{
-	struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
-	int ret;
-
-	msg.msg_iter = *it;
-	while (iov_iter_count(it)) {
-		ret = sock_sendmsg(sock, &msg);
-		if (ret <= 0) {
-			if (ret == -EAGAIN)
-				ret = 0;
-			return ret;
-		}
-
-		iov_iter_advance(it, ret);
-	}
-
-	WARN_ON(msg_data_left(&msg));
-	return 1;
-}
-
-static int do_try_sendpage(struct socket *sock, struct iov_iter *it)
-{
-	struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
-	struct bio_vec bv;
-	int ret;
-
-	if (WARN_ON(!iov_iter_is_bvec(it)))
-		return -EINVAL;
-
-	while (iov_iter_count(it)) {
-		/* iov_iter_iovec() for ITER_BVEC */
-		bvec_set_page(&bv, it->bvec->bv_page,
-			      min(iov_iter_count(it),
-				  it->bvec->bv_len - it->iov_offset),
-			      it->bvec->bv_offset + it->iov_offset);
-
-		/*
-		 * sendpage cannot properly handle pages with
-		 * page_count == 0, we need to fall back to sendmsg if
-		 * that's the case.
-		 *
-		 * Same goes for slab pages: skb_can_coalesce() allows
-		 * coalescing neighboring slab objects into a single frag
-		 * which triggers one of hardened usercopy checks.
-		 */
-		if (sendpage_ok(bv.bv_page)) {
-			ret = sock->ops->sendpage(sock, bv.bv_page,
-						  bv.bv_offset, bv.bv_len,
-						  CEPH_MSG_FLAGS);
-		} else {
-			iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, bv.bv_len);
-			ret = sock_sendmsg(sock, &msg);
-		}
-		if (ret <= 0) {
-			if (ret == -EAGAIN)
-				ret = 0;
-			return ret;
-		}
-
-		iov_iter_advance(it, ret);
-	}
-
-	return 1;
-}
-
 /*
  * Write as much as possible. The socket is expected to be corked,
  * so we don't bother with MSG_MORE/MSG_SENDPAGE_NOTLAST here.
  *
  * Return:
- *   1 - done, nothing (else) to write
+ *  >0 - done, nothing (else) to write
  *   0 - socket is full, need to wait
  *  <0 - error
  */
 static int ceph_tcp_send(struct ceph_connection *con)
 {
+	struct msghdr msg = {
+		.msg_iter	= con->v2.out_iter,
+		.msg_flags	= CEPH_MSG_FLAGS,
+	};
 	int ret;

+	if (WARN_ON(!iov_iter_is_bvec(&con->v2.out_iter)))
+		return -EINVAL;
+
+	if (con->v2.out_iter_sendpage)
+		msg.msg_flags |= MSG_SPLICE_PAGES;
+
 	dout("%s con %p have %zu try_sendpage %d\n", __func__, con,
 	     iov_iter_count(&con->v2.out_iter), con->v2.out_iter_sendpage);
-	if (con->v2.out_iter_sendpage)
-		ret = do_try_sendpage(con->sock, &con->v2.out_iter);
-	else
-		ret = do_sendmsg(con->sock, &con->v2.out_iter);
+
+	ret = sock_sendmsg(con->sock, &msg);
+	if (ret > 0)
+		iov_iter_advance(&con->v2.out_iter, ret);
+	else if (ret == -EAGAIN)
+		ret = 0;
+
 	dout("%s con %p ret %d left %zu\n", __func__, con, ret,
 	     iov_iter_count(&con->v2.out_iter));
 	return ret;

From patchwork Thu Mar 16 15:26:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70839
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
    Christian Brauner, Linus Torvalds, netdev@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Santosh Shilimkar, linux-rdma@vger.kernel.org,
    rds-devel@oss.oracle.com
Subject: [RFC PATCH 25/28] rds: Use sendmsg(MSG_SPLICE_PAGES) rather than
    sendpage
Date: Thu, 16 Mar 2023 15:26:15 +0000
Message-Id: <20230316152618.711970-26-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>

When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather
than performing several sendmsg and sendpage calls to transmit header and
data pages.

To make this work, the data is assembled in a bio_vec array and attached to
a BVEC-type iterator. The header is copied into memory acquired from
zcopy_alloc() which just breaks a page up into small pieces that can be
freed with put_page().

Signed-off-by: David Howells
cc: Santosh Shilimkar
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: linux-rdma@vger.kernel.org
cc: rds-devel@oss.oracle.com
cc: netdev@vger.kernel.org
---
 net/rds/tcp_send.c | 80 ++++++++++++++++++++--------------------------
 1 file changed, 35 insertions(+), 45 deletions(-)

diff --git a/net/rds/tcp_send.c b/net/rds/tcp_send.c
index 8c4d1d6e9249..0d6eb85a930d 100644
--- a/net/rds/tcp_send.c
+++ b/net/rds/tcp_send.c
@@ -32,6 +32,7 @@
  */
 #include
 #include
+#include
 #include

 #include "rds_single_path.h"
@@ -52,29 +53,24 @@ void rds_tcp_xmit_path_complete(struct rds_conn_path *cp)
 	tcp_sock_set_cork(tc->t_sock->sk, false);
 }

-/* the core send_sem serializes this with other xmit and shutdown */
-static int rds_tcp_sendmsg(struct socket *sock, void *data, unsigned int len)
-{
-	struct kvec vec = {
-		.iov_base = data,
-		.iov_len = len,
-	};
-	struct msghdr msg = {
-		.msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL,
-	};
-
-	return kernel_sendmsg(sock, &msg, &vec, 1, vec.iov_len);
-}
-
 /* the core send_sem serializes this with other xmit and shutdown */
 int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 		 unsigned int hdr_off, unsigned int sg, unsigned int off)
 {
 	struct rds_conn_path *cp = rm->m_inc.i_conn_path;
 	struct rds_tcp_connection *tc = cp->cp_transport_data;
+	struct msghdr msg = {
+		.msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT | MSG_NOSIGNAL,
+	};
+	struct bio_vec *bvec;
+	unsigned int i, size = 0, ix = 0;
+	bool free_hdr = false;
 	int done = 0;
-	int ret = 0;
-	int more;
+	int ret = -ENOMEM;
+
+	bvec = kmalloc_array(1 + sg, sizeof(struct bio_vec), GFP_KERNEL);
+	if (!bvec)
+		goto out;

 	if (hdr_off == 0) {
 		/*
@@ -101,41 +97,30 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 		/* see rds_tcp_write_space() */
 		set_bit(SOCK_NOSPACE, &tc->t_sock->sk->sk_socket->flags);

-		ret = rds_tcp_sendmsg(tc->t_sock,
-				      (void *)&rm->m_inc.i_hdr + hdr_off,
-				      sizeof(rm->m_inc.i_hdr) - hdr_off);
+		ret = zcopy_memdup(sizeof(rm->m_inc.i_hdr) - hdr_off,
+				   (void *)&rm->m_inc.i_hdr + hdr_off,
+				   &bvec[ix], GFP_KERNEL);
 		if (ret < 0)
 			goto out;
-		done += ret;
-		if (hdr_off + done != sizeof(struct rds_header))
-			goto out;
+		free_hdr = true;
+		size += bvec[ix].bv_len;
+		ix++;
 	}

-	more = rm->data.op_nents > 1 ? (MSG_MORE | MSG_SENDPAGE_NOTLAST) : 0;
-	while (sg < rm->data.op_nents) {
-		int flags = MSG_DONTWAIT | MSG_NOSIGNAL | more;
-
-		ret = tc->t_sock->ops->sendpage(tc->t_sock,
-						sg_page(&rm->data.op_sg[sg]),
-						rm->data.op_sg[sg].offset + off,
-						rm->data.op_sg[sg].length - off,
-						flags);
-		rdsdebug("tcp sendpage %p:%u:%u ret %d\n", (void *)sg_page(&rm->data.op_sg[sg]),
-			 rm->data.op_sg[sg].offset + off, rm->data.op_sg[sg].length - off,
-			 ret);
-		if (ret <= 0)
-			break;
-
-		off += ret;
-		done += ret;
-		if (off == rm->data.op_sg[sg].length) {
-			off = 0;
-			sg++;
-		}
-		if (sg == rm->data.op_nents - 1)
-			more = 0;
+	for (i = sg; i < rm->data.op_nents; i++) {
+		bvec_set_page(&bvec[ix],
+			      sg_page(&rm->data.op_sg[i]),
+			      rm->data.op_sg[i].length - off,
+			      rm->data.op_sg[i].offset + off);
+		off = 0;
+		size += bvec[ix].bv_len;
+		ix++;
 	}

+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, ix, size);
+	ret = sock_sendmsg(tc->t_sock, &msg);
+	rdsdebug("tcp sendmsg-splice %u,%u ret %d\n", ix, size, ret);
+
 out:
 	if (ret <= 0) {
 		/* write_space will hit after EAGAIN, all else fatal */
@@ -158,6 +143,11 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 	}
 	if (done == 0)
 		done = ret;
+	if (bvec) {
+		if (free_hdr)
+			put_page(bvec[0].bv_page);
+		kfree(bvec);
+	}
 	return done;
 }

From patchwork Thu Mar 16 15:26:16
2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 70868 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp555344wrt; Thu, 16 Mar 2023 08:44:48 -0700 (PDT) X-Google-Smtp-Source: AK7set/i1p2WzuzCRg286sFDGC4mfURCk/JO+SAjZuZjHNlFfInRLq4wOGwEFj5UBSMfaOLaiVle X-Received: by 2002:a17:90b:4f48:b0:23d:1852:d3b7 with SMTP id pj8-20020a17090b4f4800b0023d1852d3b7mr4389908pjb.25.1678981487710; Thu, 16 Mar 2023 08:44:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1678981487; cv=none; d=google.com; s=arc-20160816; b=QJ5p07EgkH27kkOjZv/Ujdck0JpMoWp7SlsDuiMCmyO90v3RBHPby/gNC+jw+toNDP 1PA7i5h5nyM3yQo9UN+wRvZNy6wSOmd7sM6qfbPPy8ZRdTEAbxdEUEvww7XpqOF48lMI +F94T38DTsuuoNRR6fkDd2yO539LhLAiK9efRJoEg41Diwe3wyABWFgoBadImRxWsbjc svxqdh4rcWi76EPQrkGCu+tFJAw+a7w1SyZod3Ljabsm1XqO8vS97lpb+vx/q1fMhSxo kWuOnjzk0ME5q0sKvYYYow+ioa2/hhbciloZ91ufWXlsrzGmsXeGNHcnotmY9Phb7RSw +O+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=6d/Oxl1Cdu8ybDL5tOgPD1gv3cW2fyMj4HBClKMadWo=; b=0u6y1XOt7LzAml6hfvDXAFjiHcKveZRW2PvH5m2CpGMcvXJWVLk5kvBu8jVab2RZy7 +XggLBhbpYotjXBHlt40ITZyWvpRuiQmxVZWmGc6N5pYuTkxX7LNSLYEZrNxiPhvcLGg 3X+x9uN4SjKfG8/gEZr8g8SpgFH2h55onL81XDLKw+CwWqqfmEmWm6B0iXH+7hg6v160 pD0jE7zHlHzIF0+cizqMWVTrA3LN0HKfLoFTkOntphVw4VOt/xnVlLwigTTcjygQQ28h 1Y7gjFT7sTlWxCaDWT6rD2MIkkbkxONIpCsGGLzxd2Sh6eqmRIATf/r1YyPDSFMvK4nI lyrw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=W8XgKHqM; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com 
Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id p4-20020a17090a2c4400b0023b15d4eb5asi4783081pjm.122.2023.03.16.08.44.32; Thu, 16 Mar 2023 08:44:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=W8XgKHqM; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231749AbjCPPbA (ORCPT + 99 others); Thu, 16 Mar 2023 11:31:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33580 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231786AbjCPPaO (ORCPT ); Thu, 16 Mar 2023 11:30:14 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C17C3E1C84 for ; Thu, 16 Mar 2023 08:27:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678980457; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6d/Oxl1Cdu8ybDL5tOgPD1gv3cW2fyMj4HBClKMadWo=; b=W8XgKHqMiRGGcI3gHR1CR3kn0LU1+vqV/kAiPEvHXpvWPD6+tUR+k1aB4HJq/tfbTlAVIV tO37vJu8tPXn8zw/OMK09jdQv/pYSMu8ZsWR6DY3v0BGuBg0MbhuK67QesRVr6S3IrOi8F AnleJHrauubA1Xxf+dDdjHoX0ZUa624= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-630-rdzabOPcMoiPR4mi-J1pAg-1; Thu, 16 Mar 2023 11:27:32 -0400 X-MC-Unique: rdzabOPcMoiPR4mi-J1pAg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0D8C828237D5; Thu, 16 Mar 2023 15:27:32 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id DC5562166B26; Thu, 16 Mar 2023 15:27:29 +0000 (UTC) From: David Howells To: Matthew Wilcox , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: David Howells , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Linus Torvalds , netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christine Caulfield , David Teigland , cluster-devel@redhat.com Subject: [RFC PATCH 26/28] dlm: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Thu, 16 Mar 2023 15:26:16 +0000 Message-Id: <20230316152618.711970-27-dhowells@redhat.com> In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com> References: <20230316152618.711970-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760539692891040284?= X-GMAIL-MSGID: =?utf-8?q?1760539692891040284?= When transmitting data, call down a layer using a single sendmsg with MSG_SPLICE_PAGES to indicate 
that content should be spliced rather than using sendpage.  This allows
->sendpage() to be replaced by something that can handle multiple multipage
folios in a single transaction.

Signed-off-by: David Howells
cc: Christine Caulfield
cc: David Teigland
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: cluster-devel@redhat.com
cc: netdev@vger.kernel.org
---
 fs/dlm/lowcomms.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index a9b14f81d655..9c0c691b6106 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1394,8 +1394,11 @@ int dlm_lowcomms_resend_msg(struct dlm_msg *msg)
 /* Send a message */
 static int send_to_sock(struct connection *con)
 {
-	const int msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL;
 	struct writequeue_entry *e;
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT | MSG_NOSIGNAL,
+	};
 	int len, offset, ret;
 
 	spin_lock_bh(&con->writequeue_lock);
@@ -1411,8 +1414,9 @@ static int send_to_sock(struct connection *con)
 	WARN_ON_ONCE(len == 0 && e->users == 0);
 	spin_unlock_bh(&con->writequeue_lock);
 
-	ret = kernel_sendpage(con->sock, e->page, offset, len,
-			      msg_flags);
+	bvec_set_page(&bvec, e->page, len, offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len);
+	ret = sock_sendmsg(con->sock, &msg);
 	trace_dlm_send(con->nodeid, ret);
 	if (ret == -EAGAIN || ret == 0) {
 		lock_sock(con->sock->sk);

From patchwork Thu Mar 16 15:26:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70860
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
 Christian Brauner, Linus Torvalds, netdev@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Trond Myklebust, Anna Schumaker, Chuck Lever,
 linux-nfs@vger.kernel.org
Subject: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
Date: Thu, 16 Mar 2023 15:26:17 +0000
Message-Id: <20230316152618.711970-28-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather
than performing several sendmsg and sendpage calls to transmit header, data
pages and trailer.

To make this work, the data is assembled in a bio_vec array and attached to
a BVEC-type iterator.  The bio_vec array has two extra slots before the
first for headers and one after the last for a trailer.  The headers and
trailer are copied into memory acquired from zcopy_alloc() which just breaks
a page up into small pieces that can be freed with put_page().

Signed-off-by: David Howells
cc: Trond Myklebust
cc: Anna Schumaker
cc: Chuck Lever
cc: Jeff Layton
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: linux-nfs@vger.kernel.org
cc: netdev@vger.kernel.org
---
 net/sunrpc/svcsock.c | 70 ++++++++++++--------------------------------
 net/sunrpc/xdr.c     | 24 ++++++++++++---
 2 files changed, 38 insertions(+), 56 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 03a4f5615086..1fa41ddbc40e 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -36,6 +36,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1060,16 +1061,8 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
 	return 0;	/* record not complete */
 }
 
-static int svc_tcp_send_kvec(struct socket *sock, const struct kvec *vec,
-			     int flags)
-{
-	return kernel_sendpage(sock, virt_to_page(vec->iov_base),
-			       offset_in_page(vec->iov_base),
-			       vec->iov_len, flags);
-}
-
 /*
- * kernel_sendpage() is used exclusively to reduce the number of
+ * MSG_SPLICE_PAGES is used exclusively to reduce the number of
  * copy operations in this path. Therefore the caller must ensure
  * that the pages backing @xdr are unchanging.
  *
@@ -1081,65 +1074,38 @@ static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr,
 {
 	const struct kvec *head = xdr->head;
 	const struct kvec *tail = xdr->tail;
-	struct kvec rm = {
-		.iov_base = &marker,
-		.iov_len = sizeof(marker),
-	};
 	struct msghdr msg = {
-		.msg_flags = 0,
+		.msg_flags = MSG_SPLICE_PAGES,
 	};
-	int ret;
+	int ret, n = xdr_buf_pagecount(xdr), size;
 
 	*sentp = 0;
 	ret = xdr_alloc_bvec(xdr, GFP_KERNEL);
 	if (ret < 0)
 		return ret;
 
-	ret = kernel_sendmsg(sock, &msg, &rm, 1, rm.iov_len);
+	ret = zcopy_memdup(sizeof(marker), &marker, &xdr->bvec[-2], GFP_KERNEL);
 	if (ret < 0)
 		return ret;
-	*sentp += ret;
-	if (ret != rm.iov_len)
-		return -EAGAIN;
 
-	ret = svc_tcp_send_kvec(sock, head, 0);
+	ret = zcopy_memdup(head->iov_len, head->iov_base, &xdr->bvec[-1], GFP_KERNEL);
 	if (ret < 0)
 		return ret;
-	*sentp += ret;
-	if (ret != head->iov_len)
-		goto out;
 
-	if (xdr->page_len) {
-		unsigned int offset, len, remaining;
-		struct bio_vec *bvec;
-
-		bvec = xdr->bvec + (xdr->page_base >> PAGE_SHIFT);
-		offset = offset_in_page(xdr->page_base);
-		remaining = xdr->page_len;
-		while (remaining > 0) {
-			len = min(remaining, bvec->bv_len - offset);
-			ret = kernel_sendpage(sock, bvec->bv_page,
-					      bvec->bv_offset + offset,
-					      len, 0);
-			if (ret < 0)
-				return ret;
-			*sentp += ret;
-			if (ret != len)
-				goto out;
-			remaining -= len;
-			offset = 0;
-			bvec++;
-		}
-	}
+	ret = zcopy_memdup(tail->iov_len, tail->iov_base, &xdr->bvec[n], GFP_KERNEL);
+	if (ret < 0)
+		return ret;
 
-	if (tail->iov_len) {
-		ret = svc_tcp_send_kvec(sock, tail, 0);
-		if (ret < 0)
-			return ret;
-		*sentp += ret;
-	}
+	size = sizeof(marker) + head->iov_len + xdr->page_len + tail->iov_len;
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, xdr->bvec - 2, n + 3, size);
 
-out:
+	ret = sock_sendmsg(sock, &msg);
+	if (ret < 0)
+		return ret;
+	if (ret > 0)
+		*sentp = ret;
+	if (ret != size)
+		return -EAGAIN;
 	return 0;
 }
 
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 36835b2f5446..6dff0b4f17b8 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -145,14 +145,19 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
 {
 	size_t i, n = xdr_buf_pagecount(buf);
 
-	if (n != 0 && buf->bvec == NULL) {
-		buf->bvec = kmalloc_array(n, sizeof(buf->bvec[0]), gfp);
+	if (buf->bvec == NULL) {
+		/* Allow for two headers and a trailer to be attached */
+		buf->bvec = kmalloc_array(n + 3, sizeof(buf->bvec[0]), gfp);
 		if (!buf->bvec)
 			return -ENOMEM;
+		buf->bvec += 2;
+		buf->bvec[-2].bv_page = NULL;
+		buf->bvec[-1].bv_page = NULL;
 		for (i = 0; i < n; i++) {
 			bvec_set_page(&buf->bvec[i], buf->pages[i], PAGE_SIZE,
 				      0);
 		}
+		buf->bvec[n].bv_page = NULL;
 	}
 	return 0;
 }
@@ -160,8 +165,19 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
 void
 xdr_free_bvec(struct xdr_buf *buf)
 {
-	kfree(buf->bvec);
-	buf->bvec = NULL;
+	if (buf->bvec) {
+		size_t n = xdr_buf_pagecount(buf);
+
+		if (buf->bvec[-2].bv_page)
+			put_page(buf->bvec[-2].bv_page);
+		if (buf->bvec[-1].bv_page)
+			put_page(buf->bvec[-1].bv_page);
+		if (buf->bvec[n].bv_page)
+			put_page(buf->bvec[n].bv_page);
+		buf->bvec -= 2;
+		kfree(buf->bvec);
+		buf->bvec = NULL;
+	}
 }
 
 /**

From patchwork Thu Mar 16 15:26:18 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 70864
From: David Howells
To: Matthew Wilcox, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni
Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
 Christian Brauner, Linus Torvalds, netdev@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, bpf@vger.kernel.org, dccp@vger.kernel.org,
 linux-afs@lists.infradead.org, linux-arm-msm@vger.kernel.org,
 linux-can@vger.kernel.org, linux-crypto@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-hams@vger.kernel.org,
 linux-rdma@vger.kernel.org, linux-sctp@vger.kernel.org,
 linux-wpan@vger.kernel.org, linux-x25@vger.kernel.org,
 mptcp@lists.linux.dev, rds-devel@oss.oracle.com,
 tipc-discussion@lists.sourceforge.net,
 virtualization@lists.linux-foundation.org
Subject: [RFC PATCH 28/28] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)
Date: Thu, 16 Mar 2023 15:26:18 +0000
Message-Id: <20230316152618.711970-29-dhowells@redhat.com>
In-Reply-To: <20230316152618.711970-1-dhowells@redhat.com>
References: <20230316152618.711970-1-dhowells@redhat.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

[!] Note: This is a work in progress.  At the moment, some things won't
    build if this patch is applied: nvme, kcm, smc, tls.

Remove ->sendpage() and ->sendpage_locked().  sendmsg() with
MSG_SPLICE_PAGES should be used instead.  This allows multiple pages and
multipage folios to be passed through.

Signed-off-by: David Howells
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: bpf@vger.kernel.org
cc: dccp@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: linux-arm-msm@vger.kernel.org
cc: linux-can@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: linux-doc@vger.kernel.org
cc: linux-hams@vger.kernel.org
cc: linux-kernel@vger.kernel.org
cc: linux-rdma@vger.kernel.org
cc: linux-sctp@vger.kernel.org
cc: linux-wpan@vger.kernel.org
cc: linux-x25@vger.kernel.org
cc: mptcp@lists.linux.dev
cc: netdev@vger.kernel.org
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
Acked-by: Marc Kleine-Budde # for net/can
---
 Documentation/networking/scaling.rst |   4 +-
 crypto/af_alg.c                      |  29 ------
 crypto/algif_aead.c                  |  22 +----
 crypto/algif_rng.c                   |   2 -
 crypto/algif_skcipher.c              |  14 ---
 include/linux/net.h                  |   8 --
 include/net/inet_common.h            |   2 -
 include/net/sock.h                   |   6 --
 net/appletalk/ddp.c                  |   1 -
 net/atm/pvc.c                        |   1 -
 net/atm/svc.c                        |   1 -
 net/ax25/af_ax25.c                   |   1 -
 net/caif/caif_socket.c               |   2 -
 net/can/bcm.c                        |   1 -
 net/can/isotp.c                      |   1 -
 net/can/j1939/socket.c               |   1 -
 net/can/raw.c                        |   1 -
 net/core/sock.c                      |  35 +------
 net/dccp/ipv4.c                      |   1 -
 net/dccp/ipv6.c                      |   1 -
 net/ieee802154/socket.c              |   2 -
 net/ipv4/af_inet.c                   |  21 ----
 net/ipv4/tcp.c                       |  36 -------
 net/ipv4/tcp_bpf.c                   |  21 +---
 net/ipv4/tcp_ipv4.c                  |   1 -
 net/ipv4/udp.c                       |  22 -----
 net/ipv4/udp_impl.h                  |   2 -
 net/ipv4/udplite.c                   |   1 -
 net/ipv6/af_inet6.c                  |   3 -
 net/ipv6/raw.c                       |   1 -
 net/ipv6/tcp_ipv6.c                  |   1 -
 net/key/af_key.c                     |   1 -
 net/l2tp/l2tp_ip.c                   |   1 -
 net/l2tp/l2tp_ip6.c                  |   1 -
 net/llc/af_llc.c                     |   1 -
 net/mctp/af_mctp.c                   |   1 -
 net/mptcp/protocol.c                 |   2 -
 net/netlink/af_netlink.c             |   1 -
 net/netrom/af_netrom.c               |   1 -
 net/packet/af_packet.c               |   2 -
 net/phonet/socket.c                  |   2 -
 net/qrtr/af_qrtr.c                   |   1 -
 net/rds/af_rds.c                     |   1 -
 net/rose/af_rose.c                   |   1 -
 net/rxrpc/af_rxrpc.c                 |   1 -
 net/sctp/protocol.c                  |   1 -
 net/socket.c                         |  48 ---------
 net/tipc/socket.c                    |   3 -
 net/unix/af_unix.c                   | 139 ---------------------------
 net/vmw_vsock/af_vsock.c             |   3 -
 net/x25/af_x25.c                     |   1 -
 net/xdp/xsk.c                        |   1 -
 52 files changed, 9 insertions(+), 449 deletions(-)

diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 3d435caa3ef2..92c9fb46d6a2 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -269,8 +269,8 @@ a single application thread handles flows with many different flow hashes.
 rps_sock_flow_table is a global flow table that contains the *desired* CPU
 for flows: the CPU that is currently processing the flow in userspace.
 Each table value is a CPU index that is updated during calls to recvmsg
-and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
-and tcp_splice_read()).
+and sendmsg (specifically, inet_recvmsg(), inet_sendmsg() and
+tcp_splice_read()).
 
 When the scheduler moves a thread to a new CPU while it has outstanding
 receive packets on the old CPU, packets may arrive out of order. To
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 0e77fce60876..225c90657f58 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -483,7 +483,6 @@ static const struct proto_ops alg_proto_ops = {
 	.listen = sock_no_listen,
 	.shutdown = sock_no_shutdown,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 	.sendmsg = sock_no_sendmsg,
 	.recvmsg = sock_no_recvmsg,
 
@@ -1135,34 +1134,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 }
 EXPORT_SYMBOL_GPL(af_alg_sendmsg);
 
-/**
- * af_alg_sendpage - sendpage system call handler
- * @sock: socket of connection to user space to write to
- * @page: data to send
- * @offset: offset into page to begin sending
- * @size: length of data
- * @flags: message send/receive flags
- *
- * This is a generic implementation of sendpage to fill ctx->tsgl_list.
- */
-ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
-			int offset, size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return sock_sendmsg(sock, &msg);
-}
-EXPORT_SYMBOL_GPL(af_alg_sendpage);
-
 /**
  * af_alg_free_resources - release resources required for crypto request
  * @areq: Request holding the TX and RX SGL
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 279eb17a1dfc..b65baefe6123 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -9,10 +9,10 @@
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage. Filling up
- * the TX SGL does not cause a crypto operation -- the data will only be
- * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
- * provide a buffer which is tracked with the RX SGL.
+ * filled by user space with the data submitted via sendmsg (maybe with
+ * MSG_SPLICE_PAGES).  Filling up the TX SGL does not cause a crypto operation
+ * -- the data will only be tracked by the kernel. Upon receipt of one recvmsg
+ * call, the caller must provide a buffer which is tracked with the RX SGL.
  *
  * During the processing of the recvmsg operation, the cipher request is
  * allocated and prepared. As part of the recvmsg operation, the processed
@@ -368,7 +368,6 @@ static struct proto_ops algif_aead_ops = {
 
 	.release = af_alg_release,
 	.sendmsg = aead_sendmsg,
-	.sendpage = af_alg_sendpage,
 	.recvmsg = aead_recvmsg,
 	.poll = af_alg_poll,
 };
@@ -420,18 +419,6 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return aead_sendmsg(sock, msg, size);
 }
 
-static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page,
-				   int offset, size_t size, int flags)
-{
-	int err;
-
-	err = aead_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int aead_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 			      size_t ignored, int flags)
 {
@@ -459,7 +446,6 @@ static struct proto_ops algif_aead_ops_nokey = {
 
 	.release = af_alg_release,
 	.sendmsg = aead_sendmsg_nokey,
-	.sendpage = aead_sendpage_nokey,
 	.recvmsg = aead_recvmsg_nokey,
 	.poll = af_alg_poll,
 };
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 407408c43730..10c41adac3b1 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -174,7 +174,6 @@ static struct proto_ops algif_rng_ops = {
 	.bind = sock_no_bind,
 	.accept = sock_no_accept,
 	.sendmsg = sock_no_sendmsg,
-	.sendpage = sock_no_sendpage,
 
 	.release = af_alg_release,
 	.recvmsg = rng_recvmsg,
@@ -192,7 +191,6 @@ static struct proto_ops __maybe_unused algif_rng_test_ops = {
 	.mmap = sock_no_mmap,
 	.bind = sock_no_bind,
 	.accept = sock_no_accept,
-	.sendpage = sock_no_sendpage,
 
 	.release = af_alg_release,
 	.recvmsg = rng_test_recvmsg,
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 021f9ce7e87c..b34e20400e80 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -194,7 +194,6 @@ static struct proto_ops algif_skcipher_ops = {
 
 	.release = af_alg_release,
 	.sendmsg = skcipher_sendmsg,
-	.sendpage = af_alg_sendpage,
 	.recvmsg = skcipher_recvmsg,
 	.poll = af_alg_poll,
 };
@@ -246,18 +245,6 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return skcipher_sendmsg(sock, msg, size);
 }
 
-static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page,
-				       int offset, size_t size, int flags)
-{
-	int err;
-
-	err = skcipher_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int skcipher_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 				  size_t ignored, int flags)
 {
@@ -285,7 +272,6 @@ static struct proto_ops algif_skcipher_ops_nokey = {
 
 	.release = af_alg_release,
 	.sendmsg = skcipher_sendmsg_nokey,
-	.sendpage = skcipher_sendpage_nokey,
 	.recvmsg = skcipher_recvmsg_nokey,
 	.poll = af_alg_poll,
 };
diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..e5794968ac9f 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -206,8 +206,6 @@ struct proto_ops {
 				      size_t total_len, int flags);
 	int (*mmap) (struct file *file, struct socket *sock,
 				      struct vm_area_struct * vma);
-	ssize_t (*sendpage) (struct socket *sock, struct page *page,
-				      int offset, size_t size, int flags);
 	ssize_t (*splice_read)(struct socket *sock, loff_t *ppos,
 				       struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 	int (*set_peek_off)(struct sock *sk, int val);
@@ -220,8 +218,6 @@ struct proto_ops {
 				     sk_read_actor_t recv_actor);
 	/* This is different from read_sock(), it reads an entire skb at a time. */
 	int (*read_skb)(struct sock *sk, skb_read_actor_t recv_actor);
-	int (*sendpage_locked)(struct sock *sk, struct page *page,
-					   int offset, size_t size, int flags);
 	int (*sendmsg_locked)(struct sock *sk, struct msghdr *msg,
 					  size_t size);
 	int (*set_rcvlowat)(struct sock *sk, int val);
@@ -339,10 +335,6 @@ int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags);
 int kernel_getsockname(struct socket *sock, struct sockaddr *addr);
 int kernel_getpeername(struct socket *sock, struct sockaddr *addr);
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
-		    size_t size, int flags);
-int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			   size_t size, int flags);
 int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how);
 
 /* Routine returns the IP overhead imposed by a (caller-protected) socket. */
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index cec453c18f1d..054c3388fa51 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -33,8 +33,6 @@ int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern);
 int inet_send_prepare(struct sock *sk);
 int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
-		      size_t size, int flags);
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		 int flags);
 int inet_shutdown(struct socket *sock, int how);
diff --git a/include/net/sock.h b/include/net/sock.h
index 573f2bf7e0de..4618cd21e16b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1265,8 +1265,6 @@ struct proto {
 					   size_t len);
 	int (*recvmsg)(struct sock *sk, struct msghdr *msg,
 					   size_t len, int flags, int *addr_len);
-	int (*sendpage)(struct sock *sk, struct page *page,
-					int offset, size_t size, int flags);
 	int (*bind)(struct sock *sk,
 					struct sockaddr *addr, int addr_len);
 	int (*bind_add)(struct sock *sk,
@@ -1906,10
+1904,6 @@ int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len); int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int); int sock_no_mmap(struct file *file, struct socket *sock, struct vm_area_struct *vma); -ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags); -ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page, - int offset, size_t size, int flags); /* * Functions to fill in entries in struct proto_ops when a protocol diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c index a06f4d4a6f47..8978fb6212ff 100644 --- a/net/appletalk/ddp.c +++ b/net/appletalk/ddp.c @@ -1929,7 +1929,6 @@ static const struct proto_ops atalk_dgram_ops = { .sendmsg = atalk_sendmsg, .recvmsg = atalk_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct notifier_block ddp_notifier = { diff --git a/net/atm/pvc.c b/net/atm/pvc.c index 53e7d3f39e26..66d9a9bd5896 100644 --- a/net/atm/pvc.c +++ b/net/atm/pvc.c @@ -126,7 +126,6 @@ static const struct proto_ops pvc_proto_ops = { .sendmsg = vcc_sendmsg, .recvmsg = vcc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; diff --git a/net/atm/svc.c b/net/atm/svc.c index 4a02bcaad279..289240fe234e 100644 --- a/net/atm/svc.c +++ b/net/atm/svc.c @@ -649,7 +649,6 @@ static const struct proto_ops svc_proto_ops = { .sendmsg = vcc_sendmsg, .recvmsg = vcc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index d8da400cb4de..5db805d5f74d 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -2022,7 +2022,6 @@ static const struct proto_ops ax25_proto_ops = { .sendmsg = ax25_sendmsg, .recvmsg = ax25_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c index 4eebcc66c19a..9c82698da4f5 100644 --- a/net/caif/caif_socket.c +++ b/net/caif/caif_socket.c @@ -976,7 +976,6 @@ 
static const struct proto_ops caif_seqpacket_ops = { .sendmsg = caif_seqpkt_sendmsg, .recvmsg = caif_seqpkt_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const struct proto_ops caif_stream_ops = { @@ -996,7 +995,6 @@ static const struct proto_ops caif_stream_ops = { .sendmsg = caif_stream_sendmsg, .recvmsg = caif_stream_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* This function is called when a socket is finally destroyed. */ diff --git a/net/can/bcm.c b/net/can/bcm.c index 27706f6ace34..65a946a36d92 100644 --- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -1699,7 +1699,6 @@ static const struct proto_ops bcm_ops = { .sendmsg = bcm_sendmsg, .recvmsg = bcm_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto bcm_proto __read_mostly = { diff --git a/net/can/isotp.c b/net/can/isotp.c index 9bc344851704..0c3d11c29a2b 100644 --- a/net/can/isotp.c +++ b/net/can/isotp.c @@ -1633,7 +1633,6 @@ static const struct proto_ops isotp_ops = { .sendmsg = isotp_sendmsg, .recvmsg = isotp_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto isotp_proto __read_mostly = { diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c index 7e90f9e61d9b..2bfe4f79bb67 100644 --- a/net/can/j1939/socket.c +++ b/net/can/j1939/socket.c @@ -1301,7 +1301,6 @@ static const struct proto_ops j1939_ops = { .sendmsg = j1939_sk_sendmsg, .recvmsg = j1939_sk_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto j1939_proto __read_mostly = { diff --git a/net/can/raw.c b/net/can/raw.c index f64469b98260..15c79b079184 100644 --- a/net/can/raw.c +++ b/net/can/raw.c @@ -962,7 +962,6 @@ static const struct proto_ops raw_ops = { .sendmsg = raw_sendmsg, .recvmsg = raw_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto raw_proto __read_mostly = { diff --git a/net/core/sock.c b/net/core/sock.c index 341c565dbc26..c2ae77bb2075 100644 --- 
a/net/core/sock.c +++ b/net/core/sock.c @@ -3223,36 +3223,6 @@ void __receive_sock(struct file *file) } } -ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) -{ - ssize_t res; - struct msghdr msg = {.msg_flags = flags}; - struct kvec iov; - char *kaddr = kmap(page); - iov.iov_base = kaddr + offset; - iov.iov_len = size; - res = kernel_sendmsg(sock, &msg, &iov, 1, size); - kunmap(page); - return res; -} -EXPORT_SYMBOL(sock_no_sendpage); - -ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page, - int offset, size_t size, int flags) -{ - ssize_t res; - struct msghdr msg = {.msg_flags = flags}; - struct kvec iov; - char *kaddr = kmap(page); - - iov.iov_base = kaddr + offset; - iov.iov_len = size; - res = kernel_sendmsg_locked(sk, &msg, &iov, 1, size); - kunmap(page); - return res; -} -EXPORT_SYMBOL(sock_no_sendpage_locked); - /* * Default Socket Callbacks */ @@ -4008,7 +3978,7 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto) { seq_printf(seq, "%-9s %4u %6d %6ld %-3s %6u %-3s %-10s " - "%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n", + "%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n", proto->name, proto->obj_size, sock_prot_inuse_get(seq_file_net(seq), proto), @@ -4029,7 +3999,6 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto) proto_method_implemented(proto->getsockopt), proto_method_implemented(proto->sendmsg), proto_method_implemented(proto->recvmsg), - proto_method_implemented(proto->sendpage), proto_method_implemented(proto->bind), proto_method_implemented(proto->backlog_rcv), proto_method_implemented(proto->hash), @@ -4050,7 +4019,7 @@ static int proto_seq_show(struct seq_file *seq, void *v) "maxhdr", "slab", "module", - "cl co di ac io in de sh ss gs se re sp bi br ha uh gp em\n"); + "cl co di ac io in de sh ss gs se re bi br ha uh gp em\n"); else proto_seq_printf(seq, list_entry(v, struct 
proto, node)); return 0; diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index b780827f5e0a..ea808de374ea 100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -1008,7 +1008,6 @@ static const struct proto_ops inet_dccp_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct inet_protosw dccp_v4_protosw = { diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index b9d7c3dd1cb3..23eb8159e3cd 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -1085,7 +1085,6 @@ static const struct proto_ops inet6_dccp_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c index 1fa2fe041ec0..1238f036117f 100644 --- a/net/ieee802154/socket.c +++ b/net/ieee802154/socket.c @@ -426,7 +426,6 @@ static const struct proto_ops ieee802154_raw_ops = { .sendmsg = ieee802154_sock_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* DGRAM Sockets (802.15.4 dataframes) */ @@ -990,7 +989,6 @@ static const struct proto_ops ieee802154_dgram_ops = { .sendmsg = ieee802154_sock_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static void ieee802154_sock_destruct(struct sock *sk) diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 8db6747f892f..869b49933f15 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -827,23 +827,6 @@ int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) } EXPORT_SYMBOL(inet_sendmsg); -ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags) -{ - struct sock *sk = sock->sk; - const struct proto *prot; - - if (unlikely(inet_send_prepare(sk))) - return -EAGAIN; - - /* IPV6_ADDRFORM can change sk->sk_prot under us. 
*/ - prot = READ_ONCE(sk->sk_prot); - if (prot->sendpage) - return prot->sendpage(sk, page, offset, size, flags); - return sock_no_sendpage(sock, page, offset, size, flags); -} -EXPORT_SYMBOL(inet_sendpage); - INDIRECT_CALLABLE_DECLARE(int udp_recvmsg(struct sock *, struct msghdr *, size_t, int, int *)); int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, @@ -1046,12 +1029,10 @@ const struct proto_ops inet_stream_ops = { #ifdef CONFIG_MMU .mmap = tcp_mmap, #endif - .sendpage = inet_sendpage, .splice_read = tcp_splice_read, .read_sock = tcp_read_sock, .read_skb = tcp_read_skb, .sendmsg_locked = tcp_sendmsg_locked, - .sendpage_locked = tcp_sendpage_locked, .peek_len = tcp_peek_len, #ifdef CONFIG_COMPAT .compat_ioctl = inet_compat_ioctl, @@ -1080,7 +1061,6 @@ const struct proto_ops inet_dgram_ops = { .read_skb = udp_read_skb, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, .set_peek_off = sk_set_peek_off, #ifdef CONFIG_COMPAT .compat_ioctl = inet_compat_ioctl, @@ -1111,7 +1091,6 @@ static const struct proto_ops inet_sockraw_ops = { .sendmsg = inet_sendmsg, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet_compat_ioctl, #endif diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index f1454e4497df..26fa387f1084 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -971,42 +971,6 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } -int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = flags | MSG_SPLICE_PAGES, - }; - - if (!(sk->sk_route_caps & NETIF_F_SG)) - return sock_no_sendpage_locked(sk, page, offset, size, flags); - - tcp_rate_check_app_limited(sk); /* is sending application-limited? 
*/ - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - return tcp_sendmsg_locked(sk, &msg, size); -} -EXPORT_SYMBOL_GPL(tcp_sendpage_locked); - -int tcp_sendpage(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - int ret; - - lock_sock(sk); - ret = tcp_sendpage_locked(sk, page, offset, size, flags); - release_sock(sk); - - return ret; -} -EXPORT_SYMBOL(tcp_sendpage); - void tcp_free_fastopen_req(struct tcp_sock *tp) { if (tp->fastopen_req) { diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index de37a4372437..ab83cfb9de22 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -482,23 +482,6 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) return copied ? copied : err; } -static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = flags | MSG_SPLICE_PAGES, - }; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - return tcp_bpf_sendmsg(sk, &msg, size); -} - enum { TCP_BPF_IPV4, TCP_BPF_IPV6, @@ -528,7 +511,6 @@ static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS], prot[TCP_BPF_TX] = prot[TCP_BPF_BASE]; prot[TCP_BPF_TX].sendmsg = tcp_bpf_sendmsg; - prot[TCP_BPF_TX].sendpage = tcp_bpf_sendpage; prot[TCP_BPF_RX] = prot[TCP_BPF_BASE]; prot[TCP_BPF_RX].recvmsg = tcp_bpf_recvmsg_parser; @@ -563,8 +545,7 @@ static int tcp_bpf_assert_proto_ops(struct proto *ops) * indeed valid assumptions. */ return ops->recvmsg == tcp_recvmsg && - ops->sendmsg == tcp_sendmsg && - ops->sendpage == tcp_sendpage ? 0 : -ENOTSUPP; + ops->sendmsg == tcp_sendmsg ? 
0 : -ENOTSUPP; } int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index ea370afa70ed..5c2e1c1ca329 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -3112,7 +3112,6 @@ struct proto tcp_prot = { .keepalive = tcp_set_keepalive, .recvmsg = tcp_recvmsg, .sendmsg = tcp_sendmsg, - .sendpage = tcp_sendpage, .backlog_rcv = tcp_v4_do_rcv, .release_cb = tcp_release_cb, .hash = inet_hash, diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 097feb92e215..85bd5960f7ef 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1329,27 +1329,6 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) } EXPORT_SYMBOL(udp_sendmsg); -int udp_sendpage(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE - }; - int ret; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - lock_sock(sk); - ret = udp_sendmsg(sk, &msg, size); - release_sock(sk); - return ret; -} - #define UDP_SKB_IS_STATELESS 0x80000000 /* all head states (dst, sk, nf conntrack) except skb extensions are @@ -2926,7 +2905,6 @@ struct proto udp_prot = { .getsockopt = udp_getsockopt, .sendmsg = udp_sendmsg, .recvmsg = udp_recvmsg, - .sendpage = udp_sendpage, .release_cb = ip4_datagram_release_cb, .hash = udp_lib_hash, .unhash = udp_lib_unhash, diff --git a/net/ipv4/udp_impl.h b/net/ipv4/udp_impl.h index 4ba7a88a1b1d..e1ff3a375996 100644 --- a/net/ipv4/udp_impl.h +++ b/net/ipv4/udp_impl.h @@ -19,8 +19,6 @@ int udp_getsockopt(struct sock *sk, int level, int optname, int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len); -int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, - int flags); void udp_destroy_sock(struct sock 
*sk); #ifdef CONFIG_PROC_FS diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c index e0c9cc39b81e..69870f0afc6c 100644 --- a/net/ipv4/udplite.c +++ b/net/ipv4/udplite.c @@ -54,7 +54,6 @@ struct proto udplite_prot = { .getsockopt = udp_getsockopt, .sendmsg = udp_sendmsg, .recvmsg = udp_recvmsg, - .sendpage = udp_sendpage, .hash = udp_lib_hash, .unhash = udp_lib_unhash, .rehash = udp_v4_rehash, diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 38689bedfce7..769c76d59053 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -695,9 +695,7 @@ const struct proto_ops inet6_stream_ops = { #ifdef CONFIG_MMU .mmap = tcp_mmap, #endif - .sendpage = inet_sendpage, .sendmsg_locked = tcp_sendmsg_locked, - .sendpage_locked = tcp_sendpage_locked, .splice_read = tcp_splice_read, .read_sock = tcp_read_sock, .read_skb = tcp_read_skb, @@ -728,7 +726,6 @@ const struct proto_ops inet6_dgram_ops = { .recvmsg = inet6_recvmsg, /* retpoline's sake */ .read_skb = udp_read_skb, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .set_peek_off = sk_set_peek_off, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index bac9ba747bde..c6c062678c0e 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -1298,7 +1298,6 @@ const struct proto_ops inet6_sockraw_ops = { .sendmsg = inet_sendmsg, /* ok */ .recvmsg = sock_common_recvmsg, /* ok */ .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 1bf93b61aa06..03ba1e389901 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -2151,7 +2151,6 @@ struct proto tcpv6_prot = { .keepalive = tcp_set_keepalive, .recvmsg = tcp_recvmsg, .sendmsg = tcp_sendmsg, - .sendpage = tcp_sendpage, .backlog_rcv = tcp_v6_do_rcv, .release_cb = tcp_release_cb, .hash = inet6_hash, diff --git a/net/key/af_key.c b/net/key/af_key.c index a815f5ab4c49..bf59d42dc697 100644 --- 
a/net/key/af_key.c +++ b/net/key/af_key.c @@ -3757,7 +3757,6 @@ static const struct proto_ops pfkey_ops = { .listen = sock_no_listen, .shutdown = sock_no_shutdown, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, /* Now the operations that really occur. */ .release = pfkey_release, diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c index 4db5a554bdbd..d0dcbe3a4cd7 100644 --- a/net/l2tp/l2tp_ip.c +++ b/net/l2tp/l2tp_ip.c @@ -625,7 +625,6 @@ static const struct proto_ops l2tp_ip_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct inet_protosw l2tp_ip_protosw = { diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c index 2478aa60145f..49296ce14a90 100644 --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -751,7 +751,6 @@ static const struct proto_ops l2tp_ip6_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c index da7fe94bea2e..addd94da2a81 100644 --- a/net/llc/af_llc.c +++ b/net/llc/af_llc.c @@ -1230,7 +1230,6 @@ static const struct proto_ops llc_ui_ops = { .sendmsg = llc_ui_sendmsg, .recvmsg = llc_ui_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const char llc_proc_err_msg[] __initconst = diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c index 3150f3f0c872..c6fe2e6b85dd 100644 --- a/net/mctp/af_mctp.c +++ b/net/mctp/af_mctp.c @@ -485,7 +485,6 @@ static const struct proto_ops mctp_dgram_ops = { .sendmsg = mctp_sendmsg, .recvmsg = mctp_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = mctp_compat_ioctl, #endif diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 3ad9c46202fc..ade89b8d0082 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3816,7 +3816,6 @@ static const struct proto_ops 
mptcp_stream_ops = { .sendmsg = inet_sendmsg, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, }; static struct inet_protosw mptcp_protosw = { @@ -3911,7 +3910,6 @@ static const struct proto_ops mptcp_v6_stream_ops = { .sendmsg = inet6_sendmsg, .recvmsg = inet6_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index c64277659753..f70073a3bb49 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -2841,7 +2841,6 @@ static const struct proto_ops netlink_ops = { .sendmsg = netlink_sendmsg, .recvmsg = netlink_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const struct net_proto_family netlink_family_ops = { diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c index 5a4cb796150f..eb8ccbd58df7 100644 --- a/net/netrom/af_netrom.c +++ b/net/netrom/af_netrom.c @@ -1364,7 +1364,6 @@ static const struct proto_ops nr_proto_ops = { .sendmsg = nr_sendmsg, .recvmsg = nr_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct notifier_block nr_dev_notifier = { diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index d4e76e2ae153..385bd4982b80 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -4604,7 +4604,6 @@ static const struct proto_ops packet_ops_spkt = { .sendmsg = packet_sendmsg_spkt, .recvmsg = packet_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const struct proto_ops packet_ops = { @@ -4626,7 +4625,6 @@ static const struct proto_ops packet_ops = { .sendmsg = packet_sendmsg, .recvmsg = packet_recvmsg, .mmap = packet_mmap, - .sendpage = sock_no_sendpage, }; static const struct net_proto_family packet_family_ops = { diff --git a/net/phonet/socket.c b/net/phonet/socket.c index 71e2caf6ab85..a246f7d0a817 100644 --- a/net/phonet/socket.c +++ b/net/phonet/socket.c @@ -441,7 +441,6 @@ 
const struct proto_ops phonet_dgram_ops = { .sendmsg = pn_socket_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; const struct proto_ops phonet_stream_ops = { @@ -462,7 +461,6 @@ const struct proto_ops phonet_stream_ops = { .sendmsg = pn_socket_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; EXPORT_SYMBOL(phonet_stream_ops); diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c index 5c2fb992803b..5bb7d680bd5f 100644 --- a/net/qrtr/af_qrtr.c +++ b/net/qrtr/af_qrtr.c @@ -1240,7 +1240,6 @@ static const struct proto_ops qrtr_proto_ops = { .shutdown = sock_no_shutdown, .release = qrtr_release, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto qrtr_proto = { diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c index 3ff6995244e5..01c4cdfef45d 100644 --- a/net/rds/af_rds.c +++ b/net/rds/af_rds.c @@ -653,7 +653,6 @@ static const struct proto_ops rds_proto_ops = { .sendmsg = rds_sendmsg, .recvmsg = rds_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static void rds_sock_destruct(struct sock *sk) diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c index ca2b17f32670..49dafe9ac72f 100644 --- a/net/rose/af_rose.c +++ b/net/rose/af_rose.c @@ -1496,7 +1496,6 @@ static const struct proto_ops rose_proto_ops = { .sendmsg = rose_sendmsg, .recvmsg = rose_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct notifier_block rose_dev_notifier = { diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c index 102f5cbff91a..182495804f8f 100644 --- a/net/rxrpc/af_rxrpc.c +++ b/net/rxrpc/af_rxrpc.c @@ -938,7 +938,6 @@ static const struct proto_ops rxrpc_rpc_ops = { .sendmsg = rxrpc_sendmsg, .recvmsg = rxrpc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto rxrpc_proto = { diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index c365df24ad33..acb2d2a69268 100644 --- a/net/sctp/protocol.c 
+++ b/net/sctp/protocol.c @@ -1135,7 +1135,6 @@ static const struct proto_ops inet_seqpacket_ops = { .sendmsg = inet_sendmsg, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* Registration with AF_INET family. */ diff --git a/net/socket.c b/net/socket.c index 1b48a976b8cc..130d6ce7f82d 100644 --- a/net/socket.c +++ b/net/socket.c @@ -3541,54 +3541,6 @@ int kernel_getpeername(struct socket *sock, struct sockaddr *addr) } EXPORT_SYMBOL(kernel_getpeername); -/** - * kernel_sendpage - send a &page through a socket (kernel space) - * @sock: socket - * @page: page - * @offset: page offset - * @size: total size in bytes - * @flags: flags (MSG_DONTWAIT, ...) - * - * Returns the total amount sent in bytes or an error. - */ - -int kernel_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags) -{ - if (sock->ops->sendpage) { - /* Warn in case the improper page to zero-copy send */ - WARN_ONCE(!sendpage_ok(page), "improper page for zero-copy send"); - return sock->ops->sendpage(sock, page, offset, size, flags); - } - return sock_no_sendpage(sock, page, offset, size, flags); -} -EXPORT_SYMBOL(kernel_sendpage); - -/** - * kernel_sendpage_locked - send a &page through the locked sock (kernel space) - * @sk: sock - * @page: page - * @offset: page offset - * @size: total size in bytes - * @flags: flags (MSG_DONTWAIT, ...) - * - * Returns the total amount sent in bytes or an error. - * Caller must hold @sk. 
- */ - -int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct socket *sock = sk->sk_socket; - - if (sock->ops->sendpage_locked) - return sock->ops->sendpage_locked(sk, page, offset, size, - flags); - - return sock_no_sendpage_locked(sk, page, offset, size, flags); -} -EXPORT_SYMBOL(kernel_sendpage_locked); - /** * kernel_sock_shutdown - shut down part of a full-duplex connection (kernel space) * @sock: socket diff --git a/net/tipc/socket.c b/net/tipc/socket.c index 37edfe10f8c6..d2072fbf3272 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -3375,7 +3375,6 @@ static const struct proto_ops msg_ops = { .sendmsg = tipc_sendmsg, .recvmsg = tipc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage }; static const struct proto_ops packet_ops = { @@ -3396,7 +3395,6 @@ static const struct proto_ops packet_ops = { .sendmsg = tipc_send_packet, .recvmsg = tipc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage }; static const struct proto_ops stream_ops = { @@ -3417,7 +3415,6 @@ static const struct proto_ops stream_ops = { .sendmsg = tipc_sendstream, .recvmsg = tipc_recvstream, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage }; static const struct net_proto_family tipc_family_ops = { diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 6f3454db9c53..407f449df564 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -758,8 +758,6 @@ static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned lon static int unix_shutdown(struct socket *, int); static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t); static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int); -static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset, - size_t size, int flags); static ssize_t unix_stream_splice_read(struct socket *, loff_t *ppos, struct pipe_inode_info *, size_t size, unsigned int flags); @@ -852,7 +850,6 @@ static const 
struct proto_ops unix_stream_ops = {
 	.recvmsg = unix_stream_recvmsg,
 	.read_skb = unix_stream_read_skb,
 	.mmap = sock_no_mmap,
-	.sendpage = unix_stream_sendpage,
 	.splice_read = unix_stream_splice_read,
 	.set_peek_off = unix_set_peek_off,
 	.show_fdinfo = unix_show_fdinfo,
@@ -878,7 +875,6 @@ static const struct proto_ops unix_dgram_ops = {
 	.read_skb = unix_read_skb,
 	.recvmsg = unix_dgram_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 	.set_peek_off = unix_set_peek_off,
 	.show_fdinfo = unix_show_fdinfo,
 };
@@ -902,7 +898,6 @@ static const struct proto_ops unix_seqpacket_ops = {
 	.sendmsg = unix_seqpacket_sendmsg,
 	.recvmsg = unix_seqpacket_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 	.set_peek_off = unix_set_peek_off,
 	.show_fdinfo = unix_show_fdinfo,
 };
@@ -1839,24 +1834,6 @@ static void maybe_add_creds(struct sk_buff *skb, const struct socket *sock,
 	}
 }
 
-static int maybe_init_creds(struct scm_cookie *scm,
-			    struct socket *socket,
-			    const struct sock *other)
-{
-	int err;
-	struct msghdr msg = { .msg_controllen = 0 };
-
-	err = scm_send(socket, &msg, scm, false);
-	if (err)
-		return err;
-
-	if (unix_passcred_enabled(socket, other)) {
-		scm->pid = get_pid(task_tgid(current));
-		current_uid_gid(&scm->creds.uid, &scm->creds.gid);
-	}
-	return err;
-}
-
 static bool unix_skb_scm_eq(struct sk_buff *skb,
 			    struct scm_cookie *scm)
 {
@@ -2318,122 +2295,6 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	return sent ? : err;
 }
 
-static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
-				    int offset, size_t size, int flags)
-{
-	int err;
-	bool send_sigpipe = false;
-	bool init_scm = true;
-	struct scm_cookie scm;
-	struct sock *other, *sk = socket->sk;
-	struct sk_buff *skb, *newskb = NULL, *tail = NULL;
-
-	if (flags & MSG_OOB)
-		return -EOPNOTSUPP;
-
-	other = unix_peer(sk);
-	if (!other || sk->sk_state != TCP_ESTABLISHED)
-		return -ENOTCONN;
-
-	if (false) {
-alloc_skb:
-		unix_state_unlock(other);
-		mutex_unlock(&unix_sk(other)->iolock);
-		newskb = sock_alloc_send_pskb(sk, 0, 0, flags & MSG_DONTWAIT,
-					      &err, 0);
-		if (!newskb)
-			goto err;
-	}
-
-	/* we must acquire iolock as we modify already present
-	 * skbs in the sk_receive_queue and mess with skb->len
-	 */
-	err = mutex_lock_interruptible(&unix_sk(other)->iolock);
-	if (err) {
-		err = flags & MSG_DONTWAIT ? -EAGAIN : -ERESTARTSYS;
-		goto err;
-	}
-
-	if (sk->sk_shutdown & SEND_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_unlock;
-	}
-
-	unix_state_lock(other);
-
-	if (sock_flag(other, SOCK_DEAD) ||
-	    other->sk_shutdown & RCV_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_state_unlock;
-	}
-
-	if (init_scm) {
-		err = maybe_init_creds(&scm, socket, other);
-		if (err)
-			goto err_state_unlock;
-		init_scm = false;
-	}
-
-	skb = skb_peek_tail(&other->sk_receive_queue);
-	if (tail && tail == skb) {
-		skb = newskb;
-	} else if (!skb || !unix_skb_scm_eq(skb, &scm)) {
-		if (newskb) {
-			skb = newskb;
-		} else {
-			tail = skb;
-			goto alloc_skb;
-		}
-	} else if (newskb) {
-		/* this is fast path, we don't necessarily need to
-		 * call to kfree_skb even though with newskb == NULL
-		 * this - does no harm
-		 */
-		consume_skb(newskb);
-		newskb = NULL;
-	}
-
-	if (skb_append_pagefrags(skb, page, offset, size)) {
-		tail = skb;
-		goto alloc_skb;
-	}
-
-	skb->len += size;
-	skb->data_len += size;
-	skb->truesize += size;
-	refcount_add(size, &sk->sk_wmem_alloc);
-
-	if (newskb) {
-		err = unix_scm_to_skb(&scm, skb, false);
-		if (err)
-			goto err_state_unlock;
-		spin_lock(&other->sk_receive_queue.lock);
-		__skb_queue_tail(&other->sk_receive_queue, newskb);
-		spin_unlock(&other->sk_receive_queue.lock);
-	}
-
-	unix_state_unlock(other);
-	mutex_unlock(&unix_sk(other)->iolock);
-
-	other->sk_data_ready(other);
-	scm_destroy(&scm);
-	return size;
-
-err_state_unlock:
-	unix_state_unlock(other);
-err_unlock:
-	mutex_unlock(&unix_sk(other)->iolock);
-err:
-	kfree_skb(newskb);
-	if (send_sigpipe && !(flags & MSG_NOSIGNAL))
-		send_sig(SIGPIPE, current, 0);
-	if (!init_scm)
-		scm_destroy(&scm);
-	return err;
-}
-
 static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
 				  size_t len)
 {
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 19aea7cba26e..d0e476755cdc 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1271,7 +1271,6 @@ static const struct proto_ops vsock_dgram_ops = {
 	.sendmsg = vsock_dgram_sendmsg,
 	.recvmsg = vsock_dgram_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_transport_cancel_pkt(struct vsock_sock *vsk)
@@ -2186,7 +2185,6 @@ static const struct proto_ops vsock_stream_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 	.set_rcvlowat = vsock_set_rcvlowat,
 };
 
@@ -2208,7 +2206,6 @@ static const struct proto_ops vsock_seqpacket_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_create(struct net *net, struct socket *sock,
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5c7ad301d742..0fb5143bec7a 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1757,7 +1757,6 @@ static const struct proto_ops x25_proto_ops = {
 	.sendmsg = x25_sendmsg,
 	.recvmsg = x25_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static struct packet_type x25_packet_type __read_mostly = {
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2ac58b282b5e..eff1f0aaa4b5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -1386,7 +1386,6 @@ static const struct proto_ops xsk_proto_ops = {
 	.sendmsg = xsk_sendmsg,
 	.recvmsg = xsk_recvmsg,
 	.mmap = xsk_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static void xsk_destruct(struct sock *sk)