From patchwork Tue Jun 20 14:53:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110551 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3727034vqr; Tue, 20 Jun 2023 08:00:56 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4x4TIHJaDcZgGJKocMA9eN0ERiPg/4QJUQuXFJVzb8Q7cKQBX1EOTVxXyJtCurtnHO1ouJ X-Received: by 2002:a05:6a00:170e:b0:666:e1f4:5153 with SMTP id h14-20020a056a00170e00b00666e1f45153mr8127026pfc.0.1687273256247; Tue, 20 Jun 2023 08:00:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687273256; cv=none; d=google.com; s=arc-20160816; b=YxcVxSCVigHAavQPWvo+jMbx1eJW4UzyuyK0zfctxxZ/OTqGnTPX+0NP4il+w/5ulk uxlwwAhwidDNmBX2wnQ47CH2tKIkK4bl93vSO8zAla6Oeq2UPHvJ3y1BS9XxPCAq/wJ/ dwtTeg53K1rtDeUlwLWo2ew17n6gR/rY8C+0Nd+Fc+tkT1QCW9DnQMvZ1lfhmatWcd/z u6gMspLpJo0KW4B2Kz/Z1YBsAH5Kyrcs32X1Y/8BW83Z+penZ30GvZJkhNzAF9/sPhkZ wTetBC/K2nvfWoXxo6L1C8dA9CXKdD7biRm7hyY26zoVcZbZfPs2kI+Hvzudq0h+UUHR ordg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Xe+E3y9iF5WoqF9RAoI5mIYI7LZ4lxxn5w1KkKAWiNw=; b=Vh6y3VA1ZWGuwTC89tjFRSYtkzCEDok5FogKo4E7oBv4PmxmtmUGLtumY4eSf7BWXG wfv5ivZQxfm+LCbvj4QEu+42xA62ve9YcaMNaEcQ/v6CknEAdkdq6n7OMjD8/QursKKF O+UsAYLTccPIYk2msuGjyUSnK6cBA5ONATUFMDed0Dg16rl2ZTBtoe9QwteI6DLgTWP5 MnY2blunyh9E+IgGZLInJDdQw3hws/mkCJu8+83alpZ2WXKWMKnT50cRGb15pRfbUPDs r8tetalYNfY5TXjbEd8bNA/r0gkw2FZgx/4PHXCvzrq2YVi2G05555Yt5/f93amz3WuK 1V5g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=ZV3zqbRv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id a187-20020a624dc4000000b0065930ad0643si1807393pfb.79.2023.06.20.08.00.38; Tue, 20 Jun 2023 08:00:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=ZV3zqbRv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232954AbjFTO5V (ORCPT + 99 others); Tue, 20 Jun 2023 10:57:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232353AbjFTO5R (ORCPT ); Tue, 20 Jun 2023 10:57:17 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1832F95 for ; Tue, 20 Jun 2023 07:56:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687272989; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Xe+E3y9iF5WoqF9RAoI5mIYI7LZ4lxxn5w1KkKAWiNw=; b=ZV3zqbRvSAZNZyxoW6WCjZVYszEY9DLDim/UB4rfBC4oJcECgEewwQgrflboC5jq6++E/u I/vtAfEpyuyg283GRiGSDMpoXjhYWWEv7n/gwDBys72T9wpHvfwKn1b/hamOWZ6+E3yL0V Bx2Lbl0RBeZUBVVFJM1SIc6dNZNwL3o= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-601-lKmh20CrMjqnzvPqQo7SgA-1; Tue, 20 Jun 2023 10:56:26 -0400 X-MC-Unique: lKmh20CrMjqnzvPqQo7SgA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 24F888E6CDA; Tue, 20 Jun 2023 14:53:45 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4FEFB2166B26; Tue, 20 Jun 2023 14:53:43 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Menglong Dong Subject: [PATCH net-next v3 01/18] net: Copy slab data for sendmsg(MSG_SPLICE_PAGES) Date: Tue, 20 Jun 2023 15:53:20 +0100 Message-ID: <20230620145338.1300897-2-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769234241798352591?= X-GMAIL-MSGID: =?utf-8?q?1769234241798352591?= If sendmsg() is passed MSG_SPLICE_PAGES and is given a buffer that contains some data that's resident in the slab, copy it rather than returning EIO. This can be made use of by a number of drivers in the kernel, including: iwarp, ceph/rds, dlm, nvme, ocfs2, drdb. It could also be used by iscsi, rxrpc, sunrpc, cifs and probably others. skb_splice_from_iter() is given it's own fragment allocator as page_frag_alloc_align() can't be used because it does no locking to prevent parallel callers from racing. alloc_skb_frag() uses a separate folio for each cpu and locks to the cpu whilst allocating, reenabling cpu migration around folio allocation. This could allocate a whole page instead for each fragment to be copied, as alloc_skb_with_frags() would do instead, but that would waste a lot of space (most of the fragments look like they're going to be small). This allows an entire message that consists of, say, a protocol header or two, a number of pages of data and a protocol footer to be sent using a single call to sock_sendmsg(). The callers could be made to copy the data into fragments before calling sendmsg(), but that then penalises them if MSG_SPLICE_PAGES gets ignored. Signed-off-by: David Howells cc: Alexander Duyck cc: Eric Dumazet cc: "David S. Miller" cc: David Ahern cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: Menglong Dong cc: netdev@vger.kernel.org --- Notes: ver #3) - Remove duplicate decl of skb_splice_from_iter(). ver #2) - Fix parameter to put_cpu_ptr() to have an '&'. include/linux/skbuff.h | 2 + net/core/skbuff.c | 171 ++++++++++++++++++++++++++++++++++++++++- 2 files changed, 170 insertions(+), 3 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 91ed66952580..5f53bd5d375d 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -5037,6 +5037,8 @@ static inline void skb_mark_for_recycle(struct sk_buff *skb) #endif } +void *alloc_skb_frag(size_t fragsz, gfp_t gfp); +void *copy_skb_frag(const void *s, size_t len, gfp_t gfp); ssize_t skb_splice_from_iter(struct sk_buff *skb, struct iov_iter *iter, ssize_t maxsize, gfp_t gfp); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index fee2b1c105fe..d962c93a429d 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -6755,6 +6755,145 @@ nodefer: __kfree_skb(skb); smp_call_function_single_async(cpu, &sd->defer_csd); } +struct skb_splice_frag_cache { + struct folio *folio; + void *virt; + unsigned int offset; + /* we maintain a pagecount bias, so that we dont dirty cache line + * containing page->_refcount every time we allocate a fragment. + */ + unsigned int pagecnt_bias; + bool pfmemalloc; +}; + +static DEFINE_PER_CPU(struct skb_splice_frag_cache, skb_splice_frag_cache); + +/** + * alloc_skb_frag - Allocate a page fragment for using in a socket + * @fragsz: The size of fragment required + * @gfp: Allocation flags + */ +void *alloc_skb_frag(size_t fragsz, gfp_t gfp) +{ + struct skb_splice_frag_cache *cache; + struct folio *folio, *spare = NULL; + size_t offset, fsize; + void *p; + + if (WARN_ON_ONCE(fragsz == 0)) + fragsz = 1; + + cache = get_cpu_ptr(&skb_splice_frag_cache); +reload: + folio = cache->folio; + offset = cache->offset; +try_again: + if (fragsz > offset) + goto insufficient_space; + + /* Make the allocation. */ + cache->pagecnt_bias--; + offset = ALIGN_DOWN(offset - fragsz, SMP_CACHE_BYTES); + cache->offset = offset; + p = cache->virt + offset; + put_cpu_ptr(&skb_splice_frag_cache); + if (spare) + folio_put(spare); + return p; + +insufficient_space: + /* See if we can refurbish the current folio. */ + if (!folio || !folio_ref_sub_and_test(folio, cache->pagecnt_bias)) + goto get_new_folio; + if (unlikely(cache->pfmemalloc)) { + __folio_put(folio); + goto get_new_folio; + } + + fsize = folio_size(folio); + if (unlikely(fragsz > fsize)) + goto frag_too_big; + + /* OK, page count is 0, we can safely set it */ + folio_set_count(folio, PAGE_FRAG_CACHE_MAX_SIZE + 1); + + /* Reset page count bias and offset to start of new frag */ + cache->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; + offset = fsize; + goto try_again; + +get_new_folio: + if (!spare) { + cache->folio = NULL; + put_cpu_ptr(&skb_splice_frag_cache); + +#if PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE + spare = folio_alloc(gfp | __GFP_NOWARN | __GFP_NORETRY | + __GFP_NOMEMALLOC, + PAGE_FRAG_CACHE_MAX_ORDER); + if (!spare) +#endif + spare = folio_alloc(gfp, 0); + if (!spare) + return NULL; + + cache = get_cpu_ptr(&skb_splice_frag_cache); + /* We may now be on a different cpu and/or someone else may + * have refilled it + */ + cache->pfmemalloc = folio_is_pfmemalloc(spare); + if (cache->folio) + goto reload; + } + + cache->folio = spare; + cache->virt = folio_address(spare); + folio = spare; + spare = NULL; + + /* Even if we own the page, we do not use atomic_set(). This would + * break get_page_unless_zero() users. + */ + folio_ref_add(folio, PAGE_FRAG_CACHE_MAX_SIZE); + + /* Reset page count bias and offset to start of new frag */ + cache->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; + offset = folio_size(folio); + goto try_again; + +frag_too_big: + /* The caller is trying to allocate a fragment with fragsz > PAGE_SIZE + * but the cache isn't big enough to satisfy the request, this may + * happen in low memory conditions. We don't release the cache page + * because it could make memory pressure worse so we simply return NULL + * here. + */ + cache->offset = offset; + put_cpu_ptr(&skb_splice_frag_cache); + if (spare) + folio_put(spare); + return NULL; +} +EXPORT_SYMBOL(alloc_skb_frag); + +/** + * copy_skb_frag - Copy data into a page fragment. + * @s: The data to copy + * @len: The size of the data + * @gfp: Allocation flags + */ +void *copy_skb_frag(const void *s, size_t len, gfp_t gfp) +{ + void *p; + + p = alloc_skb_frag(len, gfp); + if (!p) + return NULL; + + return memcpy(p, s, len); +} +EXPORT_SYMBOL(copy_skb_frag); + static void skb_splice_csum_page(struct sk_buff *skb, struct page *page, size_t offset, size_t len) { @@ -6808,17 +6947,43 @@ ssize_t skb_splice_from_iter(struct sk_buff *skb, struct iov_iter *iter, break; } + if (space == 0 && + !skb_can_coalesce(skb, skb_shinfo(skb)->nr_frags, + pages[0], off)) { + iov_iter_revert(iter, len); + break; + } + i = 0; do { struct page *page = pages[i++]; size_t part = min_t(size_t, PAGE_SIZE - off, len); - - ret = -EIO; - if (WARN_ON_ONCE(!sendpage_ok(page))) + bool put = false; + + if (PageSlab(page)) { + const void *p; + void *q; + + p = kmap_local_page(page); + q = copy_skb_frag(p + off, part, gfp); + kunmap_local(p); + if (!q) { + iov_iter_revert(iter, len); + ret = -ENOMEM; + goto out; + } + page = virt_to_page(q); + off = offset_in_page(q); + put = true; + } else if (WARN_ON_ONCE(!sendpage_ok(page))) { + ret = -EIO; goto out; + } ret = skb_append_pagefrags(skb, page, off, part, frag_limit); + if (put) + put_page(page); if (ret < 0) { iov_iter_revert(iter, len); goto out; From patchwork Tue Jun 20 14:53:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110559 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3736979vqr; Tue, 20 Jun 2023 08:12:14 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4Vjx0PhvnYNVbzYHdKowKJ7+PHSnAjAsvR5/aW1nYrd//hTxLgTKpXvXAK7MJhwoUoY8Ou X-Received: by 2002:a05:6a20:4299:b0:121:d74e:8b46 with SMTP id o25-20020a056a20429900b00121d74e8b46mr4057363pzj.55.1687273934236; Tue, 20 Jun 2023 08:12:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687273934; cv=none; d=google.com; s=arc-20160816; b=v0DOl1z2ObySok5z0uYL0jAy1IU1OfVvpHXZJLcIC+wphBuZxn93tB2sMhVZ9ORI2I MxUxPm9d0KnE0CqV0l6+UPXm6ccip4pQcnNemogq3++EjF5ttK9mYScs9+OtaXoZYiCm OUEUfBfy0r+4vlpIxBAs6CHHCuZPvxA44gLwgF++i0dC9Nw+yTfeeVjnSaYYQ1wbKC0z Rigkx2trWRu9lAGf1E1lgLpH2z5zECUCGa+7zjCUmwihSh9K7KuVROj40Ato8XQc0PtZ xEOO58NGX/2VmdZC5+xrCQFgepT61ev6uw2XfzgwaxExVbbDjqjYa5LgEa2bOVnU6YwC mtCA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=+u30ziD9KgNs/SO68HcOpUwCa+4S9m3KYto4qjbvl9U=; b=bxs9ydjd4AXV8nnk8n91nekjrRitnpt11e0pjYpHehHls3PLaHVLuBpQiWdyRegc++ wrzLLbUcjCodyLJMl0EG5hspjJc9VgBN+L7VBngXRpQg6z5eMzxSQWrka1RSUHHQiq7M lZyHODtNN8+2YKjFsqGc0BBwz7iVramxKbPnQ5e5TlN20bBT+dJsbIRcj4EFaZy91ODV 7Yp6Qr+46C75uKIHE6rUH1mOJUb7ht4AaXtl5raJbY0SNUpDPXG7xsbLdvdR6m2m+saU 3bwb0D2ZirNhuIzwJxwsl2B8eLnNHZ52CmtFQVsL2I7wu0O/s6fmrX9wfbEDdKny1P6E cdHw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=D1NBsNoz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id n7-20020a63b447000000b00534866eb2c2si1799363pgu.835.2023.06.20.08.12.00; Tue, 20 Jun 2023 08:12:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=D1NBsNoz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231968AbjFTOzs (ORCPT + 99 others); Tue, 20 Jun 2023 10:55:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58356 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231879AbjFTOzr (ORCPT ); Tue, 20 Jun 2023 10:55:47 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E0EE31709 for ; Tue, 20 Jun 2023 07:54:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687272898; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+u30ziD9KgNs/SO68HcOpUwCa+4S9m3KYto4qjbvl9U=; b=D1NBsNozghwelIVfKn8DQ+4969LMyK50obJCrJA4oCzEl5TaoXcFXpGiC7QaUavp/+462j UPo3x4u0vsi7arMS1CAWQdHgdoFfRkafLdi4fSFtcxU5wxOOXR2EJymqpk9YwLc74kfHi0 ErOnUQMBTaocUMjNvQOcJVN/G7sVUIw= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-43-AZi11j61MEqhDouArFBQJA-1; Tue, 20 Jun 2023 10:54:52 -0400 X-MC-Unique: AZi11j61MEqhDouArFBQJA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 341D238117F0; Tue, 20 Jun 2023 14:53:47 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id AD2FE40C6CD2; Tue, 20 Jun 2023 14:53:45 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Menglong Dong Subject: [PATCH net-next v3 02/18] net: Display info about MSG_SPLICE_PAGES memory handling in proc Date: Tue, 20 Jun 2023 15:53:21 +0100 Message-ID: <20230620145338.1300897-3-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769234952481963331?= X-GMAIL-MSGID: =?utf-8?q?1769234952481963331?= Display information about the memory handling MSG_SPLICE_PAGES does to copy slabbed data into page fragments. For each CPU that has a cached folio, it displays the folio pfn, the offset pointer within the folio and the size of the folio. It also displays the number of pages refurbished and the number of pages replaced. Signed-off-by: David Howells cc: Alexander Duyck cc: Eric Dumazet cc: "David S. Miller" cc: David Ahern cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: Menglong Dong cc: netdev@vger.kernel.org --- net/core/skbuff.c | 42 +++++++++++++++++++++++++++++++++++++++--- 1 file changed, 39 insertions(+), 3 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index d962c93a429d..36605510a76d 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -83,6 +83,7 @@ #include #include #include +#include #include "dev.h" #include "sock_destructor.h" @@ -6758,6 +6759,7 @@ nodefer: __kfree_skb(skb); struct skb_splice_frag_cache { struct folio *folio; void *virt; + unsigned int fsize; unsigned int offset; /* we maintain a pagecount bias, so that we dont dirty cache line * containing page->_refcount every time we allocate a fragment. @@ -6767,6 +6769,26 @@ struct skb_splice_frag_cache { }; static DEFINE_PER_CPU(struct skb_splice_frag_cache, skb_splice_frag_cache); +static atomic_t skb_splice_frag_replaced, skb_splice_frag_refurbished; + +static int skb_splice_show(struct seq_file *m, void *data) +{ + int cpu; + + seq_printf(m, "refurb=%u repl=%u\n", + atomic_read(&skb_splice_frag_refurbished), + atomic_read(&skb_splice_frag_replaced)); + + for_each_possible_cpu(cpu) { + const struct skb_splice_frag_cache *cache = + per_cpu_ptr(&skb_splice_frag_cache, cpu); + + seq_printf(m, "[%u] %lx %u/%u\n", + cpu, folio_pfn(cache->folio), + cache->offset, cache->fsize); + } + return 0; +} /** * alloc_skb_frag - Allocate a page fragment for using in a socket @@ -6803,17 +6825,21 @@ void *alloc_skb_frag(size_t fragsz, gfp_t gfp) insufficient_space: /* See if we can refurbish the current folio. */ - if (!folio || !folio_ref_sub_and_test(folio, cache->pagecnt_bias)) + if (!folio) goto get_new_folio; + if (!folio_ref_sub_and_test(folio, cache->pagecnt_bias)) + goto replace_folio; if (unlikely(cache->pfmemalloc)) { __folio_put(folio); - goto get_new_folio; + goto replace_folio; } fsize = folio_size(folio); if (unlikely(fragsz > fsize)) goto frag_too_big; + atomic_inc(&skb_splice_frag_refurbished); + /* OK, page count is 0, we can safely set it */ folio_set_count(folio, PAGE_FRAG_CACHE_MAX_SIZE + 1); @@ -6822,6 +6848,8 @@ void *alloc_skb_frag(size_t fragsz, gfp_t gfp) offset = fsize; goto try_again; +replace_folio: + atomic_inc(&skb_splice_frag_replaced); get_new_folio: if (!spare) { cache->folio = NULL; @@ -6848,6 +6876,7 @@ void *alloc_skb_frag(size_t fragsz, gfp_t gfp) cache->folio = spare; cache->virt = folio_address(spare); + cache->fsize = folio_size(spare); folio = spare; spare = NULL; @@ -6858,7 +6887,7 @@ void *alloc_skb_frag(size_t fragsz, gfp_t gfp) /* Reset page count bias and offset to start of new frag */ cache->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - offset = folio_size(folio); + offset = cache->fsize; goto try_again; frag_too_big: @@ -7007,3 +7036,10 @@ ssize_t skb_splice_from_iter(struct sk_buff *skb, struct iov_iter *iter, return spliced ?: ret; } EXPORT_SYMBOL(skb_splice_from_iter); + +static int skb_splice_init(void) +{ + proc_create_single("pagefrags", S_IFREG | 0444, NULL, &skb_splice_show); + return 0; +} +late_initcall(skb_splice_init); From patchwork Tue Jun 20 14:53:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110563 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3738640vqr; Tue, 20 Jun 2023 08:14:22 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4Gny+ppyIhO2KlMsc9b2aY4IcACWwhHUYwuhPoc1G5itFhb90L3Dw+9dkBvX0/lgOlaxei X-Received: by 2002:a17:902:ed44:b0:1b1:9218:6bf9 with SMTP id y4-20020a170902ed4400b001b192186bf9mr9970927plb.43.1687274061879; Tue, 20 Jun 2023 08:14:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687274061; cv=none; d=google.com; s=arc-20160816; b=RNVoRKJfh8EYd6mY56NUJg3HkdsC1INC4OAYzhpMT8NWMYnYsdCI4K+mUulejbD5vt Zuq1UBBRBvoV7npJOs2vMYEiFkZJbdFGcPb9zFld9P912rl745v1+lSlBnOp1I+xrAq0 d0TYhMllzdpLpFpqeskkSLBFDzXYAt02OblSNbZ/+heZmJlhMtCs4XvGLmkppDafOo+V p7hDqShCHeCZi2HC9ZF8vBcgOKyP3OMxo9V97ddvN4BO7wYkhdfu/Ph8xtK195yTCdGY YMipP0RnwgF+p04/bI27UPyUJFgvoiqAOvGIbV4MP/7Ig542H7fBUCUkL2yDe2pGI1JD 9RBw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=8t7wtWM3aEB5I+3bAZDQAyKBMtXipfmjmHgxZgR4FXg=; b=bvnkw+HK4VbLkPaCdowU6e8dIeLyawtTYVB/sKNRC1tE6K3gn8x26AtxK2FT+SJmjn C4EURpI+cF3gHXG/fELSn/aDJ0jvN99oLMEsLhnKBzsdUonm145vTp9TW450XMs8LcWW Jh/pIGRLu1qpq9M0y2BgBZuQh+MSBMUA7W0Veq3TC/S2+htUXHZk+YcwyOz1jWZv5u8E lyjCjhJFS4PrWFBfxHonaKFnVNHxNXuyaA27qyLZnTo+AWLSeQ3Ry4JOseVVE63FxSau wD6npigUiQ8ffJqDUnug5wBvTRwHRWqMpmvqV3vHDlXE3JjRMRkny6eY2Cx5oZCiQV8O /jjg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=gfKiCLSm; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e18-20020a170902f11200b001ab29399c72si1986470plb.502.2023.06.20.08.14.08; Tue, 20 Jun 2023 08:14:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=gfKiCLSm; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233250AbjFTO61 (ORCPT + 99 others); Tue, 20 Jun 2023 10:58:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60184 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233206AbjFTO6S (ORCPT ); Tue, 20 Jun 2023 10:58:18 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C5365172E for ; Tue, 20 Jun 2023 07:57:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273032; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8t7wtWM3aEB5I+3bAZDQAyKBMtXipfmjmHgxZgR4FXg=; b=gfKiCLSmEqk9W5N3iFN/OSrhONTDRBxSLNP9F+1W98WUPdxWSwo2WLeAnseYnBLTwbaCUp KH8oN9cSbXbdQB7S+ITl/fUmTj2qDPP7gudsfE+6HQnxVQwZmkcttLMXR2Vn1MJDrvZ7o0 hhMDWkrX0pFLWAohh53AXfpTWl+Qn+c= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-189-70MSteDdOnyTFFxz8b5FjQ-1; Tue, 20 Jun 2023 10:57:10 -0400 X-MC-Unique: 70MSteDdOnyTFFxz8b5FjQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A140B89CA6F; Tue, 20 Jun 2023 14:53:50 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id D2D2440462B8; Tue, 20 Jun 2023 14:53:47 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, John Fastabend , Jakub Sitnicki , Karsten Graul , Wenjia Zhang , Jan Karcher , "D. Wythe" , Tony Lu , Wen Gu , Boris Pismenny , Steffen Klassert , Herbert Xu , bpf@vger.kernel.org, linux-s390@vger.kernel.org Subject: [PATCH net-next v3 03/18] tcp_bpf, smc, tls, espintcp: Reduce MSG_SENDPAGE_NOTLAST usage Date: Tue, 20 Jun 2023 15:53:22 +0100 Message-ID: <20230620145338.1300897-4-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769235087164707828?= X-GMAIL-MSGID: =?utf-8?q?1769235087164707828?= As MSG_SENDPAGE_NOTLAST is being phased out along with sendpage(), don't use it further in than the sendpage methods, but rather translate it to MSG_MORE and use that instead. Signed-off-by: David Howells cc: Willem de Bruijn cc: John Fastabend cc: Jakub Sitnicki cc: Eric Dumazet cc: "David S. Miller" cc: David Ahern cc: Jakub Kicinski cc: Paolo Abeni cc: Karsten Graul cc: Wenjia Zhang cc: Jan Karcher cc: "D. Wythe" cc: Tony Lu cc: Wen Gu cc: Boris Pismenny cc: Steffen Klassert cc: Herbert Xu cc: netdev@vger.kernel.org cc: bpf@vger.kernel.org cc: linux-s390@vger.kernel.org --- Notes: ver #3) - In tcp_bpf, reset msg_flags on each iteration to clear MSG_MORE. - In tcp_bpf, set MSG_MORE if there's more data in the sk_msg. net/ipv4/tcp_bpf.c | 5 +++-- net/smc/smc_tx.c | 6 ++++-- net/tls/tls_device.c | 4 ++-- net/xfrm/espintcp.c | 10 ++++++---- 4 files changed, 15 insertions(+), 10 deletions(-) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 5a84053ac62b..31d6005cea9b 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -88,9 +88,9 @@ static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock, static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes, int flags, bool uncharge) { + struct msghdr msghdr = {}; bool apply = apply_bytes; struct scatterlist *sge; - struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, }; struct page *page; int size, ret = 0; u32 off; @@ -107,11 +107,12 @@ static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes, tcp_rate_check_app_limited(sk); retry: + msghdr.msg_flags = flags | MSG_SPLICE_PAGES; has_tx_ulp = tls_sw_has_ctx_tx(sk); if (has_tx_ulp) msghdr.msg_flags |= MSG_SENDPAGE_NOPOLICY; - if (flags & MSG_SENDPAGE_NOTLAST) + if (size < sge->length && msg->sg.start != msg->sg.end) msghdr.msg_flags |= MSG_MORE; bvec_set_page(&bvec, page, size, off); diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c index 45128443f1f1..9b9e0a190734 100644 --- a/net/smc/smc_tx.c +++ b/net/smc/smc_tx.c @@ -168,8 +168,7 @@ static bool smc_tx_should_cork(struct smc_sock *smc, struct msghdr *msg) * should known how/when to uncork it. */ if ((msg->msg_flags & MSG_MORE || - smc_tx_is_corked(smc) || - msg->msg_flags & MSG_SENDPAGE_NOTLAST) && + smc_tx_is_corked(smc)) && atomic_read(&conn->sndbuf_space)) return true; @@ -306,6 +305,9 @@ int smc_tx_sendpage(struct smc_sock *smc, struct page *page, int offset, struct kvec iov; int rc; + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; + iov.iov_base = kaddr + offset; iov.iov_len = size; iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, &iov, 1, size); diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index b82770f68807..975299d7213b 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -449,7 +449,7 @@ static int tls_push_data(struct sock *sk, return -sk->sk_err; flags |= MSG_SENDPAGE_DECRYPTED; - tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST; + tls_push_record_flags = flags | MSG_MORE; timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); if (tls_is_partially_sent_record(tls_ctx)) { @@ -532,7 +532,7 @@ static int tls_push_data(struct sock *sk, if (!size) { last_record: tls_push_record_flags = flags; - if (flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE)) { + if (flags & MSG_MORE) { more = true; break; } diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c index 3504925babdb..d3b3f9e720b3 100644 --- a/net/xfrm/espintcp.c +++ b/net/xfrm/espintcp.c @@ -205,13 +205,15 @@ static int espintcp_sendskb_locked(struct sock *sk, struct espintcp_msg *emsg, static int espintcp_sendskmsg_locked(struct sock *sk, struct espintcp_msg *emsg, int flags) { - struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, }; + struct msghdr msghdr = { + .msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE, + }; struct sk_msg *skmsg = &emsg->skmsg; + bool more = flags & MSG_MORE; struct scatterlist *sg; int done = 0; int ret; - msghdr.msg_flags |= MSG_SENDPAGE_NOTLAST; sg = &skmsg->sg.data[skmsg->sg.start]; do { struct bio_vec bvec; @@ -221,8 +223,8 @@ static int espintcp_sendskmsg_locked(struct sock *sk, emsg->offset = 0; - if (sg_is_last(sg)) - msghdr.msg_flags &= ~MSG_SENDPAGE_NOTLAST; + if (sg_is_last(sg) && !more) + msghdr.msg_flags &= ~MSG_MORE; p = sg_page(sg); retry: From patchwork Tue Jun 20 14:53:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110550 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3727023vqr; Tue, 20 Jun 2023 08:00:56 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4nJ87Wl0cRqRvKJSj2aG7GPb0LqH+P4HI6PuChpPOqepmrH0QRrPhAcDzxCf/MnZLUXkok X-Received: by 2002:a05:6808:1147:b0:3a0:34af:3fb with SMTP id u7-20020a056808114700b003a034af03fbmr4495910oiu.58.1687273255406; Tue, 20 Jun 2023 08:00:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687273255; cv=none; d=google.com; s=arc-20160816; b=PxfwhzIAG9tzdnRB2q1eDLRRHomexq7Ao+94Yf84oxh/Sj4IH/EKgss+w6UB7jv+PU oRmJ74jj+3sxyJNR9Zo6rohwXBbMPDRPGIOkjpMYd8y9szTCFiemqs0EQR2U4Me9GH1Z bV760EIE3pF81mJJtYmeZyUlxlXPMqxLuegF5HoGbwZ+SqNA45m366p1I1U76oqF/qtZ j5Y3h4SA1KLPWXPezyFQYNmx4lZyRn/DPQ1UHWBdV+NGBee+TzbJuKcpPrUtX0HNASHC 6J84qSlVsj6zNp7ZmwixNDcKZEkMYCBdcu2IokwQ1YGzyMMthWMq5IpP8z/heoOzHB2M GoLQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=VjthvFXeQGZeDxQ7unAKOEBDdmvRqvLtJxvDci7YFS0=; b=wtgavmdNVK1W2o1/Eev5GSMoJ1TQElzdrX9W9NyfbPpRogupHMN5ugVVz6ukSmas8/ ANqxemEdxVOGoDFI4DUglYOrNCoXIT2Ms3p/LKsJGNboCd+fHl1HBZ6OnauEqPuuutD9 v2ZPTmEtGjnAqn5XwM2fLFuQAlKBTORQaRCMlTEj5FYGm00/YhUtGp/iVztHofg4/GYW 0HiD7c1MJVor5SA8gz6CmQeADssW05oMmk+68ZLzlcAT2qkFqJImNO7MbVV2Wf/gIKnc D5yIQ4ouZmIcDWLQALICKDV9LdDyOeFTgUUX8b7YYpAUUjiiQNNTJifgqGQzKH95z/o9 EmqA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TOmsSlGn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ca24-20020a17090af31800b0024e059b55adsi2089512pjb.116.2023.06.20.08.00.37; Tue, 20 Jun 2023 08:00:55 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TOmsSlGn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232778AbjFTO5N (ORCPT + 99 others); Tue, 20 Jun 2023 10:57:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59122 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232579AbjFTO5M (ORCPT ); Tue, 20 Jun 2023 10:57:12 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 62B4E1A3 for ; Tue, 20 Jun 2023 07:56:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687272986; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VjthvFXeQGZeDxQ7unAKOEBDdmvRqvLtJxvDci7YFS0=; b=TOmsSlGnLNz/nFz8AzqlrPK63zUqNPxEsxijZGxJdT/LnFsENCS71fAPwqfNrYN4uVEXAP kEYqOZEzhoaaP4YyhxYPD/Dh/vFneUzjC8GsGscB1+qP3lai8EPZx2QdSMy9JgGNb0wnRg DwkPHFTexT1sJEHEy23Ioq/4gaFm9UU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-414-iKtTzmMANW2P_f63qnZVtw-1; Tue, 20 Jun 2023 10:56:20 -0400 X-MC-Unique: iKtTzmMANW2P_f63qnZVtw-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 36AF58A6967; Tue, 20 Jun 2023 14:53:53 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 35FACC1ED97; Tue, 20 Jun 2023 14:53:51 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Bernard Metzler , Tom Talpey , Jason Gunthorpe , Leon Romanovsky , linux-rdma@vger.kernel.org Subject: [PATCH net-next v3 04/18] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit Date: Tue, 20 Jun 2023 15:53:23 +0100 Message-ID: <20230620145338.1300897-5-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769234240782475409?= X-GMAIL-MSGID: =?utf-8?q?1769234240782475409?= When transmitting data, call down into TCP using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing several sendmsg and sendpage calls to transmit header, data pages and trailer. To make this work, the data is assembled in a bio_vec array and attached to a BVEC-type iterator. The header and trailer (if present) are copied into page fragments that can be freed with put_page(). Signed-off-by: David Howells cc: Bernard Metzler cc: Tom Talpey cc: Jason Gunthorpe cc: Leon Romanovsky cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-rdma@vger.kernel.org cc: netdev@vger.kernel.org Reviewed-by: Bernard Metzler --- Notes: ver #2) - Wrap lines at 80. drivers/infiniband/sw/siw/siw_qp_tx.c | 231 ++++---------------------- 1 file changed, 36 insertions(+), 195 deletions(-) diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c index ffb16beb6c30..2584f9da0dd8 100644 --- a/drivers/infiniband/sw/siw/siw_qp_tx.c +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c @@ -311,114 +311,8 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s, return rv; } -/* - * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES. - * - * Using sendpage to push page by page appears to be less efficient - * than using sendmsg, even if data are copied. - * - * A general performance limitation might be the extra four bytes - * trailer checksum segment to be pushed after user data. - */ -static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset, - size_t size) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = (MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST | - MSG_SPLICE_PAGES), - }; - struct sock *sk = s->sk; - int i = 0, rv = 0, sent = 0; - - while (size) { - size_t bytes = min_t(size_t, PAGE_SIZE - offset, size); - - if (size + offset <= PAGE_SIZE) - msg.msg_flags &= ~MSG_SENDPAGE_NOTLAST; - - tcp_rate_check_app_limited(sk); - bvec_set_page(&bvec, page[i], bytes, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - -try_page_again: - lock_sock(sk); - rv = tcp_sendmsg_locked(sk, &msg, size); - release_sock(sk); - - if (rv > 0) { - size -= rv; - sent += rv; - if (rv != bytes) { - offset += rv; - bytes -= rv; - goto try_page_again; - } - offset = 0; - } else { - if (rv == -EAGAIN || rv == 0) - break; - return rv; - } - i++; - } - return sent; -} - -/* - * siw_0copy_tx() - * - * Pushes list of pages to TCP socket. If pages from multiple - * SGE's, all referenced pages of each SGE are pushed in one - * shot. - */ -static int siw_0copy_tx(struct socket *s, struct page **page, - struct siw_sge *sge, unsigned int offset, - unsigned int size) -{ - int i = 0, sent = 0, rv; - int sge_bytes = min(sge->length - offset, size); - - offset = (sge->laddr + offset) & ~PAGE_MASK; - - while (sent != size) { - rv = siw_tcp_sendpages(s, &page[i], offset, sge_bytes); - if (rv >= 0) { - sent += rv; - if (size == sent || sge_bytes > rv) - break; - - i += PAGE_ALIGN(sge_bytes + offset) >> PAGE_SHIFT; - sge++; - sge_bytes = min(sge->length, size - sent); - offset = sge->laddr & ~PAGE_MASK; - } else { - sent = rv; - break; - } - } - return sent; -} - #define MAX_TRAILER (MPA_CRC_SIZE + 4) -static void siw_unmap_pages(struct kvec *iov, unsigned long kmap_mask, int len) -{ - int i; - - /* - * Work backwards through the array to honor the kmap_local_page() - * ordering requirements. - */ - for (i = (len-1); i >= 0; i--) { - if (kmap_mask & BIT(i)) { - unsigned long addr = (unsigned long)iov[i].iov_base; - - kunmap_local((void *)(addr & PAGE_MASK)); - } - } -} - /* * siw_tx_hdt() tries to push a complete packet to TCP where all * packet fragments are referenced by the elements of one iovec. @@ -438,30 +332,21 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) { struct siw_wqe *wqe = &c_tx->wqe_active; struct siw_sge *sge = &wqe->sqe.sge[c_tx->sge_idx]; - struct kvec iov[MAX_ARRAY]; - struct page *page_array[MAX_ARRAY]; + struct bio_vec bvec[MAX_ARRAY]; struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_EOR }; + void *trl; int seg = 0, do_crc = c_tx->do_crc, is_kva = 0, rv; unsigned int data_len = c_tx->bytes_unsent, hdr_len = 0, trl_len = 0, sge_off = c_tx->sge_off, sge_idx = c_tx->sge_idx, pbl_idx = c_tx->pbl_idx; - unsigned long kmap_mask = 0L; if (c_tx->state == SIW_SEND_HDR) { - if (c_tx->use_sendpage) { - rv = siw_tx_ctrl(c_tx, s, MSG_DONTWAIT | MSG_MORE); - if (rv) - goto done; + void *hdr = &c_tx->pkt.ctrl + c_tx->ctrl_sent; - c_tx->state = SIW_SEND_DATA; - } else { - iov[0].iov_base = - (char *)&c_tx->pkt.ctrl + c_tx->ctrl_sent; - iov[0].iov_len = hdr_len = - c_tx->ctrl_len - c_tx->ctrl_sent; - seg = 1; - } + hdr_len = c_tx->ctrl_len - c_tx->ctrl_sent; + bvec_set_virt(&bvec[0], hdr, hdr_len); + seg = 1; } wqe->processed += data_len; @@ -477,28 +362,9 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) } else { is_kva = 1; } - if (is_kva && !c_tx->use_sendpage) { - /* - * tx from kernel virtual address: either inline data - * or memory region with assigned kernel buffer - */ - iov[seg].iov_base = - ib_virt_dma_to_ptr(sge->laddr + sge_off); - iov[seg].iov_len = sge_len; - - if (do_crc) - crypto_shash_update(c_tx->mpa_crc_hd, - iov[seg].iov_base, - sge_len); - sge_off += sge_len; - data_len -= sge_len; - seg++; - goto sge_done; - } while (sge_len) { size_t plen = min((int)PAGE_SIZE - fp_off, sge_len); - void *kaddr; if (!is_kva) { struct page *p; @@ -511,33 +377,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) p = siw_get_upage(mem->umem, sge->laddr + sge_off); if (unlikely(!p)) { - siw_unmap_pages(iov, kmap_mask, seg); wqe->processed -= c_tx->bytes_unsent; rv = -EFAULT; goto done_crc; } - page_array[seg] = p; - - if (!c_tx->use_sendpage) { - void *kaddr = kmap_local_page(p); - - /* Remember for later kunmap() */ - kmap_mask |= BIT(seg); - iov[seg].iov_base = kaddr + fp_off; - iov[seg].iov_len = plen; - - if (do_crc) - crypto_shash_update( - c_tx->mpa_crc_hd, - iov[seg].iov_base, - plen); - } else if (do_crc) { - kaddr = kmap_local_page(p); - crypto_shash_update(c_tx->mpa_crc_hd, - kaddr + fp_off, - plen); - kunmap_local(kaddr); - } + + bvec_set_page(&bvec[seg], p, plen, fp_off); } else { /* * Cast to an uintptr_t to preserve all 64 bits @@ -545,12 +390,17 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) */ u64 va = sge->laddr + sge_off; - page_array[seg] = ib_virt_dma_to_page(va); - if (do_crc) - crypto_shash_update( - c_tx->mpa_crc_hd, - ib_virt_dma_to_ptr(va), - plen); + bvec_set_virt(&bvec[seg], + ib_virt_dma_to_ptr(va), plen); + } + + if (do_crc) { + void *kaddr = + kmap_local_page(bvec[seg].bv_page); + crypto_shash_update(c_tx->mpa_crc_hd, + kaddr + bvec[seg].bv_offset, + bvec[seg].bv_len); + kunmap_local(kaddr); } sge_len -= plen; @@ -560,13 +410,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) if (++seg >= (int)MAX_ARRAY) { siw_dbg_qp(tx_qp(c_tx), "to many fragments\n"); - siw_unmap_pages(iov, kmap_mask, seg-1); wqe->processed -= c_tx->bytes_unsent; rv = -EMSGSIZE; goto done_crc; } } -sge_done: + /* Update SGE variables at end of SGE */ if (sge_off == sge->length && (data_len != 0 || wqe->processed < wqe->bytes)) { @@ -575,15 +424,8 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) sge_off = 0; } } - /* trailer */ - if (likely(c_tx->state != SIW_SEND_TRAILER)) { - iov[seg].iov_base = &c_tx->trailer.pad[4 - c_tx->pad]; - iov[seg].iov_len = trl_len = MAX_TRAILER - (4 - c_tx->pad); - } else { - iov[seg].iov_base = &c_tx->trailer.pad[c_tx->ctrl_sent]; - iov[seg].iov_len = trl_len = MAX_TRAILER - c_tx->ctrl_sent; - } + /* Set the CRC in the trailer */ if (c_tx->pad) { *(u32 *)c_tx->trailer.pad = 0; if (do_crc) @@ -596,23 +438,23 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) else if (do_crc) crypto_shash_final(c_tx->mpa_crc_hd, (u8 *)&c_tx->trailer.crc); - data_len = c_tx->bytes_unsent; - - if (c_tx->use_sendpage) { - rv = siw_0copy_tx(s, page_array, &wqe->sqe.sge[c_tx->sge_idx], - c_tx->sge_off, data_len); - if (rv == data_len) { - rv = kernel_sendmsg(s, &msg, &iov[seg], 1, trl_len); - if (rv > 0) - rv += data_len; - else - rv = data_len; - } + /* Copy the trailer and add it to the output list */ + if (likely(c_tx->state != SIW_SEND_TRAILER)) { + trl = &c_tx->trailer.pad[4 - c_tx->pad]; + trl_len = MAX_TRAILER - (4 - c_tx->pad); } else { - rv = kernel_sendmsg(s, &msg, iov, seg + 1, - hdr_len + data_len + trl_len); - siw_unmap_pages(iov, kmap_mask, seg); + trl = &c_tx->trailer.pad[c_tx->ctrl_sent]; + trl_len = MAX_TRAILER - c_tx->ctrl_sent; } + bvec_set_virt(&bvec[seg], trl, trl_len); + + data_len = c_tx->bytes_unsent; + + if (c_tx->use_sendpage) + msg.msg_flags |= MSG_SPLICE_PAGES; + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, seg + 1, + hdr_len + data_len + trl_len); + rv = sock_sendmsg(s, &msg); if (rv < (int)hdr_len) { /* Not even complete hdr pushed or negative rv */ wqe->processed -= data_len; @@ -673,7 +515,6 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) } done_crc: c_tx->do_crc = 0; -done: return rv; } From patchwork Tue Jun 20 14:53:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110566 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3739687vqr; Tue, 20 Jun 2023 08:15:41 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5ejoNdcpmI+xMHx2YQ1SXbEgAbRW8t9QGS1Lsd4RX484XUMk/lPb+wUV5mbKIRQatUBTk2 X-Received: by 2002:a05:6a20:7495:b0:11f:33da:56ec with SMTP id p21-20020a056a20749500b0011f33da56ecmr14589643pzd.27.1687274140648; Tue, 20 Jun 2023 08:15:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687274140; cv=none; d=google.com; s=arc-20160816; b=KTgLUizC7RspG5YKX60ZnwdKoiXdGutZ+VPm18r7TNnvucVRLax2kPjx/ANVc6rouv Y3PRp3Wr48A8wBuCuphUsJZMMdNkuz9hsNSrz57AMUrl+9OtwkL/leBsLpbv3JMKVvIq WX5R3GqbJtvzIHTupxmkTOGw8MqH5risT5uBC2tOjv+qwFuIAUbEs/pqdxzm4/Zh5WkB iZDS/q502qtdReNmP7Km3T6YDeN52Gkp+h7IL/e+nTVWNSjtAD3fwGe7ejUzQzbgKlfo IbRttmSwbbiDJkHcwe37vbUvuKoBW40pdZaBHhNa6IQIABivUkbF4XjQ1M2zl2cwyig0 UkTA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=1DmsLXNf6UEopicQGVGHzvB3R+Io/k4xNBk1zidfJPQ=; b=ssRt6ImqMMr0b8fWI2HXdaghH7dB0hxrc7w4Tn07DLLJodqbDXe6NbFrF/A2a9QRWO ceg7JHMeg1Sd76Mb39f9Lt8uk4CsMTf9r2yfZpz4ELIkkhdrMd1PUrl56+72LIv/ruZ3 td9+lYqnMju7NDc9/GBA6w5jG+SstjvnAG+GEHrJ0r2HMSQrNFP72jQJ4eXBlCUInb+v p3z/xJxOziT0ZfShXprPHydaxFkZU3GFommtU9uYNBnSnbAeEC+s0xFjS2V9Kwuz72sf lxa3oLW/2LJcRxUNxcXm92t5EIfCvGISoNwZABOnbCwEOao9vqVs+Sy2bROAKRqehxfe 6D/Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=SWqBeLrh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x25-20020aa79579000000b006590bb60328si1898350pfq.173.2023.06.20.08.15.27; Tue, 20 Jun 2023 08:15:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=SWqBeLrh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233185AbjFTO42 (ORCPT + 99 others); Tue, 20 Jun 2023 10:56:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58466 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232949AbjFTO4Y (ORCPT ); Tue, 20 Jun 2023 10:56:24 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B05F01A4 for ; Tue, 20 Jun 2023 07:55:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687272939; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1DmsLXNf6UEopicQGVGHzvB3R+Io/k4xNBk1zidfJPQ=; b=SWqBeLrhy/gQnScPumoceKmq5ZZGLq+VcoMS9pGMwb+cV+qfLjBXRG3rUUuUnm8sMwl739 xB+bbv25vhmvGbb9JiFH/kxXcWmvS9hBOL64dEU0xzVrXD8U/DSBQ/VoSE0CD/RuwRPrBD kq/KRn9KfV1hzwh9TjeWlve6vNaVhGw= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-644-q3ToHJJIOfyekEbhjCK1ZA-1; Tue, 20 Jun 2023 10:55:36 -0400 X-MC-Unique: q3ToHJJIOfyekEbhjCK1ZA-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A0FD23814975; Tue, 20 Jun 2023 14:53:55 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id D552B492C13; Tue, 20 Jun 2023 14:53:53 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Ilya Dryomov , Xiubo Li , Jeff Layton , ceph-devel@vger.kernel.org Subject: [PATCH net-next v3 05/18] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Tue, 20 Jun 2023 15:53:24 +0100 Message-ID: <20230620145338.1300897-6-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769235168877406040?= X-GMAIL-MSGID: =?utf-8?q?1769235168877406040?= Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when transmitting data. For the moment, this can only transmit one page at a time because of the architecture of net/ceph/, but if write_partial_message_data() can be given a bvec[] at a time by the iteration code, this would allow pages to be sent in a batch. Signed-off-by: David Howells cc: Ilya Dryomov cc: Xiubo Li cc: Jeff Layton cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: ceph-devel@vger.kernel.org cc: netdev@vger.kernel.org --- net/ceph/messenger_v1.c | 58 ++++++++++++++--------------------------- 1 file changed, 19 insertions(+), 39 deletions(-) diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c index d664cb1593a7..f082e5c780a3 100644 --- a/net/ceph/messenger_v1.c +++ b/net/ceph/messenger_v1.c @@ -74,37 +74,6 @@ static int ceph_tcp_sendmsg(struct socket *sock, struct kvec *iov, return r; } -/* - * @more: either or both of MSG_MORE and MSG_SENDPAGE_NOTLAST - */ -static int ceph_tcp_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int more) -{ - ssize_t (*sendpage)(struct socket *sock, struct page *page, - int offset, size_t size, int flags); - int flags = MSG_DONTWAIT | MSG_NOSIGNAL | more; - int ret; - - /* - * sendpage cannot properly handle pages with page_count == 0, - * we need to fall back to sendmsg if that's the case. - * - * Same goes for slab pages: skb_can_coalesce() allows - * coalescing neighboring slab objects into a single frag which - * triggers one of hardened usercopy checks. - */ - if (sendpage_ok(page)) - sendpage = sock->ops->sendpage; - else - sendpage = sock_no_sendpage; - - ret = sendpage(sock, page, offset, size, flags); - if (ret == -EAGAIN) - ret = 0; - - return ret; -} - static void con_out_kvec_reset(struct ceph_connection *con) { BUG_ON(con->v1.out_skip); @@ -464,7 +433,6 @@ static int write_partial_message_data(struct ceph_connection *con) struct ceph_msg *msg = con->out_msg; struct ceph_msg_data_cursor *cursor = &msg->cursor; bool do_datacrc = !ceph_test_opt(from_msgr(con->msgr), NOCRC); - int more = MSG_MORE | MSG_SENDPAGE_NOTLAST; u32 crc; dout("%s %p msg %p\n", __func__, con, msg); @@ -482,6 +450,10 @@ static int write_partial_message_data(struct ceph_connection *con) */ crc = do_datacrc ? le32_to_cpu(msg->footer.data_crc) : 0; while (cursor->total_resid) { + struct bio_vec bvec; + struct msghdr msghdr = { + .msg_flags = MSG_SPLICE_PAGES, + }; struct page *page; size_t page_offset; size_t length; @@ -494,9 +466,12 @@ static int write_partial_message_data(struct ceph_connection *con) page = ceph_msg_data_next(cursor, &page_offset, &length); if (length == cursor->total_resid) - more = MSG_MORE; - ret = ceph_tcp_sendpage(con->sock, page, page_offset, length, - more); + msghdr.msg_flags |= MSG_MORE; + + bvec_set_page(&bvec, page, length, page_offset); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, length); + + ret = sock_sendmsg(con->sock, &msghdr); if (ret <= 0) { if (do_datacrc) msg->footer.data_crc = cpu_to_le32(crc); @@ -526,7 +501,10 @@ static int write_partial_message_data(struct ceph_connection *con) */ static int write_partial_skip(struct ceph_connection *con) { - int more = MSG_MORE | MSG_SENDPAGE_NOTLAST; + struct bio_vec bvec; + struct msghdr msghdr = { + .msg_flags = MSG_SPLICE_PAGES | MSG_MORE, + }; int ret; dout("%s %p %d left\n", __func__, con, con->v1.out_skip); @@ -534,9 +512,11 @@ static int write_partial_skip(struct ceph_connection *con) size_t size = min(con->v1.out_skip, (int)PAGE_SIZE); if (size == con->v1.out_skip) - more = MSG_MORE; - ret = ceph_tcp_sendpage(con->sock, ceph_zero_page, 0, size, - more); + msghdr.msg_flags &= ~MSG_MORE; + bvec_set_page(&bvec, ZERO_PAGE(0), size, 0); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size); + + ret = sock_sendmsg(con->sock, &msghdr); if (ret <= 0) goto out; con->v1.out_skip -= ret; From patchwork Tue Jun 20 14:53:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110575 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3744425vqr; Tue, 20 Jun 2023 08:22:36 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5HXiz8jTrxO5CLfsZAgs92jSSzXw3pv0xe1gDP0RHmXJ2X/IFVC/pgOuOJZ9zuPIRi3tP6 X-Received: by 2002:a17:902:d682:b0:1b2:a63:9587 with SMTP id v2-20020a170902d68200b001b20a639587mr3071260ply.36.1687274555651; Tue, 20 Jun 2023 08:22:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687274555; cv=none; d=google.com; s=arc-20160816; b=rfz59X7fWi0VpE9pP+5wzSTddecWKYLTWqeuBfVEe4QarjQtORUcqe0Gv/dbK7jzn7 1zujR0m37yThuA2uHI1F46EkgW7c7rRiW6T2+NftrPkwPU4oBs4hpCfogpCfch84U3HG kUmvIgA16P8x4PCmr9TAcLHndq0v2iQScUdBkg+oMLbbd6xRaKYyZ6BUmEgk++l4KD24 nONqiDsQsb3wH3cG9g4ZyB3irfxPyUoD+61yFY7q9iaZbckslmn2blqczikPHVVxZ6Qx uOiG/WUaUaLTotPJgWZ5NH4HIS4ORoSUScZXqdPjWZXdlAcaOBqz1QcEPGDqvFXsPMQB 2UJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=XZXyJFmNcOo4OqBa8WBusNB3ZLLcguzSLjkBKTvuk5o=; b=G7VLLhjSFAFjpS917Rj5R1t/Y7pPZFJcPKuHR0PVdcPMKKqCG7eLR2QU9t4R6Ae8VT GMJTey4EdrMXd/7Kjjvfpq8qy0/3W76b5utBuAEPAJ6FuAi04a593c7Ule/88STCknIz bQeBtu64fismfetzqV3awJtw5Bm5rKtMC/7QzIykOsJSex18FuqDcqk4+0Efi/GDQZ6u Rph2cTnEQ8LFR1udXC1/5exbwQmxNh6UzYqD1qyL4iXYq9hgxloJAJi4NUPbmviT52u7 FxV8+PlFcJ6bOzFN3wKo6+9JEAGrJjCbCwLIlFSehphiWgm6wzXXjj+zIpThkdLToaUz eIOA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=U+zhXybH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id p15-20020a170902e74f00b001ac78ac2c9csi2297590plf.573.2023.06.20.08.22.23; Tue, 20 Jun 2023 08:22:35 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=U+zhXybH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229966AbjFTO4s (ORCPT + 99 others); Tue, 20 Jun 2023 10:56:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58788 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232968AbjFTO4q (ORCPT ); Tue, 20 Jun 2023 10:56:46 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B15E31704 for ; Tue, 20 Jun 2023 07:56:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687272960; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XZXyJFmNcOo4OqBa8WBusNB3ZLLcguzSLjkBKTvuk5o=; b=U+zhXybHpvclf4SieIEKrFqH9BP4BnZfH/cs92Or04UW0m5xYCeybhS94M/9wppxA8kHS5 GboiIoYAQDAd0Ahsel9b31Miiguy/4GhWPwLWqkEKGqkh2jtwnrIdbMxZrFkdP4v8xHUpx 5QvkvAlcLL+lvgj7JQ/j3QBZ7iikb3Q= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-549-puJs86kMOViK6oXJMspZfg-1; Tue, 20 Jun 2023 10:55:52 -0400 X-MC-Unique: puJs86kMOViK6oXJMspZfg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9CF2B3C10EE0; Tue, 20 Jun 2023 14:53:57 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 372FB112132E; Tue, 20 Jun 2023 14:53:56 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next v3 06/18] net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock() Date: Tue, 20 Jun 2023 15:53:25 +0100 Message-ID: <20230620145338.1300897-7-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769235604543194001?= X-GMAIL-MSGID: =?utf-8?q?1769235604543194001?= Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage in skb_send_sock(). This causes pages to be spliced from the source iterator if possible. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Note that this could perhaps be improved to fill out a bvec array with all the frags and then make a single sendmsg call, possibly sticking the header on the front also. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- Notes: ver #2) - Wrap lines at 80. net/core/skbuff.c | 50 ++++++++++++++++++++++++++--------------------- 1 file changed, 28 insertions(+), 22 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 36605510a76d..ee2fc8fe31cb 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -2990,32 +2990,32 @@ int skb_splice_bits(struct sk_buff *skb, struct sock *sk, unsigned int offset, } EXPORT_SYMBOL_GPL(skb_splice_bits); -static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg, - struct kvec *vec, size_t num, size_t size) +static int sendmsg_locked(struct sock *sk, struct msghdr *msg) { struct socket *sock = sk->sk_socket; + size_t size = msg_data_left(msg); if (!sock) return -EINVAL; - return kernel_sendmsg(sock, msg, vec, num, size); + + if (!sock->ops->sendmsg_locked) + return sock_no_sendmsg_locked(sk, msg, size); + + return sock->ops->sendmsg_locked(sk, msg, size); } -static int sendpage_unlocked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) +static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg) { struct socket *sock = sk->sk_socket; if (!sock) return -EINVAL; - return kernel_sendpage(sock, page, offset, size, flags); + return sock_sendmsg(sock, msg); } -typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg, - struct kvec *vec, size_t num, size_t size); -typedef int (*sendpage_func)(struct sock *sk, struct page *page, int offset, - size_t size, int flags); +typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg); static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, - int len, sendmsg_func sendmsg, sendpage_func sendpage) + int len, sendmsg_func sendmsg) { unsigned int orig_len = len; struct sk_buff *head = skb; @@ -3035,8 +3035,9 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, memset(&msg, 0, sizeof(msg)); msg.msg_flags = MSG_DONTWAIT; - ret = INDIRECT_CALL_2(sendmsg, kernel_sendmsg_locked, - sendmsg_unlocked, sk, &msg, &kv, 1, slen); + iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, &kv, 1, slen); + ret = INDIRECT_CALL_2(sendmsg, sendmsg_locked, + sendmsg_unlocked, sk, &msg); if (ret <= 0) goto error; @@ -3067,11 +3068,18 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, slen = min_t(size_t, len, skb_frag_size(frag) - offset); while (slen) { - ret = INDIRECT_CALL_2(sendpage, kernel_sendpage_locked, - sendpage_unlocked, sk, - skb_frag_page(frag), - skb_frag_off(frag) + offset, - slen, MSG_DONTWAIT); + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT, + }; + + bvec_set_page(&bvec, skb_frag_page(frag), slen, + skb_frag_off(frag) + offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, + slen); + + ret = INDIRECT_CALL_2(sendmsg, sendmsg_locked, + sendmsg_unlocked, sk, &msg); if (ret <= 0) goto error; @@ -3108,16 +3116,14 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset, int len) { - return __skb_send_sock(sk, skb, offset, len, kernel_sendmsg_locked, - kernel_sendpage_locked); + return __skb_send_sock(sk, skb, offset, len, sendmsg_locked); } EXPORT_SYMBOL_GPL(skb_send_sock_locked); /* Send skb data on a socket. Socket must be unlocked. */ int skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, int len) { - return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked, - sendpage_unlocked); + return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked); } /** From patchwork Tue Jun 20 14:53:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110573 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3743678vqr; Tue, 20 Jun 2023 08:21:23 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6TjZ6qLbvCqcAXRkj7c2dnzVwTOZh36cgK992OTQy55wQOPR1vKoUK9ScahLPirOejKd1Z X-Received: by 2002:a17:902:6b88:b0:1b5:553e:4ea1 with SMTP id p8-20020a1709026b8800b001b5553e4ea1mr6901213plk.1.1687274482640; Tue, 20 Jun 2023 08:21:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687274482; cv=none; d=google.com; s=arc-20160816; b=x0YSXfsNbc8c2p8uKYp8T4rD2gd3GB0W6oYP5p8gMDc0lBGPS3WjTDsnk5HPnYakrk FG7UY9nhintapvZiiXNzJQLX1YMTTcxMrphS3h25leP0kDC3XgUKg7fEMk36pMFkeOFK 7o8IUuSekJJXNOBCEFFY8j+dRo2wvuLtFFSyPwfe8N6+kkTbhw1+NZGntNnPnT37EFHY oQ9xfKdn0MOSSESXjxgPsUyB6KPhdU0DtdH2QqRtJ+Fh4HMI+zjx4sTha0qulCUO1PFJ JKusjcE8zr3okSuxqY9QAVA5RWyswMszdauuJCDqGcXsJiMcCHQz+9ThNIVhyboglBs/ ZwQQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=fSQZQ6YboxRaXsiohOL5R9kf0uRRv2lhrE9A+pyCn4U=; b=mI7chQM19pkLMWzHaLHTA7jH6dS2vZ4rp0owrdftFUnafK0zy0/e7PuAg6thXvG8nG dGS2DJoh0G+gYObvO6s2BrRnBV2BXOouY/1Hdc1WfDEJOlB2hqmwmuBEzZsYMV1WyYkL IxxDOPqyJ9hg9rx3KMBJMXQ+Nn9e/ZEbkgZu9q4VhxeaPchF0/NdAQCi2Rw1gtI9hM88 zQdNKRBNRamTItSx9PcT02Qc6vVgDd3w+fFkWs+qdO8w60pVEtOLfT5q+fwlGb+dNKDG +Z4p5IslKFAd8WQvdLkxwEjHwP2/c7BPnHMgyDehtbipkpj+npXGAfxeloYe+ksmNTPh eLIQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Usg6ZIJ1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id c2-20020a170902d90200b001a043389a7asi1961899plz.310.2023.06.20.08.21.07; Tue, 20 Jun 2023 08:21:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Usg6ZIJ1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233241AbjFTO6K (ORCPT + 99 others); Tue, 20 Jun 2023 10:58:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60168 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233120AbjFTO6B (ORCPT ); Tue, 20 Jun 2023 10:58:01 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E1D01BC3 for ; Tue, 20 Jun 2023 07:56:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273014; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fSQZQ6YboxRaXsiohOL5R9kf0uRRv2lhrE9A+pyCn4U=; b=Usg6ZIJ1tpq5w2025EsZFUS2u94Xe8TCMeUWYlJsezLKBAK/2ufKrdD567QkZJd7EpUkpO /PQxFb7ft3JbPXPqFSCXj2P6tFm0xGCFR/40rC1q2iX/0AICdMzQha3+R6vD/OTmPpNAtF WWuV4gm/PFEzCwh0c9Mjv2FrhXcg9zk= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-629-Z9ENlngsPYSmfCtfvR07VA-1; Tue, 20 Jun 2023 10:56:46 -0400 X-MC-Unique: Z9ENlngsPYSmfCtfvR07VA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 157BC3823A0C; Tue, 20 Jun 2023 14:54:18 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 30210200C0DA; Tue, 20 Jun 2023 14:53:58 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Ilya Dryomov , Xiubo Li , Jeff Layton , ceph-devel@vger.kernel.org Subject: [PATCH net-next v3 07/18] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() Date: Tue, 20 Jun 2023 15:53:26 +0100 Message-ID: <20230620145338.1300897-8-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769235527697824910?= X-GMAIL-MSGID: =?utf-8?q?1769235527697824910?= Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when transmitting data. For the moment, this can only transmit one page at a time because of the architecture of net/ceph/, but if write_partial_message_data() can be given a bvec[] at a time by the iteration code, this would allow pages to be sent in a batch. Signed-off-by: David Howells cc: Ilya Dryomov cc: Xiubo Li cc: Jeff Layton cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: ceph-devel@vger.kernel.org cc: netdev@vger.kernel.org --- net/ceph/messenger_v2.c | 91 +++++++++-------------------------------- 1 file changed, 19 insertions(+), 72 deletions(-) diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c index 301a991dc6a6..87ac97073e75 100644 --- a/net/ceph/messenger_v2.c +++ b/net/ceph/messenger_v2.c @@ -117,91 +117,38 @@ static int ceph_tcp_recv(struct ceph_connection *con) return ret; } -static int do_sendmsg(struct socket *sock, struct iov_iter *it) -{ - struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS }; - int ret; - - msg.msg_iter = *it; - while (iov_iter_count(it)) { - ret = sock_sendmsg(sock, &msg); - if (ret <= 0) { - if (ret == -EAGAIN) - ret = 0; - return ret; - } - - iov_iter_advance(it, ret); - } - - WARN_ON(msg_data_left(&msg)); - return 1; -} - -static int do_try_sendpage(struct socket *sock, struct iov_iter *it) -{ - struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS }; - struct bio_vec bv; - int ret; - - if (WARN_ON(!iov_iter_is_bvec(it))) - return -EINVAL; - - while (iov_iter_count(it)) { - /* iov_iter_iovec() for ITER_BVEC */ - bvec_set_page(&bv, it->bvec->bv_page, - min(iov_iter_count(it), - it->bvec->bv_len - it->iov_offset), - it->bvec->bv_offset + it->iov_offset); - - /* - * sendpage cannot properly handle pages with - * page_count == 0, we need to fall back to sendmsg if - * that's the case. - * - * Same goes for slab pages: skb_can_coalesce() allows - * coalescing neighboring slab objects into a single frag - * which triggers one of hardened usercopy checks. - */ - if (sendpage_ok(bv.bv_page)) { - ret = sock->ops->sendpage(sock, bv.bv_page, - bv.bv_offset, bv.bv_len, - CEPH_MSG_FLAGS); - } else { - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, bv.bv_len); - ret = sock_sendmsg(sock, &msg); - } - if (ret <= 0) { - if (ret == -EAGAIN) - ret = 0; - return ret; - } - - iov_iter_advance(it, ret); - } - - return 1; -} - /* * Write as much as possible. The socket is expected to be corked, - * so we don't bother with MSG_MORE/MSG_SENDPAGE_NOTLAST here. + * so we don't bother with MSG_MORE here. * * Return: - * 1 - done, nothing (else) to write + * >0 - done, nothing (else) to write * 0 - socket is full, need to wait * <0 - error */ static int ceph_tcp_send(struct ceph_connection *con) { + struct msghdr msg = { + .msg_iter = con->v2.out_iter, + .msg_flags = CEPH_MSG_FLAGS, + }; int ret; + if (WARN_ON(!iov_iter_is_bvec(&con->v2.out_iter))) + return -EINVAL; + + if (con->v2.out_iter_sendpage) + msg.msg_flags |= MSG_SPLICE_PAGES; + dout("%s con %p have %zu try_sendpage %d\n", __func__, con, iov_iter_count(&con->v2.out_iter), con->v2.out_iter_sendpage); - if (con->v2.out_iter_sendpage) - ret = do_try_sendpage(con->sock, &con->v2.out_iter); - else - ret = do_sendmsg(con->sock, &con->v2.out_iter); + + ret = sock_sendmsg(con->sock, &msg); + if (ret > 0) + iov_iter_advance(&con->v2.out_iter, ret); + else if (ret == -EAGAIN) + ret = 0; + dout("%s con %p ret %d left %zu\n", __func__, con, ret, iov_iter_count(&con->v2.out_iter)); return ret; From patchwork Tue Jun 20 14:53:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110552 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3727154vqr; Tue, 20 Jun 2023 08:01:05 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4zRSCFSighk1eTlZB4sbncbNKoRiFVujh6k1ecm3+0OUwIFe+C1MjNLb++vUK+SZjGB2Fq X-Received: by 2002:a05:6a20:1589:b0:11d:322a:ce34 with SMTP id h9-20020a056a20158900b0011d322ace34mr3722656pzj.31.1687273264837; Tue, 20 Jun 2023 08:01:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687273264; cv=none; d=google.com; s=arc-20160816; b=txUhIV5XfudzKlCEYtwdWtF9zYuuRsnrnreNg4i71DwNt90liSq+1CA1vtI/4p+h+p dGgHzRZ1Da8qPPycPXQsyFzV5wZEBR5xplaLxj2FsASaiTY8RtZ7KUH22PfeTRDO/c2N z+9v/aJRJLqZqqRFJby2erZpHoFBBYV+b97WaVEN1H2yYdpDmwJx71EXYukkSj3QSBYu SquB+4IDT14MHTjzRgMfoMmvgdxOSwYQ4zeTJxjpKPC+xfMM4h/pqlnnRwa41MGn2Qim 0AQM1aaXZSGw+mEdthziO3IBpo22+gp4Onlb+7uunJ04c4D1ab5qkzbrdCHp2y1uh6Yk mB9w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Fr8R9cUCocDj4LYXtGmeXkCgA5OgO4hqxYPqdNjfTX8=; b=My9XiJan+L3kEIh4Z8t4tqOkxqYCv5+/wCVgqz/g8KZV81p/Nbc4WAn6S+KgHYr5HE By0id3OOnNzPGpc1y/Kry99AGrk72sRGCTqH4Wi5jDRpJ1ch4ZZA1JPonghOTySgHw/4 FF2U3N3L0a1cKYIwASl/e7zzW9ZTGaWBi5ctJlcjwQTB9bIel0sjdqRFZhUyKYAhSziU rwLmHWQFBeep+VBTsCGzWj//KDs3wyzNXgQJ4qRq8pXDOyEzK/OCNaAen0lBTaQxZf82 wrqncgTsmQ/O+08KsG8+M/YojxPQeBda+etK5mH6s/v4FRzOlR4llmHA/ILdEf5nhGdo t4ig== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=K11U+9Mq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id k3-20020a170902f28300b001ab0c00aec4si1976679plc.482.2023.06.20.08.00.44; Tue, 20 Jun 2023 08:01:04 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=K11U+9Mq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232905AbjFTO5f (ORCPT + 99 others); Tue, 20 Jun 2023 10:57:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232970AbjFTO52 (ORCPT ); Tue, 20 Jun 2023 10:57:28 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 83CC11989 for ; Tue, 20 Jun 2023 07:56:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687272996; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Fr8R9cUCocDj4LYXtGmeXkCgA5OgO4hqxYPqdNjfTX8=; b=K11U+9MqYcMS/Ny2dYdvbS9vqB/ZmJFQs3131VgiGMKgXUGuyOHaCSa7u2ZpdVDmB1KfUJ YwWTedBB5V1nYxxncvx36o17ft+HmoDeRRX/LXvniPbOWoWktyEZm0LMYlYUwi+5fvXta9 ByesKlN3mzAXlH5oux2oSnOIK4dI17g= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-264-MX17dbQIPO-VqzbjHhAp6A-1; Tue, 20 Jun 2023 10:56:30 -0400 X-MC-Unique: MX17dbQIPO-VqzbjHhAp6A-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 767F18D142C; Tue, 20 Jun 2023 14:54:20 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id B6327C1ED97; Tue, 20 Jun 2023 14:54:18 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Santosh Shilimkar , linux-rdma@vger.kernel.org, rds-devel@oss.oracle.com Subject: [PATCH net-next v3 08/18] rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Tue, 20 Jun 2023 15:53:27 +0100 Message-ID: <20230620145338.1300897-9-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769234251028923708?= X-GMAIL-MSGID: =?utf-8?q?1769234251028923708?= When transmitting data, call down into TCP using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing several sendmsg and sendpage calls to transmit header and data pages. To make this work, the data is assembled in a bio_vec array and attached to a BVEC-type iterator. The header are copied into memory acquired from zcopy_alloc() which just breaks a page up into small pieces that can be freed with put_page(). Signed-off-by: David Howells cc: Santosh Shilimkar cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-rdma@vger.kernel.org cc: rds-devel@oss.oracle.com cc: netdev@vger.kernel.org --- net/rds/tcp_send.c | 74 +++++++++++++++++----------------------------- 1 file changed, 27 insertions(+), 47 deletions(-) diff --git a/net/rds/tcp_send.c b/net/rds/tcp_send.c index 8c4d1d6e9249..550390d5ff2b 100644 --- a/net/rds/tcp_send.c +++ b/net/rds/tcp_send.c @@ -52,29 +52,23 @@ void rds_tcp_xmit_path_complete(struct rds_conn_path *cp) tcp_sock_set_cork(tc->t_sock->sk, false); } -/* the core send_sem serializes this with other xmit and shutdown */ -static int rds_tcp_sendmsg(struct socket *sock, void *data, unsigned int len) -{ - struct kvec vec = { - .iov_base = data, - .iov_len = len, - }; - struct msghdr msg = { - .msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL, - }; - - return kernel_sendmsg(sock, &msg, &vec, 1, vec.iov_len); -} - /* the core send_sem serializes this with other xmit and shutdown */ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm, unsigned int hdr_off, unsigned int sg, unsigned int off) { struct rds_conn_path *cp = rm->m_inc.i_conn_path; struct rds_tcp_connection *tc = cp->cp_transport_data; + struct msghdr msg = { + .msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT | MSG_NOSIGNAL, + }; + struct bio_vec *bvec; + unsigned int i, size = 0, ix = 0; int done = 0; - int ret = 0; - int more; + int ret = -ENOMEM; + + bvec = kmalloc_array(1 + sg, sizeof(struct bio_vec), GFP_KERNEL); + if (!bvec) + goto out; if (hdr_off == 0) { /* @@ -101,41 +95,26 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm, /* see rds_tcp_write_space() */ set_bit(SOCK_NOSPACE, &tc->t_sock->sk->sk_socket->flags); - ret = rds_tcp_sendmsg(tc->t_sock, - (void *)&rm->m_inc.i_hdr + hdr_off, - sizeof(rm->m_inc.i_hdr) - hdr_off); - if (ret < 0) - goto out; - done += ret; - if (hdr_off + done != sizeof(struct rds_header)) - goto out; + bvec_set_virt(&bvec[ix], (void *)&rm->m_inc.i_hdr + hdr_off, + sizeof(rm->m_inc.i_hdr) - hdr_off); + size += bvec[ix].bv_len; + ix++; } - more = rm->data.op_nents > 1 ? (MSG_MORE | MSG_SENDPAGE_NOTLAST) : 0; - while (sg < rm->data.op_nents) { - int flags = MSG_DONTWAIT | MSG_NOSIGNAL | more; - - ret = tc->t_sock->ops->sendpage(tc->t_sock, - sg_page(&rm->data.op_sg[sg]), - rm->data.op_sg[sg].offset + off, - rm->data.op_sg[sg].length - off, - flags); - rdsdebug("tcp sendpage %p:%u:%u ret %d\n", (void *)sg_page(&rm->data.op_sg[sg]), - rm->data.op_sg[sg].offset + off, rm->data.op_sg[sg].length - off, - ret); - if (ret <= 0) - break; - - off += ret; - done += ret; - if (off == rm->data.op_sg[sg].length) { - off = 0; - sg++; - } - if (sg == rm->data.op_nents - 1) - more = 0; + for (i = sg; i < rm->data.op_nents; i++) { + bvec_set_page(&bvec[ix], + sg_page(&rm->data.op_sg[i]), + rm->data.op_sg[i].length - off, + rm->data.op_sg[i].offset + off); + off = 0; + size += bvec[ix].bv_len; + ix++; } + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, ix, size); + ret = sock_sendmsg(tc->t_sock, &msg); + rdsdebug("tcp sendmsg-splice %u,%u ret %d\n", ix, size, ret); + out: if (ret <= 0) { /* write_space will hit after EAGAIN, all else fatal */ @@ -158,6 +137,7 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm, } if (done == 0) done = ret; + kfree(bvec); return done; } From patchwork Tue Jun 20 14:53:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110557 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3735602vqr; Tue, 20 Jun 2023 08:10:30 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7kuuA7kvu475Sc9ZALaTRnNpPKC5A6BJwsGoNuYc8TU5kGxfvEOSNIVlVY8Gm8a8SzzyH1 X-Received: by 2002:a9d:6a15:0:b0:6b5:887d:745e with SMTP id g21-20020a9d6a15000000b006b5887d745emr4187393otn.12.1687273830572; Tue, 20 Jun 2023 08:10:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687273830; cv=none; d=google.com; s=arc-20160816; b=PgQ+gm4g88gOA6ZJORFxU6TQPs3dI4inZahUdZTfi9zGsWBAkBiCF6FhmkYa0RUZP8 GvFOetYVqFtsSEjFZQnqmM7Vuxv1x9PD/rGFYJLxAviagHq46rbCT+S/dpnUFrfFZ8Ma F7FQcg3FPQmPTrs91DLCzcci3WjxUG8MU46pojb0T+d7eKABvnHI4VEHPG4eOXnxJf46 oUDmxx4mIeXMnKqq6/LMZ/vTkCkQtP6m0a77F5Oh5Nd0nyZj0SFVhhEHl21U9qxrA9wi Nvi8yRPWet2WgzG+QnP+8gskDAmVzHn2PhS32mPR7FtKBT6JIxk8gG0ij+pncLDfVmVK T7aw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=CX0szcw738ZS8htZ0RxEmjBFgUMhbKSUPMuc1XxmplE=; b=eBroZreMXMPUWA4GThA0pK8waDWAE1gE40DYsY4ySqZTwuZNbtNgRpqRFCUpl11kzu Xd+PwFyMVguBMkE41+armuzSoC6JI4yagTFM1MgBNApzLdt09iC17E4f9ZgwXHk1u7DD JLMdao5IeXvo3RognnBQXLD6dQDHGP3lXkUQKdbd6gc9Asx3glsdb57tPRVnbsds8ySY GTfmsw2c7pIEDOhGtxaG6FQA4uuWNxTEI/WunyiSDpzxuFgHnD1yqjvAa0CBExaEubjo k4SRHSlZhd7dhM4BvPd1JYvnM3sm8ScuavwFnQ2Zw+n3LhbvhwfvNK5h2rIFKjLfkJij EUyQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=a1k7jxLA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i124-20020a639d82000000b00553b9b18870si1852163pgd.570.2023.06.20.08.10.16; Tue, 20 Jun 2023 08:10:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=a1k7jxLA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233146AbjFTO6F (ORCPT + 99 others); Tue, 20 Jun 2023 10:58:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60154 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233129AbjFTO6B (ORCPT ); Tue, 20 Jun 2023 10:58:01 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5FD1719A8 for ; Tue, 20 Jun 2023 07:56:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273012; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CX0szcw738ZS8htZ0RxEmjBFgUMhbKSUPMuc1XxmplE=; b=a1k7jxLAseAsNhZgzT9Uz2VAR9v/cfZVytBwbsHnNpJfCP5B7OVptpy5WBzKyaVh7/iAQe 8cnUNMkNT125J8wyzkmkNcH+QFkIJgDdI7qJeevsWcaGbZ/cAVXy42SsDOu9vkG2BIhhmp 43R/+yREKB+XBcXHg1tcmaXMlbOKEtA= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-398-CpGHSxxIP-O4lq_Pb2IfNQ-1; Tue, 20 Jun 2023 10:56:37 -0400 X-MC-Unique: CpGHSxxIP-O4lq_Pb2IfNQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id BD2BE8A51E9; Tue, 20 Jun 2023 14:54:22 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0B0949E9C; Tue, 20 Jun 2023 14:54:20 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Christine Caulfield , David Teigland , cluster-devel@redhat.com Subject: [PATCH net-next v3 09/18] dlm: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Tue, 20 Jun 2023 15:53:28 +0100 Message-ID: <20230620145338.1300897-10-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769234844021136023?= X-GMAIL-MSGID: =?utf-8?q?1769234844021136023?= When transmitting data, call down a layer using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather using sendpage. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Christine Caulfield cc: David Teigland cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: cluster-devel@redhat.com cc: netdev@vger.kernel.org --- fs/dlm/lowcomms.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c index 3d3802c47b8b..5c12d8cdfc16 100644 --- a/fs/dlm/lowcomms.c +++ b/fs/dlm/lowcomms.c @@ -1395,8 +1395,11 @@ int dlm_lowcomms_resend_msg(struct dlm_msg *msg) /* Send a message */ static int send_to_sock(struct connection *con) { - const int msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL; struct writequeue_entry *e; + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT | MSG_NOSIGNAL, + }; int len, offset, ret; spin_lock_bh(&con->writequeue_lock); @@ -1412,8 +1415,9 @@ static int send_to_sock(struct connection *con) WARN_ON_ONCE(len == 0 && e->users == 0); spin_unlock_bh(&con->writequeue_lock); - ret = kernel_sendpage(con->sock, e->page, offset, len, - msg_flags); + bvec_set_page(&bvec, e->page, len, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len); + ret = sock_sendmsg(con->sock, &msg); trace_dlm_send(con->nodeid, ret); if (ret == -EAGAIN || ret == 0) { lock_sock(con->sock->sk); From patchwork Tue Jun 20 14:53:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110570 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3742303vqr; Tue, 20 Jun 2023 08:19:20 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4xFELqYQPiM64j84s0U5zcfYvX4ZNwdVGxkdsxRze7HeCVI0+LfqO9qihMHBB2gVQMa0mR X-Received: by 2002:a17:90a:356:b0:25b:e07f:4c43 with SMTP id 22-20020a17090a035600b0025be07f4c43mr12392594pjf.10.1687274360227; Tue, 20 Jun 2023 08:19:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687274360; cv=none; d=google.com; s=arc-20160816; b=rqi9PbVO5eAQjdgxCT+uqxY4XTDcYmtkdx+S5XrEueZn4+ZPcWOgvnAPNWLLtR3m1K qeDWUQGbgFKtXQ7MMnlE+U1ITLDlGs/pJomV0PWRseCnHiIYsAQYbPUlU1BpwO2DIw6O Cv3b7O9tgNeA48NqKcrbapJ1oJFzGpvpTidReWVnqRHSKv1PS0beogaR60mlexEQC6CP /CNWXb16b0rzXPmZkrKAWcP0CDsdhDtzlAWOYjHzMvTB3d/Y4eG6HzJSQ5CmNQv8rR+D mVwYJqevsvm7EOsdKmpb7Hmr2NfllwZC0DNSww80N5uzQE/eEWOnBzYd4nAc9t3dMN3d oU1A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=kX6Xzyb5DYMqpSrTmxybyUGZQ2hNwo35+Ji3qVw7rrY=; b=kv4BWBzJCwwDq6YF7uuKn/ke4F3MV/x7cmSOCD/5N2qx/6jiFGFkdTb5sZfq+fCZ6/ D6zYJVEYWb9Bq4XtoVquxjpaCApaIfhavtjrWEXgsFo69/5amA07Ivef3OeWzU3khJBM h7iqjZunc0ArLuHv8c8meNHisA0I8nJzQYK5vK8je6rwbyYxwhieopOBFFyqB0fZQz4t EqWzu0415oqHEhMZXLwlTpDFbN+cG/mZqaMuld4utuwIdR6YTINPO0m2pswSY43eKO62 Ok4wR9lHG7ag1wFrbmCWn3B3Dvbivdd1jtg1pKPjgLWRwKXYxCBSgWI9EfW9e4ygSYRB CJLg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XtfZnubo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id c4-20020a17090a8d0400b0025979e8c246si2155323pjo.70.2023.06.20.08.18.59; Tue, 20 Jun 2023 08:19:20 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XtfZnubo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233461AbjFTO7G (ORCPT + 99 others); Tue, 20 Jun 2023 10:59:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60744 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232859AbjFTO6l (ORCPT ); Tue, 20 Jun 2023 10:58:41 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A09E91BEB for ; Tue, 20 Jun 2023 07:57:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273046; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kX6Xzyb5DYMqpSrTmxybyUGZQ2hNwo35+Ji3qVw7rrY=; b=XtfZnuboBiIr6C5qukcWb9WHvFuYiruLl5511F1GWJtxEnbGG4Ollh3DAI+sMc2+OzVpQv 1SzFP8u8mNff5+9KBv5ocN6UPE7voghzgCpiCNE99V7CE26TSSBmVzKweL33UpuTACKJqG ZOsLqBFePTMPh2j8BqrW6ktttUwJ4K0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-481-ICwZLUEsPqKT1b5qYpONaQ-1; Tue, 20 Jun 2023 10:57:22 -0400 X-MC-Unique: ICwZLUEsPqKT1b5qYpONaQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6D7458B5AF6; Tue, 20 Jun 2023 14:54:25 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 554F914682FA; Tue, 20 Jun 2023 14:54:23 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sagi Grimberg , Willem de Bruijn , Keith Busch , Jens Axboe , Christoph Hellwig , Chaitanya Kulkarni , linux-nvme@lists.infradead.org Subject: [PATCH net-next v3 10/18] nvme/host: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage Date: Tue, 20 Jun 2023 15:53:29 +0100 Message-ID: <20230620145338.1300897-11-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769235399528125661?= X-GMAIL-MSGID: =?utf-8?q?1769235399528125661?= When transmitting data, call down into TCP using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing several sendmsg and sendpage calls to transmit header, data pages and trailer. Signed-off-by: David Howells Tested-by: Sagi Grimberg Acked-by: Willem de Bruijn cc: Keith Busch cc: Jens Axboe cc: Christoph Hellwig cc: Chaitanya Kulkarni cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-nvme@lists.infradead.org cc: netdev@vger.kernel.org Reviewed-by: Sagi Grimberg --- Notes: ver #2) - Wrap lines at 80. ver #3) - Split nvme/host from nvme/target changes. drivers/nvme/host/tcp.c | 46 +++++++++++++++++++++-------------------- 1 file changed, 24 insertions(+), 22 deletions(-) diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index bf0230442d57..6f31cdbb696a 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -997,25 +997,25 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req) u32 h2cdata_left = req->h2cdata_left; while (true) { + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, + }; struct page *page = nvme_tcp_req_cur_page(req); size_t offset = nvme_tcp_req_cur_offset(req); size_t len = nvme_tcp_req_cur_length(req); bool last = nvme_tcp_pdu_last_send(req, len); int req_data_sent = req->data_sent; - int ret, flags = MSG_DONTWAIT; + int ret; if (last && !queue->data_digest && !nvme_tcp_queue_more(queue)) - flags |= MSG_EOR; + msg.msg_flags |= MSG_EOR; else - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE; - if (sendpage_ok(page)) { - ret = kernel_sendpage(queue->sock, page, offset, len, - flags); - } else { - ret = sock_no_sendpage(queue->sock, page, offset, len, - flags); - } + bvec_set_page(&bvec, page, len, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len); + ret = sock_sendmsg(queue->sock, &msg); if (ret <= 0) return ret; @@ -1054,22 +1054,24 @@ static int nvme_tcp_try_send_cmd_pdu(struct nvme_tcp_request *req) { struct nvme_tcp_queue *queue = req->queue; struct nvme_tcp_cmd_pdu *pdu = nvme_tcp_req_cmd_pdu(req); + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, }; bool inline_data = nvme_tcp_has_inline_data(req); u8 hdgst = nvme_tcp_hdgst_len(queue); int len = sizeof(*pdu) + hdgst - req->offset; - int flags = MSG_DONTWAIT; int ret; if (inline_data || nvme_tcp_queue_more(queue)) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE; else - flags |= MSG_EOR; + msg.msg_flags |= MSG_EOR; if (queue->hdr_digest && !req->offset) nvme_tcp_hdgst(queue->snd_hash, pdu, sizeof(*pdu)); - ret = kernel_sendpage(queue->sock, virt_to_page(pdu), - offset_in_page(pdu) + req->offset, len, flags); + bvec_set_virt(&bvec, (void *)pdu + req->offset, len); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len); + ret = sock_sendmsg(queue->sock, &msg); if (unlikely(ret <= 0)) return ret; @@ -1093,6 +1095,8 @@ static int nvme_tcp_try_send_data_pdu(struct nvme_tcp_request *req) { struct nvme_tcp_queue *queue = req->queue; struct nvme_tcp_data_pdu *pdu = nvme_tcp_req_data_pdu(req); + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_MORE, }; u8 hdgst = nvme_tcp_hdgst_len(queue); int len = sizeof(*pdu) - req->offset + hdgst; int ret; @@ -1101,13 +1105,11 @@ static int nvme_tcp_try_send_data_pdu(struct nvme_tcp_request *req) nvme_tcp_hdgst(queue->snd_hash, pdu, sizeof(*pdu)); if (!req->h2cdata_left) - ret = kernel_sendpage(queue->sock, virt_to_page(pdu), - offset_in_page(pdu) + req->offset, len, - MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST); - else - ret = sock_no_sendpage(queue->sock, virt_to_page(pdu), - offset_in_page(pdu) + req->offset, len, - MSG_DONTWAIT | MSG_MORE); + msg.msg_flags |= MSG_SPLICE_PAGES; + + bvec_set_virt(&bvec, (void *)pdu + req->offset, len); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len); + ret = sock_sendmsg(queue->sock, &msg); if (unlikely(ret <= 0)) return ret; From patchwork Tue Jun 20 14:53:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110564 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3739032vqr; Tue, 20 Jun 2023 08:14:52 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5cQ76gIhgYxCrCV84qdK8WOgPwpjtptQL0qv9tJDoIwra7BgDymnnzhF0FaZONWiRHSLv8 X-Received: by 2002:a05:6a20:94c9:b0:10f:9317:153a with SMTP id ht9-20020a056a2094c900b0010f9317153amr8617465pzb.62.1687274091981; Tue, 20 Jun 2023 08:14:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687274091; cv=none; d=google.com; s=arc-20160816; b=YRah8nVhWZkNCwFjCtD6WL1jGKXJXNSdeQT6XzMPDJ+5obxiKxZApf6Y5bqSEZGd8Z efBQGYvFUAS3VnoJ2JZe2HMYOpmQe7n19h7drsT6WFQgWqgKwdYH1ib0Qvi0d3wImulW Gpo7wEmX/Q+8FI3V9HEUVz0yMZDjHucuIg1SxPs3fSpfGa3RFFvpAmY1zerwxEplXezZ hxBeVZtwpJDA3wncI8qzqx5B7VSutx3mndWsGYaew110BlotVms9bntmspw3jklPIAML /M/FybH3mAAWNr2hr26ninItpouwPLyt5E7e2E/i5WmxF0se+s1qqdlRoYQ7ViEMl5nS fFXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=SOw9ClnSc9r47uoXD4terkn/PQVkRnQZDq9J97sUBoI=; b=jT3ZtRhZPTHaP+DULpQaVU2icuZyWczSs5iHU3I20aAYy9y35rKtcPAIwSFdU86W8T 5g8BkKWdqlAXP9jhkR5dryUAAMc0lKJuoPPv1pMbnFIS+TIdPM3S2FRVqM6NOufeGO+z 58vSFSNLEDvdBT/UyEnyRWNyKVkLFxjj9I0vmG67+ZuwkYwvCSxY/4I/aOB3lgRPDBjY fymjwQsB3PnrIfTAxMzIbzY2EYexafEncENkXnNbdfuklbzqmX9BtsSHSfK0RMREc4wD eEqIjBlKoVoEizIl3Gvu97wPbwWeIeW80rxw5pCqCOirAA9/LmMMBNmCJK2YfG2ptZLt jhdw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="KLim/gUi"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h14-20020aa79f4e000000b00666e5fec17esi1845897pfr.87.2023.06.20.08.14.39; Tue, 20 Jun 2023 08:14:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="KLim/gUi"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232353AbjFTO5r (ORCPT + 99 others); Tue, 20 Jun 2023 10:57:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59938 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233098AbjFTO5k (ORCPT ); Tue, 20 Jun 2023 10:57:40 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B10881729 for ; Tue, 20 Jun 2023 07:56:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273003; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SOw9ClnSc9r47uoXD4terkn/PQVkRnQZDq9J97sUBoI=; b=KLim/gUi0l3SOHX1BrkELsYt7F4xmkeAOIKx1i/B+ppua6BlUNhjW+6W70wLJkMI8KFrK9 LPMhbgxviY9/wrOFed2pfPw8kNAjOIShVolvArjxksLTopONybuedtQCMYQN3vf7BCoYYA t3IwYG2YOPCsZBfLKZMqm45Jdkj4qnI= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-207-8YJyb2soPem25NlFFJ-W_g-1; Tue, 20 Jun 2023 10:56:40 -0400 X-MC-Unique: 8YJyb2soPem25NlFFJ-W_g-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 27CFB8925DC; Tue, 20 Jun 2023 14:54:28 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 19014112132C; Tue, 20 Jun 2023 14:54:26 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sagi Grimberg , Willem de Bruijn , Keith Busch , Jens Axboe , Christoph Hellwig , Chaitanya Kulkarni , linux-nvme@lists.infradead.org Subject: [PATCH net-next v3 11/18] nvme/target: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage Date: Tue, 20 Jun 2023 15:53:30 +0100 Message-ID: <20230620145338.1300897-12-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769235117820509644?= X-GMAIL-MSGID: =?utf-8?q?1769235117820509644?= When transmitting data, call down into TCP using a single sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing several sendmsg and sendpage calls to transmit header, data pages and trailer. Signed-off-by: David Howells Tested-by: Sagi Grimberg Acked-by: Willem de Bruijn cc: Keith Busch cc: Jens Axboe cc: Christoph Hellwig cc: Chaitanya Kulkarni cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-nvme@lists.infradead.org cc: netdev@vger.kernel.org --- Notes: ver #3) - Split nvme/host from nvme/target changes. ver #2) - Wrap lines at 80. drivers/nvme/target/tcp.c | 46 ++++++++++++++++++++++++--------------- 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index ed98df72c76b..868aa4de2e4c 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -576,13 +576,17 @@ static void nvmet_tcp_execute_request(struct nvmet_tcp_cmd *cmd) static int nvmet_try_send_data_pdu(struct nvmet_tcp_cmd *cmd) { + struct msghdr msg = { + .msg_flags = MSG_DONTWAIT | MSG_MORE | MSG_SPLICE_PAGES, + }; + struct bio_vec bvec; u8 hdgst = nvmet_tcp_hdgst_len(cmd->queue); int left = sizeof(*cmd->data_pdu) - cmd->offset + hdgst; int ret; - ret = kernel_sendpage(cmd->queue->sock, virt_to_page(cmd->data_pdu), - offset_in_page(cmd->data_pdu) + cmd->offset, - left, MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST); + bvec_set_virt(&bvec, (void *)cmd->data_pdu + cmd->offset, left); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, left); + ret = sock_sendmsg(cmd->queue->sock, &msg); if (ret <= 0) return ret; @@ -603,17 +607,21 @@ static int nvmet_try_send_data(struct nvmet_tcp_cmd *cmd, bool last_in_batch) int ret; while (cmd->cur_sg) { + struct msghdr msg = { + .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, + }; struct page *page = sg_page(cmd->cur_sg); + struct bio_vec bvec; u32 left = cmd->cur_sg->length - cmd->offset; - int flags = MSG_DONTWAIT; if ((!last_in_batch && cmd->queue->send_list_len) || cmd->wbytes_done + left < cmd->req.transfer_len || queue->data_digest || !queue->nvme_sq.sqhd_disabled) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE; - ret = kernel_sendpage(cmd->queue->sock, page, cmd->offset, - left, flags); + bvec_set_page(&bvec, page, left, cmd->offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, left); + ret = sock_sendmsg(cmd->queue->sock, &msg); if (ret <= 0) return ret; @@ -649,18 +657,20 @@ static int nvmet_try_send_data(struct nvmet_tcp_cmd *cmd, bool last_in_batch) static int nvmet_try_send_response(struct nvmet_tcp_cmd *cmd, bool last_in_batch) { + struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, }; + struct bio_vec bvec; u8 hdgst = nvmet_tcp_hdgst_len(cmd->queue); int left = sizeof(*cmd->rsp_pdu) - cmd->offset + hdgst; - int flags = MSG_DONTWAIT; int ret; if (!last_in_batch && cmd->queue->send_list_len) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE; else - flags |= MSG_EOR; + msg.msg_flags |= MSG_EOR; - ret = kernel_sendpage(cmd->queue->sock, virt_to_page(cmd->rsp_pdu), - offset_in_page(cmd->rsp_pdu) + cmd->offset, left, flags); + bvec_set_virt(&bvec, (void *)cmd->rsp_pdu + cmd->offset, left); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, left); + ret = sock_sendmsg(cmd->queue->sock, &msg); if (ret <= 0) return ret; cmd->offset += ret; @@ -677,18 +687,20 @@ static int nvmet_try_send_response(struct nvmet_tcp_cmd *cmd, static int nvmet_try_send_r2t(struct nvmet_tcp_cmd *cmd, bool last_in_batch) { + struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES, }; + struct bio_vec bvec; u8 hdgst = nvmet_tcp_hdgst_len(cmd->queue); int left = sizeof(*cmd->r2t_pdu) - cmd->offset + hdgst; - int flags = MSG_DONTWAIT; int ret; if (!last_in_batch && cmd->queue->send_list_len) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE; else - flags |= MSG_EOR; + msg.msg_flags |= MSG_EOR; - ret = kernel_sendpage(cmd->queue->sock, virt_to_page(cmd->r2t_pdu), - offset_in_page(cmd->r2t_pdu) + cmd->offset, left, flags); + bvec_set_virt(&bvec, (void *)cmd->r2t_pdu + cmd->offset, left); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, left); + ret = sock_sendmsg(cmd->queue->sock, &msg); if (ret <= 0) return ret; cmd->offset += ret; From patchwork Tue Jun 20 14:53:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110554 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3734636vqr; Tue, 20 Jun 2023 08:09:18 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4LdnacHLETCc8ES7KrSf1ZFuCPjgodLtX4EFbx74KRfLQB8uuzulKrvUXBLyJQlIHscPEy X-Received: by 2002:a05:6a00:1a56:b0:64c:c453:244f with SMTP id h22-20020a056a001a5600b0064cc453244fmr10646252pfv.15.1687273758230; Tue, 20 Jun 2023 08:09:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687273758; cv=none; d=google.com; s=arc-20160816; b=bSQw0/9DjR4cAMa5pQV7+vN/8QZwy3gHVVYGYFeIXrBCcnfFpIQ/ksJr1Chzh0nYsv 5ck+PI7ItjnBqOge0HatgS5hpxRW0kr1/0384oGbmKXjTy0jyUvhAeEi3ciwBWXhoJ4w KLZySL5eIYxV5ou9QUm3PEX2otvb9YeslA+x9P+ioq/o/YvKhxVbM1sUqWt61VWMF9JX F7QEjLuj1KeH6o9MQu9zdawB8SQcswPeJqUvinx5ZS4xxXxqPBJosETRGmTh28lMd1WZ DDUmGXGYJEe7vFmNHI6T5mIEdZd8TyGorHOuVGgKjfz/8T5fsJ714VJY9Kc7i+gapbuW Af4Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=iAUCq6AQpwiSNlqp6e2886GYsbEpCz4tvlCIXkUtiq4=; b=ekdKIfRks4Vr1D6/tWwvCVdu1DleVffpeFJmm/2mKpjZopjZkZQ6bvdo45t6YL1CdS fPwQt+1uSb7UaeWGrqNJvuJzF6mQjf0ixc2xJn4P2brgya5+QDgfCz+HTb8ich5ew+aN 98LUgfaFOXzrNy5IkbEtQXBMX4Z0hic9Fq9uh9j0gULR7yRNJkH9O71YE5onQH9Sti7H boQiUsm86jlHy/qiuZd5bx0UR9VsLYzaqamCXwPq64kyexjYqIyZn3KGMHW2M5f8LNaW cR4o+NRsdBMjzHVAItruecfp7o4jVvpREHeCmMesRpJ0U0qLW6WVGzhOPWwomITfPce1 MMIA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KZBoUKxK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id p8-20020aa79e88000000b00666c9148d03si1875935pfq.6.2023.06.20.08.09.02; Tue, 20 Jun 2023 08:09:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KZBoUKxK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233223AbjFTO6P (ORCPT + 99 others); Tue, 20 Jun 2023 10:58:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60190 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233183AbjFTO6H (ORCPT ); Tue, 20 Jun 2023 10:58:07 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E81BA1BC8 for ; Tue, 20 Jun 2023 07:56:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273014; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iAUCq6AQpwiSNlqp6e2886GYsbEpCz4tvlCIXkUtiq4=; b=KZBoUKxKJzU4gHSNZ0KEHwqAXgbx2YJVHJaNTalFV0M7YuReCRrTtEnRo59CKO56w90BDH B5O6V3n0Er4TJ3IQ/shFGuBqes/sGd0GyZUoIL7UADCvlAyDhnmXZzr+CIscgScKP1/7Xe SN2XpqWOux1+7U2+jJdjD30WpnpnHT8= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-269-aatl8iayPtWLIAC2EOAbnw-1; Tue, 20 Jun 2023 10:56:49 -0400 X-MC-Unique: aatl8iayPtWLIAC2EOAbnw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C3B3A8870EA; Tue, 20 Jun 2023 14:54:30 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id B1AD5200A398; Tue, 20 Jun 2023 14:54:28 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Karsten Graul , Wenjia Zhang , Jan Karcher , "D. Wythe" , Tony Lu , Wen Gu , linux-s390@vger.kernel.org Subject: [PATCH net-next v3 12/18] smc: Drop smc_sendpage() in favour of smc_sendmsg() + MSG_SPLICE_PAGES Date: Tue, 20 Jun 2023 15:53:31 +0100 Message-ID: <20230620145338.1300897-13-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769234768011286495?= X-GMAIL-MSGID: =?utf-8?q?1769234768011286495?= Drop the smc_sendpage() code as smc_sendmsg() just passes the call down to the underlying TCP socket and smc_tx_sendpage() is just a wrapper around its sendmsg implementation. Signed-off-by: David Howells cc: Karsten Graul cc: Wenjia Zhang cc: Jan Karcher cc: "D. Wythe" cc: Tony Lu cc: Wen Gu cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-s390@vger.kernel.org cc: netdev@vger.kernel.org --- net/smc/af_smc.c | 29 ----------------------------- net/smc/smc_stats.c | 2 +- net/smc/smc_stats.h | 1 - net/smc/smc_tx.c | 22 +--------------------- net/smc/smc_tx.h | 2 -- 5 files changed, 2 insertions(+), 54 deletions(-) diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 538e9c6ec8c9..a7f887d91d89 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -3133,34 +3133,6 @@ static int smc_ioctl(struct socket *sock, unsigned int cmd, return put_user(answ, (int __user *)arg); } -static ssize_t smc_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - struct sock *sk = sock->sk; - struct smc_sock *smc; - int rc = -EPIPE; - - smc = smc_sk(sk); - lock_sock(sk); - if (sk->sk_state != SMC_ACTIVE) { - release_sock(sk); - goto out; - } - release_sock(sk); - if (smc->use_fallback) { - rc = kernel_sendpage(smc->clcsock, page, offset, - size, flags); - } else { - lock_sock(sk); - rc = smc_tx_sendpage(smc, page, offset, size, flags); - release_sock(sk); - SMC_STAT_INC(smc, sendpage_cnt); - } - -out: - return rc; -} - /* Map the affected portions of the rmbe into an spd, note the number of bytes * to splice in conn->splice_pending, and press 'go'. Delays consumer cursor * updates till whenever a respective page has been fully processed. @@ -3232,7 +3204,6 @@ static const struct proto_ops smc_sock_ops = { .sendmsg = smc_sendmsg, .recvmsg = smc_recvmsg, .mmap = sock_no_mmap, - .sendpage = smc_sendpage, .splice_read = smc_splice_read, }; diff --git a/net/smc/smc_stats.c b/net/smc/smc_stats.c index e80e34f7ac15..ca14c0f3a07d 100644 --- a/net/smc/smc_stats.c +++ b/net/smc/smc_stats.c @@ -227,7 +227,7 @@ static int smc_nl_fill_stats_tech_data(struct sk_buff *skb, SMC_NLA_STATS_PAD)) goto errattr; if (nla_put_u64_64bit(skb, SMC_NLA_STATS_T_SENDPAGE_CNT, - smc_tech->sendpage_cnt, + 0, SMC_NLA_STATS_PAD)) goto errattr; if (nla_put_u64_64bit(skb, SMC_NLA_STATS_T_CORK_CNT, diff --git a/net/smc/smc_stats.h b/net/smc/smc_stats.h index 84b7ecd8c05c..b60fe1eb37ab 100644 --- a/net/smc/smc_stats.h +++ b/net/smc/smc_stats.h @@ -71,7 +71,6 @@ struct smc_stats_tech { u64 clnt_v2_succ_cnt; u64 srv_v1_succ_cnt; u64 srv_v2_succ_cnt; - u64 sendpage_cnt; u64 urg_data_cnt; u64 splice_cnt; u64 cork_cnt; diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c index 9b9e0a190734..5147207808e5 100644 --- a/net/smc/smc_tx.c +++ b/net/smc/smc_tx.c @@ -167,8 +167,7 @@ static bool smc_tx_should_cork(struct smc_sock *smc, struct msghdr *msg) * sndbuf_space is still available. The applications * should known how/when to uncork it. */ - if ((msg->msg_flags & MSG_MORE || - smc_tx_is_corked(smc)) && + if ((msg->msg_flags & MSG_MORE || smc_tx_is_corked(smc)) && atomic_read(&conn->sndbuf_space)) return true; @@ -297,25 +296,6 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len) return rc; } -int smc_tx_sendpage(struct smc_sock *smc, struct page *page, int offset, - size_t size, int flags) -{ - struct msghdr msg = {.msg_flags = flags}; - char *kaddr = kmap(page); - struct kvec iov; - int rc; - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - iov.iov_base = kaddr + offset; - iov.iov_len = size; - iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, &iov, 1, size); - rc = smc_tx_sendmsg(smc, &msg, size); - kunmap(page); - return rc; -} - /***************************** sndbuf consumer *******************************/ /* sndbuf consumer: actual data transfer of one target chunk with ISM write */ diff --git a/net/smc/smc_tx.h b/net/smc/smc_tx.h index 34b578498b1f..a59f370b8b43 100644 --- a/net/smc/smc_tx.h +++ b/net/smc/smc_tx.h @@ -31,8 +31,6 @@ void smc_tx_pending(struct smc_connection *conn); void smc_tx_work(struct work_struct *work); void smc_tx_init(struct smc_sock *smc); int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len); -int smc_tx_sendpage(struct smc_sock *smc, struct page *page, int offset, - size_t size, int flags); int smc_tx_sndbuf_nonempty(struct smc_connection *conn); void smc_tx_sndbuf_nonfull(struct smc_sock *smc); void smc_tx_consumer_update(struct smc_connection *conn, bool force); From patchwork Tue Jun 20 14:53:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110561 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3738188vqr; Tue, 20 Jun 2023 08:13:45 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ491YzsBAkMQEF5DlI1AvI9Sr8U2FSsdedK5ipzeNQarNnjgmxcOwphslucCqPQsefzuEHR X-Received: by 2002:a05:6808:3af:b0:39a:6c25:d60 with SMTP id n15-20020a05680803af00b0039a6c250d60mr3855003oie.29.1687274024788; Tue, 20 Jun 2023 08:13:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687274024; cv=none; d=google.com; s=arc-20160816; b=vjOKoPg+QWXZWl8NBA5s9tbEYyjdjxZm+JW8dHDNNo030XY76z5+5l1kyXh4aRiaRu nJjvTGaRQx6b8P0ZJbToHBiXJXcqO8eLJYmwQFseXhZt4s66wzg5Socb4zU3N6mXtBfH MgmjemYPaDNyqaQHPJZDkIkEeJnn4Ye6qvv6fC1CvvVIt5Ue4wXdP3bJJAzX9ihiEQAT Y17ptIGvVzqavSo0zpuZTD7Q9zv8/MwRqHlkExOflcJlCmFajoKnIG5fpkWAp799hf0v SriUL+pUOtFDQL8KhIEtHH2Re13UCj1OD7m2crthvYH2CKFUdsrsiEYQdXP+A/Vofi2s bAtw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=LPVmbLfi9bgga9t/1pR/YmjTCDUThBHs3vchY9icGtA=; b=b6OSEhzk5Vtq3HwzdT73TbBP6zF6L4666aWGpUjQ0eLDLc478n4cGqZ8hgLgjteEe3 AZEoE7synY0DuV0ztai1YpmzQzAvJ59x3G5x3oO+ZQQd/Pb1CG3kSy6a+gPsLwSz7wKX Hiw4jJ0ekl4POJkH8PRu1zQQ5/dXbbjFKzYeqUvIncUtaIrXsj4Ob7zQaEnWiZKKbC+Q nK+4PGtJhtL7j7chbDho8ahzzEq+fWk8pyV1zdI6BDw1AcA64RI7Z88FpKo/t5MUsPSJ wi7F2bz/EPFdbL0uHDC+nrLaXgkcADpRKpJaJHzNy9iSDGKyO2KPgp49oOigUI1nf/1x 8ezw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=MsdUC5o5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id w67-20020a636246000000b00524ecfa05d8si1977465pgb.15.2023.06.20.08.13.30; Tue, 20 Jun 2023 08:13:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=MsdUC5o5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233505AbjFTO6o (ORCPT + 99 others); Tue, 20 Jun 2023 10:58:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233435AbjFTO62 (ORCPT ); Tue, 20 Jun 2023 10:58:28 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3069B19B3 for ; Tue, 20 Jun 2023 07:57:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273040; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LPVmbLfi9bgga9t/1pR/YmjTCDUThBHs3vchY9icGtA=; b=MsdUC5o52FZFpwxiI2KIo1CiMTCfAKAnNSIDh7v1SThbkWSoCQvCF58J2J4KiXXGAKNycq x3m0sK3ML+lzIdQ24sCEzvOOHdCTGGUoSRBVnGlafs7G4Nc//QxzDJnfHaV7x/sxRd5Ng/ VhIZilSKjdq6cMJZ3YjmnxVlxUkABPo= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-572-azJhDjRHOim524o5jKEhlw-1; Tue, 20 Jun 2023 10:57:14 -0400 X-MC-Unique: azJhDjRHOim524o5jKEhlw-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 543CA3C1A5FB; Tue, 20 Jun 2023 14:54:33 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6F4C51400C34; Tue, 20 Jun 2023 14:54:31 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Mark Fasheh , Joel Becker , Joseph Qi , ocfs2-devel@oss.oracle.com Subject: [PATCH net-next v3 13/18] ocfs2: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() Date: Tue, 20 Jun 2023 15:53:32 +0100 Message-ID: <20230620145338.1300897-14-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769235047468130642?= X-GMAIL-MSGID: =?utf-8?q?1769235047468130642?= Fix ocfs2 to use the page fragment allocator rather than kzalloc in order to allocate the buffers for the handshake message and keepalive request and reply messages. Switch from using sendpage() to using sendmsg() + MSG_SPLICE_PAGES so that sendpage can be phased out. Signed-off-by: David Howells cc: Mark Fasheh cc: Joel Becker cc: Joseph Qi cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: ocfs2-devel@oss.oracle.com cc: netdev@vger.kernel.org --- Notes: ver #2) - Wrap lines at 80. fs/ocfs2/cluster/tcp.c | 109 +++++++++++++++++++++++------------------ 1 file changed, 60 insertions(+), 49 deletions(-) diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c index aecbd712a00c..969d347ed9fa 100644 --- a/fs/ocfs2/cluster/tcp.c +++ b/fs/ocfs2/cluster/tcp.c @@ -110,9 +110,6 @@ static struct work_struct o2net_listen_work; static struct o2hb_callback_func o2net_hb_up, o2net_hb_down; #define O2NET_HB_PRI 0x1 -static struct o2net_handshake *o2net_hand; -static struct o2net_msg *o2net_keep_req, *o2net_keep_resp; - static int o2net_sys_err_translations[O2NET_ERR_MAX] = {[O2NET_ERR_NONE] = 0, [O2NET_ERR_NO_HNDLR] = -ENOPROTOOPT, @@ -930,19 +927,22 @@ static int o2net_send_tcp_msg(struct socket *sock, struct kvec *vec, } static void o2net_sendpage(struct o2net_sock_container *sc, - void *kmalloced_virt, - size_t size) + void *virt, size_t size) { struct o2net_node *nn = o2net_nn_from_num(sc->sc_node->nd_num); + struct msghdr msg = {}; + struct bio_vec bv; ssize_t ret; + bvec_set_virt(&bv, virt, size); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, size); + while (1) { + msg.msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES; mutex_lock(&sc->sc_send_lock); - ret = sc->sc_sock->ops->sendpage(sc->sc_sock, - virt_to_page(kmalloced_virt), - offset_in_page(kmalloced_virt), - size, MSG_DONTWAIT); + ret = sock_sendmsg(sc->sc_sock, &msg); mutex_unlock(&sc->sc_send_lock); + if (ret == size) break; if (ret == (ssize_t)-EAGAIN) { @@ -1168,6 +1168,7 @@ static int o2net_process_message(struct o2net_sock_container *sc, struct o2net_msg *hdr) { struct o2net_node *nn = o2net_nn_from_num(sc->sc_node->nd_num); + struct o2net_msg *keep_resp; int ret = 0, handler_status; enum o2net_system_error syserr; struct o2net_msg_handler *nmh = NULL; @@ -1186,8 +1187,17 @@ static int o2net_process_message(struct o2net_sock_container *sc, be32_to_cpu(hdr->status)); goto out; case O2NET_MSG_KEEP_REQ_MAGIC: - o2net_sendpage(sc, o2net_keep_resp, - sizeof(*o2net_keep_resp)); + keep_resp = alloc_skb_frag(sizeof(*keep_resp), + GFP_KERNEL); + if (!keep_resp) { + ret = -ENOMEM; + goto out; + } + memset(keep_resp, 0, sizeof(*keep_resp)); + keep_resp->magic = + cpu_to_be16(O2NET_MSG_KEEP_RESP_MAGIC); + o2net_sendpage(sc, keep_resp, sizeof(*keep_resp)); + folio_put(virt_to_folio(keep_resp)); goto out; case O2NET_MSG_KEEP_RESP_MAGIC: goto out; @@ -1439,15 +1449,23 @@ static void o2net_rx_until_empty(struct work_struct *work) sc_put(sc); } -static void o2net_initialize_handshake(void) +static struct o2net_handshake *o2net_initialize_handshake(void) { - o2net_hand->o2hb_heartbeat_timeout_ms = cpu_to_be32( - O2HB_MAX_WRITE_TIMEOUT_MS); - o2net_hand->o2net_idle_timeout_ms = cpu_to_be32(o2net_idle_timeout()); - o2net_hand->o2net_keepalive_delay_ms = cpu_to_be32( - o2net_keepalive_delay()); - o2net_hand->o2net_reconnect_delay_ms = cpu_to_be32( - o2net_reconnect_delay()); + struct o2net_handshake *hand; + + hand = alloc_skb_frag(sizeof(*hand), GFP_KERNEL); + if (!hand) + return NULL; + + memset(hand, 0, sizeof(*hand)); + hand->protocol_version = cpu_to_be64(O2NET_PROTOCOL_VERSION); + hand->connector_id = cpu_to_be64(1); + hand->o2hb_heartbeat_timeout_ms = + cpu_to_be32(O2HB_MAX_WRITE_TIMEOUT_MS); + hand->o2net_idle_timeout_ms = cpu_to_be32(o2net_idle_timeout()); + hand->o2net_keepalive_delay_ms = cpu_to_be32(o2net_keepalive_delay()); + hand->o2net_reconnect_delay_ms = cpu_to_be32(o2net_reconnect_delay()); + return hand; } /* ------------------------------------------------------------ */ @@ -1456,16 +1474,22 @@ static void o2net_initialize_handshake(void) * rx path will see the response and mark the sc valid */ static void o2net_sc_connect_completed(struct work_struct *work) { + struct o2net_handshake *hand; struct o2net_sock_container *sc = container_of(work, struct o2net_sock_container, sc_connect_work); + hand = o2net_initialize_handshake(); + if (!hand) + goto out; + mlog(ML_MSG, "sc sending handshake with ver %llu id %llx\n", (unsigned long long)O2NET_PROTOCOL_VERSION, - (unsigned long long)be64_to_cpu(o2net_hand->connector_id)); + (unsigned long long)be64_to_cpu(hand->connector_id)); - o2net_initialize_handshake(); - o2net_sendpage(sc, o2net_hand, sizeof(*o2net_hand)); + o2net_sendpage(sc, hand, sizeof(*hand)); + folio_put(virt_to_folio(hand)); +out: sc_put(sc); } @@ -1475,8 +1499,15 @@ static void o2net_sc_send_keep_req(struct work_struct *work) struct o2net_sock_container *sc = container_of(work, struct o2net_sock_container, sc_keepalive_work.work); + struct o2net_msg *keep_req; - o2net_sendpage(sc, o2net_keep_req, sizeof(*o2net_keep_req)); + keep_req = alloc_skb_frag(sizeof(*keep_req), GFP_KERNEL); + if (keep_req) { + memset(keep_req, 0, sizeof(*keep_req)); + keep_req->magic = cpu_to_be16(O2NET_MSG_KEEP_REQ_MAGIC); + o2net_sendpage(sc, keep_req, sizeof(*keep_req)); + folio_put(virt_to_folio(keep_req)); + } sc_put(sc); } @@ -1780,6 +1811,7 @@ static int o2net_accept_one(struct socket *sock, int *more) struct socket *new_sock = NULL; struct o2nm_node *node = NULL; struct o2nm_node *local_node = NULL; + struct o2net_handshake *hand; struct o2net_sock_container *sc = NULL; struct o2net_node *nn; unsigned int nofs_flag; @@ -1882,8 +1914,11 @@ static int o2net_accept_one(struct socket *sock, int *more) o2net_register_callbacks(sc->sc_sock->sk, sc); o2net_sc_queue_work(sc, &sc->sc_rx_work); - o2net_initialize_handshake(); - o2net_sendpage(sc, o2net_hand, sizeof(*o2net_hand)); + hand = o2net_initialize_handshake(); + if (hand) { + o2net_sendpage(sc, hand, sizeof(*hand)); + folio_put(virt_to_folio(hand)); + } out: if (new_sock) @@ -2090,21 +2125,8 @@ int o2net_init(void) unsigned long i; o2quo_init(); - o2net_debugfs_init(); - o2net_hand = kzalloc(sizeof(struct o2net_handshake), GFP_KERNEL); - o2net_keep_req = kzalloc(sizeof(struct o2net_msg), GFP_KERNEL); - o2net_keep_resp = kzalloc(sizeof(struct o2net_msg), GFP_KERNEL); - if (!o2net_hand || !o2net_keep_req || !o2net_keep_resp) - goto out; - - o2net_hand->protocol_version = cpu_to_be64(O2NET_PROTOCOL_VERSION); - o2net_hand->connector_id = cpu_to_be64(1); - - o2net_keep_req->magic = cpu_to_be16(O2NET_MSG_KEEP_REQ_MAGIC); - o2net_keep_resp->magic = cpu_to_be16(O2NET_MSG_KEEP_RESP_MAGIC); - for (i = 0; i < ARRAY_SIZE(o2net_nodes); i++) { struct o2net_node *nn = o2net_nn_from_num(i); @@ -2122,21 +2144,10 @@ int o2net_init(void) } return 0; - -out: - kfree(o2net_hand); - kfree(o2net_keep_req); - kfree(o2net_keep_resp); - o2net_debugfs_exit(); - o2quo_exit(); - return -ENOMEM; } void o2net_exit(void) { o2quo_exit(); - kfree(o2net_hand); - kfree(o2net_keep_req); - kfree(o2net_keep_resp); o2net_debugfs_exit(); } From patchwork Tue Jun 20 14:53:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110567 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3740240vqr; Tue, 20 Jun 2023 08:16:22 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7tiWGQonEXkG0lfP50yW2WeExfY7KDqt8xSfj8lVOWJYfc63qJX+ETDorCsCOIuJ9E4x37 X-Received: by 2002:a05:6a20:3d26:b0:10f:52e2:49ec with SMTP id y38-20020a056a203d2600b0010f52e249ecmr9801148pzi.53.1687274182529; Tue, 20 Jun 2023 08:16:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687274182; cv=none; d=google.com; s=arc-20160816; b=rcLTBvn5RSSvjLTEZC/zP9p5H+/1Df8nhyneLEIDsPj688DuzRcwXZtKWFflWcMqRW eLsSvnES2+6hjDW1u0OiWrNf6q//tA0wo/UK8ya/lc8ggeprNkf6oAAtkn3+LmoQC+f0 9Iw4grPBwRQCfxE1kCi+m4qOnnJgimU/3UFsaXdMK1KptRByU6dEu6FElxOmUTiRfnGo GZtNEaQs1TtfFOP3Xxff4vralJvvqyaat4WVeugbBJaDmx2+iMXJ2ON60D5pAh+V+Uly quT0SDZXg5KFRqZ/nHibQKFWyEZsxILROPIP6TC40Xi1z/Yhf5RElT21Q2M6VKSJqBfm PPAA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=2HlaXseyq5U5o1i8LIFlWuZ5rATyuHBFais1x2nt408=; b=l0/uHhisx920qlvIYrytRRklITBecWRJCkayoD5i/dD5hFMxwyAiB713H4Bwi3OXXR G2rVFDFJuX5lHWinPPXstTt8425zdo3NaUF6zRmMquJfxmkwI84cr3rWJq0Ug4XAzgrN zXKXV5bmeq9zleYSC0XTlInnwu0HRdEWiKvHqM4kgZPXCU8jaCYkHio2KyAKuZJdiQWi QaKXnAnlSwnqEL7SxhRDGpw9vy1PIMCVba30g68Hcset4//vMlhP370CPbLAzWLTpHug hN5Us8yS1+ohq/7SZkyDKk9Fv3eA7NanTGwVVmQldbTa1G4siysJWu9cg6VkLQWJt4m8 MogQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=h7GrBAeA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h14-20020aa79f4e000000b00666e5fec17esi1845897pfr.87.2023.06.20.08.16.09; Tue, 20 Jun 2023 08:16:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=h7GrBAeA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233466AbjFTO7I (ORCPT + 99 others); Tue, 20 Jun 2023 10:59:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32796 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233206AbjFTO6v (ORCPT ); Tue, 20 Jun 2023 10:58:51 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A84AA1BEF for ; Tue, 20 Jun 2023 07:57:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273057; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2HlaXseyq5U5o1i8LIFlWuZ5rATyuHBFais1x2nt408=; b=h7GrBAeAcdQwyfb7vFB7U11NlmIKokVTCaoRae8F2AjAvfPeHZ8rL31QcTgGCEDKlLfvX6 vZPrySDL1068WnOf+e4LUxXZ6BxTbHOk9ZAo3WDe0JsPU1WzHGDvTnBczsCbvUlB6/L/9C rfxoU9a8ijODj6d1bvruO6f2GG2JQvk= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-448-QZ1cyihYMfqFmIoZXCbVIQ-1; Tue, 20 Jun 2023 10:57:28 -0400 X-MC-Unique: QZ1cyihYMfqFmIoZXCbVIQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 88D0B29ABA3F; Tue, 20 Jun 2023 14:54:36 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id E58D02166B26; Tue, 20 Jun 2023 14:54:33 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Philipp Reisner , Lars Ellenberg , =?utf-8?q?Christoph_B=C3=B6hmwa?= =?utf-8?q?lder?= , drbd-dev@lists.linbit.com, linux-block@vger.kernel.org Subject: [PATCH net-next v3 14/18] drbd: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() Date: Tue, 20 Jun 2023 15:53:33 +0100 Message-ID: <20230620145338.1300897-15-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769235212887902647?= X-GMAIL-MSGID: =?utf-8?q?1769235212887902647?= Use sendmsg() conditionally with MSG_SPLICE_PAGES in _drbd_send_page() rather than calling sendpage() or _drbd_no_send_page(). Signed-off-by: David Howells cc: Philipp Reisner cc: Lars Ellenberg cc: "Christoph Böhmwalder" cc: Jens Axboe cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: drbd-dev@lists.linbit.com cc: linux-block@vger.kernel.org cc: netdev@vger.kernel.org --- Notes: ver #2) - Wrap lines at 80. drivers/block/drbd/drbd_main.c | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index 83987e7a5ef2..8a01a18a2550 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -1540,7 +1540,8 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa int offset, size_t size, unsigned msg_flags) { struct socket *socket = peer_device->connection->data.socket; - int len = size; + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = msg_flags, }; int err = -EIO; /* e.g. XFS meta- & log-data is in slab pages, which have a @@ -1549,33 +1550,35 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa * put_page(); and would cause either a VM_BUG directly, or * __page_cache_release a page that would actually still be referenced * by someone, leading to some obscure delayed Oops somewhere else. */ - if (drbd_disable_sendpage || !sendpage_ok(page)) - return _drbd_no_send_page(peer_device, page, offset, size, msg_flags); + if (!drbd_disable_sendpage && sendpage_ok(page)) + msg.msg_flags |= MSG_NOSIGNAL | MSG_SPLICE_PAGES; + + bvec_set_page(&bvec, page, offset, size); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - msg_flags |= MSG_NOSIGNAL; drbd_update_congested(peer_device->connection); do { int sent; - sent = socket->ops->sendpage(socket, page, offset, len, msg_flags); + sent = sock_sendmsg(socket, &msg); if (sent <= 0) { if (sent == -EAGAIN) { if (we_should_drop_the_connection(peer_device->connection, socket)) break; continue; } - drbd_warn(peer_device->device, "%s: size=%d len=%d sent=%d\n", - __func__, (int)size, len, sent); + drbd_warn(peer_device->device, "%s: size=%d len=%zu sent=%d\n", + __func__, (int)size, msg_data_left(&msg), + sent); if (sent < 0) err = sent; break; } - len -= sent; - offset += sent; - } while (len > 0 /* THINK && device->cstate >= C_CONNECTED*/); + } while (msg_data_left(&msg) + /* THINK && device->cstate >= C_CONNECTED*/); clear_bit(NET_CONGESTED, &peer_device->connection->flags); - if (len == 0) { + if (!msg_data_left(&msg)) { err = 0; peer_device->device->send_cnt += size >> 9; } From patchwork Tue Jun 20 14:53:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110558 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3736656vqr; Tue, 20 Jun 2023 08:11:45 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6GtadiLhm8pKwicTBJ+1MXkjmG3HB56kQKVF6adLfOStPO3EUp7HAbFRjFiCJrvkIkhGA9 X-Received: by 2002:a17:902:c651:b0:1af:bbfd:1c07 with SMTP id s17-20020a170902c65100b001afbbfd1c07mr12436300pls.57.1687273905054; Tue, 20 Jun 2023 08:11:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687273905; cv=none; d=google.com; s=arc-20160816; b=T8/acGlFeXmJiF7Qyn4pVDFGSuvogQeaRG6ulTHgOe/SB870ZHpJnVYhsu8/LJhU8M cCoQJCqiOlynEYH/7mLClj75XvFQnWBkbONW7XTpt5F/IIUz+cITPS3cYzWTYjGkVnhr ONVR5gJm4GYrJJCpi3ylf1AoV8UQx39VXBvSPstzY6BCCPzMyECPKvmqxdyauh6aTSYW NZdkgOwYvMjpZ4PuLCjpT0DRruwZR0fcEXqTkZPhhYPvToM2Veh9++9nK9GTOKX8M/Kq nq5OybWDfzByfBYUkgsy2ueRRpLtvimYlgnzd9AY28GsuUj4qACmBGu9S2Y6mW2u9cpD Xjlw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=XKse4vF5Frqs4bdWsYMKttBmqWRsH1PFiAHwvjH79qA=; b=W9yLjSD3CszzR7SJ0PPBLWUyYGS78z/rZfpVb+HMbY1CvXM9Hhy5T6xp2D8Pvh3O+A ry8EZJDupDQtGZHBjZ1nvc3dW7bZpEXHAwHIYTczGDHg22iZIM/HAyycjQukeMKUjWpf n5648eu060IQ/ZNx70FC2J/YffyOk8D4tYeTaO+V5LACctY4/LLtU3Azy4GIZY5aw4QC jLxrVSdEzpNz3heFC48UU5isIPXl/9lwFfwaMnkrzUF5SSKtu0BmAF350dF0Zj3lTXTR KRYWWvbTV07PRefeRpx9VA9aSz2YSSLwFsveghPjwcPHJWIKAuDYTQhBujCpXR/yCMUi 6F9g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JPxjPY4P; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id w19-20020a170902d3d300b001b010445600si1976265plb.500.2023.06.20.08.11.31; Tue, 20 Jun 2023 08:11:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JPxjPY4P; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233471AbjFTO6c (ORCPT + 99 others); Tue, 20 Jun 2023 10:58:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60302 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233448AbjFTO6X (ORCPT ); Tue, 20 Jun 2023 10:58:23 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 153EE198B for ; Tue, 20 Jun 2023 07:57:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273034; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XKse4vF5Frqs4bdWsYMKttBmqWRsH1PFiAHwvjH79qA=; b=JPxjPY4PDmL9ZbtXTfjKDZnyAmqdVXlsUcdv6agqeCBsGvE547yLchHb49YHBEJpX1Zbbi Nvjyn+nZ7jD2UZsHKkJvSJCmPa9aljl8GrOjSoGQvLQleorD79bTjauOfGID4g/AXibfci GyXvOOinWQjIU0kBORhebAc8UtfuiSQ= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-620-Q6NvzJ07NxKczalwLXSO3g-1; Tue, 20 Jun 2023 10:57:10 -0400 X-MC-Unique: Q6NvzJ07NxKczalwLXSO3g-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1552E1064B1B; Tue, 20 Jun 2023 14:54:39 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 341B5200A398; Tue, 20 Jun 2023 14:54:37 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Philipp Reisner , Lars Ellenberg , =?utf-8?q?Christoph_B=C3=B6hmwa?= =?utf-8?q?lder?= , drbd-dev@lists.linbit.com, linux-block@vger.kernel.org Subject: [PATCH net-next v3 15/18] drdb: Send an entire bio in a single sendmsg Date: Tue, 20 Jun 2023 15:53:34 +0100 Message-ID: <20230620145338.1300897-16-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769234921840261265?= X-GMAIL-MSGID: =?utf-8?q?1769234921840261265?= Since _drdb_sendpage() is now using sendmsg to send the pages rather sendpage, pass the entire bio in one go using a bvec iterator instead of doing it piecemeal. Signed-off-by: David Howells cc: Philipp Reisner cc: Lars Ellenberg cc: "Christoph Böhmwalder" cc: Jens Axboe cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: drbd-dev@lists.linbit.com cc: linux-block@vger.kernel.org cc: netdev@vger.kernel.org --- Notes: ver #2) - Use "unsigned int" rather than "unsigned". drivers/block/drbd/drbd_main.c | 77 +++++++++++----------------------- 1 file changed, 25 insertions(+), 52 deletions(-) diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index 8a01a18a2550..beba74ae093b 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -1520,28 +1520,15 @@ static void drbd_update_congested(struct drbd_connection *connection) * As a workaround, we disable sendpage on pages * with page_count == 0 or PageSlab. */ -static int _drbd_no_send_page(struct drbd_peer_device *peer_device, struct page *page, - int offset, size_t size, unsigned msg_flags) -{ - struct socket *socket; - void *addr; - int err; - - socket = peer_device->connection->data.socket; - addr = kmap(page) + offset; - err = drbd_send_all(peer_device->connection, socket, addr, size, msg_flags); - kunmap(page); - if (!err) - peer_device->device->send_cnt += size >> 9; - return err; -} - -static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *page, - int offset, size_t size, unsigned msg_flags) +static int _drbd_send_pages(struct drbd_peer_device *peer_device, + struct iov_iter *iter, unsigned int msg_flags) { struct socket *socket = peer_device->connection->data.socket; - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = msg_flags, }; + struct msghdr msg = { + .msg_flags = msg_flags | MSG_NOSIGNAL, + .msg_iter = *iter, + }; + size_t size = iov_iter_count(iter); int err = -EIO; /* e.g. XFS meta- & log-data is in slab pages, which have a @@ -1550,11 +1537,8 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa * put_page(); and would cause either a VM_BUG directly, or * __page_cache_release a page that would actually still be referenced * by someone, leading to some obscure delayed Oops somewhere else. */ - if (!drbd_disable_sendpage && sendpage_ok(page)) - msg.msg_flags |= MSG_NOSIGNAL | MSG_SPLICE_PAGES; - - bvec_set_page(&bvec, page, offset, size); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + if (drbd_disable_sendpage) + msg.msg_flags &= ~(MSG_NOSIGNAL | MSG_SPLICE_PAGES); drbd_update_congested(peer_device->connection); do { @@ -1587,39 +1571,22 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa static int _drbd_send_bio(struct drbd_peer_device *peer_device, struct bio *bio) { - struct bio_vec bvec; - struct bvec_iter iter; + struct iov_iter iter; - /* hint all but last page with MSG_MORE */ - bio_for_each_segment(bvec, bio, iter) { - int err; + iov_iter_bvec(&iter, ITER_SOURCE, bio->bi_io_vec, bio->bi_vcnt, + bio->bi_iter.bi_size); - err = _drbd_no_send_page(peer_device, bvec.bv_page, - bvec.bv_offset, bvec.bv_len, - bio_iter_last(bvec, iter) - ? 0 : MSG_MORE); - if (err) - return err; - } - return 0; + return _drbd_send_pages(peer_device, &iter, 0); } static int _drbd_send_zc_bio(struct drbd_peer_device *peer_device, struct bio *bio) { - struct bio_vec bvec; - struct bvec_iter iter; + struct iov_iter iter; - /* hint all but last page with MSG_MORE */ - bio_for_each_segment(bvec, bio, iter) { - int err; + iov_iter_bvec(&iter, ITER_SOURCE, bio->bi_io_vec, bio->bi_vcnt, + bio->bi_iter.bi_size); - err = _drbd_send_page(peer_device, bvec.bv_page, - bvec.bv_offset, bvec.bv_len, - bio_iter_last(bvec, iter) ? 0 : MSG_MORE); - if (err) - return err; - } - return 0; + return _drbd_send_pages(peer_device, &iter, MSG_SPLICE_PAGES); } static int _drbd_send_zc_ee(struct drbd_peer_device *peer_device, @@ -1631,10 +1598,16 @@ static int _drbd_send_zc_ee(struct drbd_peer_device *peer_device, /* hint all but last page with MSG_MORE */ page_chain_for_each(page) { + struct iov_iter iter; + struct bio_vec bvec; unsigned l = min_t(unsigned, len, PAGE_SIZE); - err = _drbd_send_page(peer_device, page, 0, l, - page_chain_next(page) ? MSG_MORE : 0); + bvec_set_page(&bvec, page, 0, l); + iov_iter_bvec(&iter, ITER_SOURCE, &bvec, 1, l); + + err = _drbd_send_pages(peer_device, &iter, + MSG_SPLICE_PAGES | + (page_chain_next(page) ? MSG_MORE : 0)); if (err) return err; len -= l; From patchwork Tue Jun 20 14:53:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110568 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3740374vqr; Tue, 20 Jun 2023 08:16:32 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6p10X0vDAY7bRJsT8auKRc6JmBYmEZgTBRqze1eD4CTMdwg1uDSSV8X18aYsP2hdBY9EF0 X-Received: by 2002:a05:6808:148c:b0:38e:c2a4:3530 with SMTP id e12-20020a056808148c00b0038ec2a43530mr4903742oiw.9.1687274192422; Tue, 20 Jun 2023 08:16:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687274192; cv=none; d=google.com; s=arc-20160816; b=HdZ0K+0f8C3a7flYhgiGVzWu08g4jUPUYv/iKbCWRqoAADvKRKOZHLN590urwNqg2L BckxHYqpCa2DfDX1iphT2x63QlWKJZY+g16Sqh3RpM/gZSEIBMiiZlqCOhVEPA6QhOO0 Ny0MdYKCL2OCaGhcvIIVwu/vVOvWXWPS9cq5pjrOp3f7rOJdZdIp1t6u3OmzJXS5OH2T mKO+MAPlP/q2jrbG79MRsOWp8hZqQtrNZxndQMIVUzZSgipHm8JYptVihm9p/ARfeyAy CC0yTrMhjjDu4bWW0utHYKH/649JsRYhT+eg7YZx0RPE+EOJCXxcWgj87RePI+Jcn79M F5Vg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=bszGYXAPCfyFrCrKXBgf4DyEzlv1vS1+csO6wWtbX4c=; b=Wy8KzUGTeRFiVAIIrhEiAFPajYkg0ev3DK4KMVJK7cgivqyu1Tir6cLsjDJcxmQnIr CzNAagHkV7EyMBXwFPsN90igSaGwfoNFPNLvM5RoPSRpXl3Ta9XZc6fobmEblcjKfsgc Jf75N5m1NZHHY52u4qJSfP6O8VPf+izu6oJU1HoVnikpoq1W8Pm6qaH3gAl94f2KZxTV zaFzUzyBM8WsWmLM6/pi898XXjiQvWxdFN6Hcsy5y4XhIgaOX+NjcwDK/Bruh9MsJjZu 9ddNXG6k6vdfjn6IUbyLRcgarNZKm6qLSaeuFhLiMiimxPSK94u8c8+g/drKuBgbpsGb KLag== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=d+QZna+v; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id bs66-20020a632845000000b00543a89c95c2si1722484pgb.207.2023.06.20.08.16.18; Tue, 20 Jun 2023 08:16:32 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=d+QZna+v; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233512AbjFTO6s (ORCPT + 99 others); Tue, 20 Jun 2023 10:58:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60604 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233451AbjFTO6b (ORCPT ); Tue, 20 Jun 2023 10:58:31 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3910B19A5 for ; Tue, 20 Jun 2023 07:57:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273036; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bszGYXAPCfyFrCrKXBgf4DyEzlv1vS1+csO6wWtbX4c=; b=d+QZna+v3yqE6N6R6OygzXJA5sGiVyiagRjfLt/wxaCGWUrWODSClsEw8q6fKci7PofGXT SXXVQOnjxro7KT5EUioZzCHcG0JW/8qwLitJL6PzSXAuGRka23VhgdtJgkW2PPWW+iTdZb r9dDh1Lm+Ap9fTsRPoSGjXMRQqtWG68= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-180-5iBq3k2zOLaeVvGBHbh90g-1; Tue, 20 Jun 2023 10:57:13 -0400 X-MC-Unique: 5iBq3k2zOLaeVvGBHbh90g-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 176203C1BFD2; Tue, 20 Jun 2023 14:54:42 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9E84B422B0; Tue, 20 Jun 2023 14:54:39 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lee Duncan , Chris Leech , Mike Christie , Maurizio Lombardi , "James E.J. Bottomley" , "Martin K. Petersen" , Al Viro , open-iscsi@googlegroups.com, linux-scsi@vger.kernel.org, target-devel@vger.kernel.org Subject: [PATCH net-next v3 16/18] iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Date: Tue, 20 Jun 2023 15:53:35 +0100 Message-ID: <20230620145338.1300897-17-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769235223118014231?= X-GMAIL-MSGID: =?utf-8?q?1769235223118014231?= Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage. This allows multiple pages and multipage folios to be passed through. TODO: iscsit_fe_sendpage_sg() should perhaps set up a bio_vec array for the entire set of pages it's going to transfer plus two for the header and trailer and page fragments to hold the header and trailer - and then call sendmsg once for the entire message. Signed-off-by: David Howells cc: Lee Duncan cc: Chris Leech cc: Mike Christie cc: Maurizio Lombardi cc: "James E.J. Bottomley" cc: "Martin K. Petersen" cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: Al Viro cc: open-iscsi@googlegroups.com cc: linux-scsi@vger.kernel.org cc: target-devel@vger.kernel.org cc: netdev@vger.kernel.org --- Notes: ver #2) - Wrap lines at 80. drivers/scsi/iscsi_tcp.c | 26 +++++++++--------------- drivers/scsi/iscsi_tcp.h | 2 +- drivers/target/iscsi/iscsi_target_util.c | 15 ++++++++------ 3 files changed, 20 insertions(+), 23 deletions(-) diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c index 9637d4bc2bc9..9ab8555180a3 100644 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -301,35 +301,32 @@ static int iscsi_sw_tcp_xmit_segment(struct iscsi_tcp_conn *tcp_conn, while (!iscsi_tcp_segment_done(tcp_conn, segment, 0, r)) { struct scatterlist *sg; + struct msghdr msg = {}; + struct bio_vec bv; unsigned int offset, copy; - int flags = 0; r = 0; offset = segment->copied; copy = segment->size - offset; if (segment->total_copied + segment->size < segment->total_size) - flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST; + msg.msg_flags |= MSG_MORE; if (tcp_sw_conn->queue_recv) - flags |= MSG_DONTWAIT; + msg.msg_flags |= MSG_DONTWAIT; - /* Use sendpage if we can; else fall back to sendmsg */ if (!segment->data) { + if (!tcp_conn->iscsi_conn->datadgst_en) + msg.msg_flags |= MSG_SPLICE_PAGES; sg = segment->sg; offset += segment->sg_offset + sg->offset; - r = tcp_sw_conn->sendpage(sk, sg_page(sg), offset, - copy, flags); + bvec_set_page(&bv, sg_page(sg), copy, offset); } else { - struct msghdr msg = { .msg_flags = flags }; - struct kvec iov = { - .iov_base = segment->data + offset, - .iov_len = copy - }; - - r = kernel_sendmsg(sk, &msg, &iov, 1, copy); + bvec_set_virt(&bv, segment->data + offset, copy); } + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, copy); + r = sock_sendmsg(sk, &msg); if (r < 0) { iscsi_tcp_segment_unmap(segment); return r; @@ -746,7 +743,6 @@ iscsi_sw_tcp_conn_bind(struct iscsi_cls_session *cls_session, sock_no_linger(sk); iscsi_sw_tcp_conn_set_callbacks(conn); - tcp_sw_conn->sendpage = tcp_sw_conn->sock->ops->sendpage; /* * set receive state machine into initial state */ @@ -777,8 +773,6 @@ static int iscsi_sw_tcp_conn_set_param(struct iscsi_cls_conn *cls_conn, return -ENOTCONN; } iscsi_set_param(cls_conn, param, buf, buflen); - tcp_sw_conn->sendpage = conn->datadgst_en ? - sock_no_sendpage : tcp_sw_conn->sock->ops->sendpage; mutex_unlock(&tcp_sw_conn->sock_lock); break; case ISCSI_PARAM_MAX_R2T: diff --git a/drivers/scsi/iscsi_tcp.h b/drivers/scsi/iscsi_tcp.h index 68e14a344904..d6ec08d7eb63 100644 --- a/drivers/scsi/iscsi_tcp.h +++ b/drivers/scsi/iscsi_tcp.h @@ -48,7 +48,7 @@ struct iscsi_sw_tcp_conn { uint32_t sendpage_failures_cnt; uint32_t discontiguous_hdr_cnt; - ssize_t (*sendpage)(struct socket *, struct page *, int, size_t, int); + bool can_splice_to_tcp; }; struct iscsi_sw_tcp_host { diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c index b14835fcb033..6231fa4ef5c6 100644 --- a/drivers/target/iscsi/iscsi_target_util.c +++ b/drivers/target/iscsi/iscsi_target_util.c @@ -1129,6 +1129,8 @@ int iscsit_fe_sendpage_sg( struct iscsit_conn *conn) { struct scatterlist *sg = cmd->first_data_sg; + struct bio_vec bvec; + struct msghdr msghdr = { .msg_flags = MSG_SPLICE_PAGES, }; struct kvec iov; u32 tx_hdr_size, data_len; u32 offset = cmd->first_data_sg_off; @@ -1172,17 +1174,18 @@ int iscsit_fe_sendpage_sg( u32 space = (sg->length - offset); u32 sub_len = min_t(u32, data_len, space); send_pg: - tx_sent = conn->sock->ops->sendpage(conn->sock, - sg_page(sg), sg->offset + offset, sub_len, 0); + bvec_set_page(&bvec, sg_page(sg), sub_len, sg->offset + offset); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, sub_len); + + tx_sent = conn->sock->ops->sendmsg(conn->sock, &msghdr, + sub_len); if (tx_sent != sub_len) { if (tx_sent == -EAGAIN) { - pr_err("tcp_sendpage() returned" - " -EAGAIN\n"); + pr_err("sendmsg/splice returned -EAGAIN\n"); goto send_pg; } - pr_err("tcp_sendpage() failure: %d\n", - tx_sent); + pr_err("sendmsg/splice failure: %d\n", tx_sent); return -1; } From patchwork Tue Jun 20 14:53:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110555 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3734638vqr; Tue, 20 Jun 2023 08:09:18 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4eDRi2rJgYelGH1gLLX9crSEiQN4KknJpqyr1GH5qC29WW4mjPJ1kfAycB+/3/Wrz1Ovsc X-Received: by 2002:a92:d1d2:0:b0:331:9a82:33f6 with SMTP id u18-20020a92d1d2000000b003319a8233f6mr11051344ilg.5.1687273758518; Tue, 20 Jun 2023 08:09:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687273758; cv=none; d=google.com; s=arc-20160816; b=UfVzn7wG1RLGpsT2B6jqsBNIXXmkdZ7dP5tS0tS8RNz9YBEvfQXBp9T18eIqH24BJ1 De+NRsxXjaDn8AM9VsWwSxUh+EWZJ5zpWK/uOrZWMHt7GUqOjSU/agIvCc13aFLEIF1u 0vArrztkxYIZfg/r2A5C20CtZBnEsco8vAQHYMCRHH/WGUGnhAlw6g+cDN/AYzb5z/vk MqQvN9wRBl8W4nE2ZgYKkoGbvy0B1HgIa8xlMvJ10Htby2NgPFcD6wFfxFUa9HcAy+hh sH4NAPvsJPf7EC1ac7YLhfWAtHtUA/nVdlcvb4bhMyZYWQMheysQrD8g4LqtzCbukHKy Txjw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=O4BeeShTAxgKz2QzPCzlja5T5FtGvmauABSkyx/PwCY=; b=Hnfx2A6hSypSDsB9f7n4CHZzeHvRV45Aa/GSX5ei5iv0NhEt+8MbMp0dtPPiU6lbb3 Z93gR5l4rCxcUMB3n4JOeGx7xuyjNUyz/qPBP08ULci8GgGG/H4eDDCM7stfVtqeVIDS xu/kVNeB+mD8VqFSWsZAYoIOTVj7yH6hgmWIB5kBBHMVElto+am+5TM+agDZ+qKaAD68 DwK4jv25lGOFPdy26VvUrg0WQRcq6Yd7IPltDq91rFcArnW+TuOGFIT5hAbVjF72gsSf Ru0seN/x+5RBAdo5fA6wkNz1zUQytiUgB51qHlE37pUIgQUGGCrmk2uhrKxt1jDFBVBW i6Pg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="fN/qXnj9"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id o14-20020a63730e000000b0051374678f95si1860743pgc.808.2023.06.20.08.09.02; Tue, 20 Jun 2023 08:09:18 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="fN/qXnj9"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233519AbjFTPAY (ORCPT + 99 others); Tue, 20 Jun 2023 11:00:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33600 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233528AbjFTO7w (ORCPT ); Tue, 20 Jun 2023 10:59:52 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EF8D51B4 for ; Tue, 20 Jun 2023 07:58:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273089; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=O4BeeShTAxgKz2QzPCzlja5T5FtGvmauABSkyx/PwCY=; b=fN/qXnj9wgu2vxn/NdY9SJgst05xJm8mDd56mG5zmaXIV9aCHzegr8dEBOy9k6Jwbz+JYC XHi+j8qmAdAtFHjsknFgLgHkw8/3Kwodrf9L911xxXF1AIOoJc+cVGLUS7JNmTTLrzIgKO 3XQtKErrCjpFawIjpcpy6ZEGJmFpzC0= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-253-Ba6D230jMVOpJAGoxZJQFw-1; Tue, 20 Jun 2023 10:58:02 -0400 X-MC-Unique: Ba6D230jMVOpJAGoxZJQFw-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4AEF9296A62D; Tue, 20 Jun 2023 14:54:46 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id B6E5F492B01; Tue, 20 Jun 2023 14:54:42 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Marc Kleine-Budde , bpf@vger.kernel.org, dccp@vger.kernel.org, linux-afs@lists.infradead.org, linux-arm-msm@vger.kernel.org, linux-can@vger.kernel.org, linux-crypto@vger.kernel.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-hams@vger.kernel.org, linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org, linux-sctp@vger.kernel.org, linux-wpan@vger.kernel.org, linux-x25@vger.kernel.org, mptcp@lists.linux.dev, rds-devel@oss.oracle.com, tipc-discussion@lists.sourceforge.net, virtualization@lists.linux-foundation.org Subject: [PATCH net-next v3 17/18] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) Date: Tue, 20 Jun 2023 15:53:36 +0100 Message-ID: <20230620145338.1300897-18-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769234768741515218?= X-GMAIL-MSGID: =?utf-8?q?1769234768741515218?= Remove ->sendpage() and ->sendpage_locked(). sendmsg() with MSG_SPLICE_PAGES should be used instead. This allows multiple pages and multipage folios to be passed through. Signed-off-by: David Howells Acked-by: Marc Kleine-Budde # for net/can cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: bpf@vger.kernel.org cc: dccp@vger.kernel.org cc: linux-afs@lists.infradead.org cc: linux-arm-msm@vger.kernel.org cc: linux-can@vger.kernel.org cc: linux-crypto@vger.kernel.org cc: linux-doc@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-hams@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-rdma@vger.kernel.org cc: linux-sctp@vger.kernel.org cc: linux-wpan@vger.kernel.org cc: linux-x25@vger.kernel.org cc: mptcp@lists.linux.dev cc: netdev@vger.kernel.org cc: rds-devel@oss.oracle.com cc: tipc-discussion@lists.sourceforge.net cc: virtualization@lists.linux-foundation.org --- Notes: ver #2) - Removed duplicate word in comment. Documentation/bpf/map_sockmap.rst | 10 ++-- Documentation/filesystems/locking.rst | 2 - Documentation/filesystems/vfs.rst | 1 - Documentation/networking/scaling.rst | 4 +- crypto/af_alg.c | 28 ----------- crypto/algif_aead.c | 22 ++------- crypto/algif_rng.c | 2 - crypto/algif_skcipher.c | 14 ------ .../chelsio/inline_crypto/chtls/chtls.h | 2 - .../chelsio/inline_crypto/chtls/chtls_io.c | 14 ------ .../chelsio/inline_crypto/chtls/chtls_main.c | 1 - fs/nfsd/vfs.c | 2 +- include/crypto/if_alg.h | 2 - include/linux/net.h | 8 ---- include/net/inet_common.h | 2 - include/net/sock.h | 6 --- include/net/tcp.h | 4 -- net/appletalk/ddp.c | 1 - net/atm/pvc.c | 1 - net/atm/svc.c | 1 - net/ax25/af_ax25.c | 1 - net/caif/caif_socket.c | 2 - net/can/bcm.c | 1 - net/can/isotp.c | 1 - net/can/j1939/socket.c | 1 - net/can/raw.c | 1 - net/core/sock.c | 35 +------------- net/dccp/ipv4.c | 1 - net/dccp/ipv6.c | 1 - net/ieee802154/socket.c | 2 - net/ipv4/af_inet.c | 21 -------- net/ipv4/tcp.c | 43 ++--------------- net/ipv4/tcp_bpf.c | 23 +-------- net/ipv4/tcp_ipv4.c | 1 - net/ipv4/udp.c | 15 ------ net/ipv4/udp_impl.h | 2 - net/ipv4/udplite.c | 1 - net/ipv6/af_inet6.c | 3 -- net/ipv6/raw.c | 1 - net/ipv6/tcp_ipv6.c | 1 - net/kcm/kcmsock.c | 20 -------- net/key/af_key.c | 1 - net/l2tp/l2tp_ip.c | 1 - net/l2tp/l2tp_ip6.c | 1 - net/llc/af_llc.c | 1 - net/mctp/af_mctp.c | 1 - net/mptcp/protocol.c | 2 - net/netlink/af_netlink.c | 1 - net/netrom/af_netrom.c | 1 - net/packet/af_packet.c | 2 - net/phonet/socket.c | 2 - net/qrtr/af_qrtr.c | 1 - net/rds/af_rds.c | 1 - net/rose/af_rose.c | 1 - net/rxrpc/af_rxrpc.c | 1 - net/sctp/protocol.c | 1 - net/socket.c | 48 ------------------- net/tipc/socket.c | 3 -- net/tls/tls.h | 6 --- net/tls/tls_device.c | 17 ------- net/tls/tls_main.c | 7 --- net/tls/tls_sw.c | 35 -------------- net/unix/af_unix.c | 19 -------- net/vmw_vsock/af_vsock.c | 3 -- net/x25/af_x25.c | 1 - net/xdp/xsk.c | 1 - 66 files changed, 20 insertions(+), 442 deletions(-) diff --git a/Documentation/bpf/map_sockmap.rst b/Documentation/bpf/map_sockmap.rst index cc92047c6630..2d630686a00b 100644 --- a/Documentation/bpf/map_sockmap.rst +++ b/Documentation/bpf/map_sockmap.rst @@ -240,11 +240,11 @@ offsets into ``msg``, respectively. If a program of type ``BPF_PROG_TYPE_SK_MSG`` is run on a ``msg`` it can only parse data that the (``data``, ``data_end``) pointers have already consumed. For ``sendmsg()`` hooks this is likely the first scatterlist element. But for -calls relying on the ``sendpage`` handler (e.g., ``sendfile()``) this will be -the range (**0**, **0**) because the data is shared with user space and by -default the objective is to avoid allowing user space to modify data while (or -after) BPF verdict is being decided. This helper can be used to pull in data -and to set the start and end pointers to given values. Data will be copied if +calls relying on MSG_SPLICE_PAGES (e.g., ``sendfile()``) this will be the +range (**0**, **0**) because the data is shared with user space and by default +the objective is to avoid allowing user space to modify data while (or after) +BPF verdict is being decided. This helper can be used to pull in data and to +set the start and end pointers to given values. Data will be copied if necessary (i.e., if data was not linear and if start and end pointers do not point to the same chunk). diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index aa1a233b0fa8..ed148919e11a 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -521,8 +521,6 @@ prototypes:: int (*fsync) (struct file *, loff_t start, loff_t end, int datasync); int (*fasync) (int, struct file *, int); int (*lock) (struct file *, int, struct file_lock *); - ssize_t (*sendpage) (struct file *, struct page *, int, size_t, - loff_t *, int); unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long); int (*check_flags)(int); diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index 769be5230210..cb2a97e49872 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -1086,7 +1086,6 @@ This describes how the VFS can manipulate an open file. As of kernel int (*fsync) (struct file *, loff_t, loff_t, int datasync); int (*fasync) (int, struct file *, int); int (*lock) (struct file *, int, struct file_lock *); - ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int); unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long); int (*check_flags)(int); int (*flock) (struct file *, int, struct file_lock *); diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst index 3d435caa3ef2..92c9fb46d6a2 100644 --- a/Documentation/networking/scaling.rst +++ b/Documentation/networking/scaling.rst @@ -269,8 +269,8 @@ a single application thread handles flows with many different flow hashes. rps_sock_flow_table is a global flow table that contains the *desired* CPU for flows: the CPU that is currently processing the flow in userspace. Each table value is a CPU index that is updated during calls to recvmsg -and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage() -and tcp_splice_read()). +and sendmsg (specifically, inet_recvmsg(), inet_sendmsg() and +tcp_splice_read()). When the scheduler moves a thread to a new CPU while it has outstanding receive packets on the old CPU, packets may arrive out of order. To diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 7d4b6016b83d..11229f3bcf84 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -482,7 +482,6 @@ static const struct proto_ops alg_proto_ops = { .listen = sock_no_listen, .shutdown = sock_no_shutdown, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .sendmsg = sock_no_sendmsg, .recvmsg = sock_no_recvmsg, @@ -1106,33 +1105,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, } EXPORT_SYMBOL_GPL(af_alg_sendmsg); -/** - * af_alg_sendpage - sendpage system call handler - * @sock: socket of connection to user space to write to - * @page: data to send - * @offset: offset into page to begin sending - * @size: length of data - * @flags: message send/receive flags - * - * This is a generic implementation of sendpage to fill ctx->tsgl_list. - */ -ssize_t af_alg_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = flags | MSG_SPLICE_PAGES, - }; - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return sock_sendmsg(sock, &msg); -} -EXPORT_SYMBOL_GPL(af_alg_sendpage); - /** * af_alg_free_resources - release resources required for crypto request * @areq: Request holding the TX and RX SGL diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c index 35bfa283748d..7d58cbbce4af 100644 --- a/crypto/algif_aead.c +++ b/crypto/algif_aead.c @@ -9,10 +9,10 @@ * The following concept of the memory management is used: * * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is - * filled by user space with the data submitted via sendpage. Filling up - * the TX SGL does not cause a crypto operation -- the data will only be - * tracked by the kernel. Upon receipt of one recvmsg call, the caller must - * provide a buffer which is tracked with the RX SGL. + * filled by user space with the data submitted via sendmsg (maybe with + * MSG_SPLICE_PAGES). Filling up the TX SGL does not cause a crypto operation + * -- the data will only be tracked by the kernel. Upon receipt of one recvmsg + * call, the caller must provide a buffer which is tracked with the RX SGL. * * During the processing of the recvmsg operation, the cipher request is * allocated and prepared. As part of the recvmsg operation, the processed @@ -370,7 +370,6 @@ static struct proto_ops algif_aead_ops = { .release = af_alg_release, .sendmsg = aead_sendmsg, - .sendpage = af_alg_sendpage, .recvmsg = aead_recvmsg, .poll = af_alg_poll, }; @@ -422,18 +421,6 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg, return aead_sendmsg(sock, msg, size); } -static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - int err; - - err = aead_check_key(sock); - if (err) - return err; - - return af_alg_sendpage(sock, page, offset, size, flags); -} - static int aead_recvmsg_nokey(struct socket *sock, struct msghdr *msg, size_t ignored, int flags) { @@ -461,7 +448,6 @@ static struct proto_ops algif_aead_ops_nokey = { .release = af_alg_release, .sendmsg = aead_sendmsg_nokey, - .sendpage = aead_sendpage_nokey, .recvmsg = aead_recvmsg_nokey, .poll = af_alg_poll, }; diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c index 407408c43730..10c41adac3b1 100644 --- a/crypto/algif_rng.c +++ b/crypto/algif_rng.c @@ -174,7 +174,6 @@ static struct proto_ops algif_rng_ops = { .bind = sock_no_bind, .accept = sock_no_accept, .sendmsg = sock_no_sendmsg, - .sendpage = sock_no_sendpage, .release = af_alg_release, .recvmsg = rng_recvmsg, @@ -192,7 +191,6 @@ static struct proto_ops __maybe_unused algif_rng_test_ops = { .mmap = sock_no_mmap, .bind = sock_no_bind, .accept = sock_no_accept, - .sendpage = sock_no_sendpage, .release = af_alg_release, .recvmsg = rng_test_recvmsg, diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c index b1f321b9f846..9ada9b741af8 100644 --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -194,7 +194,6 @@ static struct proto_ops algif_skcipher_ops = { .release = af_alg_release, .sendmsg = skcipher_sendmsg, - .sendpage = af_alg_sendpage, .recvmsg = skcipher_recvmsg, .poll = af_alg_poll, }; @@ -246,18 +245,6 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg, return skcipher_sendmsg(sock, msg, size); } -static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page, - int offset, size_t size, int flags) -{ - int err; - - err = skcipher_check_key(sock); - if (err) - return err; - - return af_alg_sendpage(sock, page, offset, size, flags); -} - static int skcipher_recvmsg_nokey(struct socket *sock, struct msghdr *msg, size_t ignored, int flags) { @@ -285,7 +272,6 @@ static struct proto_ops algif_skcipher_ops_nokey = { .release = af_alg_release, .sendmsg = skcipher_sendmsg_nokey, - .sendpage = skcipher_sendpage_nokey, .recvmsg = skcipher_recvmsg_nokey, .poll = af_alg_poll, }; diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h index da4818d2c856..68562a82d036 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h @@ -569,8 +569,6 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size); int chtls_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len); void chtls_splice_eof(struct socket *sock); -int chtls_sendpage(struct sock *sk, struct page *page, - int offset, size_t size, int flags); int send_tx_flowc_wr(struct sock *sk, int compl, u32 snd_nxt, u32 rcv_nxt); void chtls_tcp_push(struct sock *sk, int flags); diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c index e08ac960c967..5fc64e47568a 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c @@ -1246,20 +1246,6 @@ void chtls_splice_eof(struct socket *sock) release_sock(sk); } -int chtls_sendpage(struct sock *sk, struct page *page, - int offset, size_t size, int flags) -{ - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - struct bio_vec bvec; - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return chtls_sendmsg(sk, &msg, size); -} - static void chtls_select_window(struct sock *sk) { struct chtls_sock *csk = rcu_dereference_sk_user_data(sk); diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c index 6b6787eafd2f..455a54708be4 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c @@ -607,7 +607,6 @@ static void __init chtls_init_ulp_ops(void) chtls_cpl_prot.shutdown = chtls_shutdown; chtls_cpl_prot.sendmsg = chtls_sendmsg; chtls_cpl_prot.splice_eof = chtls_splice_eof; - chtls_cpl_prot.sendpage = chtls_sendpage; chtls_cpl_prot.recvmsg = chtls_recvmsg; chtls_cpl_prot.setsockopt = chtls_setsockopt; chtls_cpl_prot.getsockopt = chtls_getsockopt; diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index db67f8e19344..8879e207ff5a 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -936,7 +936,7 @@ nfsd_open_verified(struct svc_rqst *rqstp, struct svc_fh *fhp, int may_flags, /* * Grab and keep cached pages associated with a file in the svc_rqst - * so that they can be passed to the network sendmsg/sendpage routines + * so that they can be passed to the network sendmsg routines * directly. They will be released after the sending has completed. * * Return values: Number of bytes consumed, or -EIO if there are no diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h index 34224e77f5a2..ef8ce86b1f78 100644 --- a/include/crypto/if_alg.h +++ b/include/crypto/if_alg.h @@ -229,8 +229,6 @@ void af_alg_wmem_wakeup(struct sock *sk); int af_alg_wait_for_data(struct sock *sk, unsigned flags, unsigned min); int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size, unsigned int ivsize); -ssize_t af_alg_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int flags); void af_alg_free_resources(struct af_alg_async_req *areq); void af_alg_async_cb(void *data, int err); __poll_t af_alg_poll(struct file *file, struct socket *sock, diff --git a/include/linux/net.h b/include/linux/net.h index 23324e9a2b3d..41c608c1b02c 100644 --- a/include/linux/net.h +++ b/include/linux/net.h @@ -207,8 +207,6 @@ struct proto_ops { size_t total_len, int flags); int (*mmap) (struct file *file, struct socket *sock, struct vm_area_struct * vma); - ssize_t (*sendpage) (struct socket *sock, struct page *page, - int offset, size_t size, int flags); ssize_t (*splice_read)(struct socket *sock, loff_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags); void (*splice_eof)(struct socket *sock); @@ -222,8 +220,6 @@ struct proto_ops { sk_read_actor_t recv_actor); /* This is different from read_sock(), it reads an entire skb at a time. */ int (*read_skb)(struct sock *sk, skb_read_actor_t recv_actor); - int (*sendpage_locked)(struct sock *sk, struct page *page, - int offset, size_t size, int flags); int (*sendmsg_locked)(struct sock *sk, struct msghdr *msg, size_t size); int (*set_rcvlowat)(struct sock *sk, int val); @@ -341,10 +337,6 @@ int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen, int flags); int kernel_getsockname(struct socket *sock, struct sockaddr *addr); int kernel_getpeername(struct socket *sock, struct sockaddr *addr); -int kernel_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags); -int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags); int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how); /* Routine returns the IP overhead imposed by a (caller-protected) socket. */ diff --git a/include/net/inet_common.h b/include/net/inet_common.h index a75333342c4e..b86b8e21de7f 100644 --- a/include/net/inet_common.h +++ b/include/net/inet_common.h @@ -36,8 +36,6 @@ void __inet_accept(struct socket *sock, struct socket *newsock, int inet_send_prepare(struct sock *sk); int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size); void inet_splice_eof(struct socket *sock); -ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags); int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, int flags); int inet_shutdown(struct socket *sock, int how); diff --git a/include/net/sock.h b/include/net/sock.h index 62a1b99da349..121284f455a8 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1277,8 +1277,6 @@ struct proto { size_t len); int (*recvmsg)(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len); - int (*sendpage)(struct sock *sk, struct page *page, - int offset, size_t size, int flags); void (*splice_eof)(struct socket *sock); int (*bind)(struct sock *sk, struct sockaddr *addr, int addr_len); @@ -1919,10 +1917,6 @@ int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len); int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int); int sock_no_mmap(struct file *file, struct socket *sock, struct vm_area_struct *vma); -ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags); -ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page, - int offset, size_t size, int flags); /* * Functions to fill in entries in struct proto_ops when a protocol diff --git a/include/net/tcp.h b/include/net/tcp.h index 9c08eab647a2..95e4507febed 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -328,10 +328,6 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size); int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied, size_t size, struct ubuf_info *uarg); void tcp_splice_eof(struct socket *sock); -int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, - int flags); -int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags); int tcp_send_mss(struct sock *sk, int *size_goal, int flags); int tcp_wmem_schedule(struct sock *sk, int copy); void tcp_push(struct sock *sk, int flags, int mss_now, int nonagle, diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c index a06f4d4a6f47..8978fb6212ff 100644 --- a/net/appletalk/ddp.c +++ b/net/appletalk/ddp.c @@ -1929,7 +1929,6 @@ static const struct proto_ops atalk_dgram_ops = { .sendmsg = atalk_sendmsg, .recvmsg = atalk_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct notifier_block ddp_notifier = { diff --git a/net/atm/pvc.c b/net/atm/pvc.c index 53e7d3f39e26..66d9a9bd5896 100644 --- a/net/atm/pvc.c +++ b/net/atm/pvc.c @@ -126,7 +126,6 @@ static const struct proto_ops pvc_proto_ops = { .sendmsg = vcc_sendmsg, .recvmsg = vcc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; diff --git a/net/atm/svc.c b/net/atm/svc.c index d83556d8beb9..36a814f1fbd1 100644 --- a/net/atm/svc.c +++ b/net/atm/svc.c @@ -654,7 +654,6 @@ static const struct proto_ops svc_proto_ops = { .sendmsg = vcc_sendmsg, .recvmsg = vcc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index d8da400cb4de..5db805d5f74d 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -2022,7 +2022,6 @@ static const struct proto_ops ax25_proto_ops = { .sendmsg = ax25_sendmsg, .recvmsg = ax25_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c index 4eebcc66c19a..9c82698da4f5 100644 --- a/net/caif/caif_socket.c +++ b/net/caif/caif_socket.c @@ -976,7 +976,6 @@ static const struct proto_ops caif_seqpacket_ops = { .sendmsg = caif_seqpkt_sendmsg, .recvmsg = caif_seqpkt_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const struct proto_ops caif_stream_ops = { @@ -996,7 +995,6 @@ static const struct proto_ops caif_stream_ops = { .sendmsg = caif_stream_sendmsg, .recvmsg = caif_stream_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* This function is called when a socket is finally destroyed. */ diff --git a/net/can/bcm.c b/net/can/bcm.c index a962ec2b8ba5..9ba35685b043 100644 --- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -1703,7 +1703,6 @@ static const struct proto_ops bcm_ops = { .sendmsg = bcm_sendmsg, .recvmsg = bcm_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto bcm_proto __read_mostly = { diff --git a/net/can/isotp.c b/net/can/isotp.c index 84f9aba02901..1f25b45868cf 100644 --- a/net/can/isotp.c +++ b/net/can/isotp.c @@ -1699,7 +1699,6 @@ static const struct proto_ops isotp_ops = { .sendmsg = isotp_sendmsg, .recvmsg = isotp_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto isotp_proto __read_mostly = { diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c index 35970c25496a..feaec4ad6d16 100644 --- a/net/can/j1939/socket.c +++ b/net/can/j1939/socket.c @@ -1306,7 +1306,6 @@ static const struct proto_ops j1939_ops = { .sendmsg = j1939_sk_sendmsg, .recvmsg = j1939_sk_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto j1939_proto __read_mostly = { diff --git a/net/can/raw.c b/net/can/raw.c index f64469b98260..15c79b079184 100644 --- a/net/can/raw.c +++ b/net/can/raw.c @@ -962,7 +962,6 @@ static const struct proto_ops raw_ops = { .sendmsg = raw_sendmsg, .recvmsg = raw_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto raw_proto __read_mostly = { diff --git a/net/core/sock.c b/net/core/sock.c index cff3e82514d1..5c4948fe8ab1 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -3267,36 +3267,6 @@ void __receive_sock(struct file *file) } } -ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) -{ - ssize_t res; - struct msghdr msg = {.msg_flags = flags}; - struct kvec iov; - char *kaddr = kmap(page); - iov.iov_base = kaddr + offset; - iov.iov_len = size; - res = kernel_sendmsg(sock, &msg, &iov, 1, size); - kunmap(page); - return res; -} -EXPORT_SYMBOL(sock_no_sendpage); - -ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page, - int offset, size_t size, int flags) -{ - ssize_t res; - struct msghdr msg = {.msg_flags = flags}; - struct kvec iov; - char *kaddr = kmap(page); - - iov.iov_base = kaddr + offset; - iov.iov_len = size; - res = kernel_sendmsg_locked(sk, &msg, &iov, 1, size); - kunmap(page); - return res; -} -EXPORT_SYMBOL(sock_no_sendpage_locked); - /* * Default Socket Callbacks */ @@ -4052,7 +4022,7 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto) { seq_printf(seq, "%-9s %4u %6d %6ld %-3s %6u %-3s %-10s " - "%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n", + "%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n", proto->name, proto->obj_size, sock_prot_inuse_get(seq_file_net(seq), proto), @@ -4073,7 +4043,6 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto) proto_method_implemented(proto->getsockopt), proto_method_implemented(proto->sendmsg), proto_method_implemented(proto->recvmsg), - proto_method_implemented(proto->sendpage), proto_method_implemented(proto->bind), proto_method_implemented(proto->backlog_rcv), proto_method_implemented(proto->hash), @@ -4094,7 +4063,7 @@ static int proto_seq_show(struct seq_file *seq, void *v) "maxhdr", "slab", "module", - "cl co di ac io in de sh ss gs se re sp bi br ha uh gp em\n"); + "cl co di ac io in de sh ss gs se re bi br ha uh gp em\n"); else proto_seq_printf(seq, list_entry(v, struct proto, node)); return 0; diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index 3ab68415d121..fa8079303cb0 100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -1010,7 +1010,6 @@ static const struct proto_ops inet_dccp_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct inet_protosw dccp_v4_protosw = { diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index 93c98990d726..7249ef218178 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -1087,7 +1087,6 @@ static const struct proto_ops inet6_dccp_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c index 9c124705120d..00302e8b9615 100644 --- a/net/ieee802154/socket.c +++ b/net/ieee802154/socket.c @@ -426,7 +426,6 @@ static const struct proto_ops ieee802154_raw_ops = { .sendmsg = ieee802154_sock_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* DGRAM Sockets (802.15.4 dataframes) */ @@ -989,7 +988,6 @@ static const struct proto_ops ieee802154_dgram_ops = { .sendmsg = ieee802154_sock_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static void ieee802154_sock_destruct(struct sock *sk) diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 38e649fb4474..9b2ca2fcc5a1 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -847,23 +847,6 @@ void inet_splice_eof(struct socket *sock) } EXPORT_SYMBOL_GPL(inet_splice_eof); -ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags) -{ - struct sock *sk = sock->sk; - const struct proto *prot; - - if (unlikely(inet_send_prepare(sk))) - return -EAGAIN; - - /* IPV6_ADDRFORM can change sk->sk_prot under us. */ - prot = READ_ONCE(sk->sk_prot); - if (prot->sendpage) - return prot->sendpage(sk, page, offset, size, flags); - return sock_no_sendpage(sock, page, offset, size, flags); -} -EXPORT_SYMBOL(inet_sendpage); - INDIRECT_CALLABLE_DECLARE(int udp_recvmsg(struct sock *, struct msghdr *, size_t, int, int *)); int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, @@ -1067,12 +1050,10 @@ const struct proto_ops inet_stream_ops = { .mmap = tcp_mmap, #endif .splice_eof = inet_splice_eof, - .sendpage = inet_sendpage, .splice_read = tcp_splice_read, .read_sock = tcp_read_sock, .read_skb = tcp_read_skb, .sendmsg_locked = tcp_sendmsg_locked, - .sendpage_locked = tcp_sendpage_locked, .peek_len = tcp_peek_len, #ifdef CONFIG_COMPAT .compat_ioctl = inet_compat_ioctl, @@ -1102,7 +1083,6 @@ const struct proto_ops inet_dgram_ops = { .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, .splice_eof = inet_splice_eof, - .sendpage = inet_sendpage, .set_peek_off = sk_set_peek_off, #ifdef CONFIG_COMPAT .compat_ioctl = inet_compat_ioctl, @@ -1134,7 +1114,6 @@ static const struct proto_ops inet_sockraw_ops = { .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, .splice_eof = inet_splice_eof, - .sendpage = inet_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet_compat_ioctl, #endif diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 0e21ea92dc1d..38f9250999cc 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -923,11 +923,10 @@ int tcp_send_mss(struct sock *sk, int *size_goal, int flags) return mss_now; } -/* In some cases, both sendpage() and sendmsg() could have added - * an skb to the write queue, but failed adding payload on it. - * We need to remove it to consume less memory, but more - * importantly be able to generate EPOLLOUT for Edge Trigger epoll() - * users. +/* In some cases, both sendmsg() could have added an skb to the write queue, + * but failed adding payload on it. We need to remove it to consume less + * memory, but more importantly be able to generate EPOLLOUT for Edge Trigger + * epoll() users. */ void tcp_remove_empty_skb(struct sock *sk) { @@ -975,40 +974,6 @@ int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } -int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - - if (!(sk->sk_route_caps & NETIF_F_SG)) - return sock_no_sendpage_locked(sk, page, offset, size, flags); - - tcp_rate_check_app_limited(sk); /* is sending application-limited? */ - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - return tcp_sendmsg_locked(sk, &msg, size); -} -EXPORT_SYMBOL_GPL(tcp_sendpage_locked); - -int tcp_sendpage(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - int ret; - - lock_sock(sk); - ret = tcp_sendpage_locked(sk, page, offset, size, flags); - release_sock(sk); - - return ret; -} -EXPORT_SYMBOL(tcp_sendpage); - void tcp_free_fastopen_req(struct tcp_sock *tp) { if (tp->fastopen_req) { diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 31d6005cea9b..81f0dff69e0b 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -486,7 +486,7 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) long timeo; int flags; - /* Don't let internal sendpage flags through */ + /* Don't let internal flags through */ flags = (msg->msg_flags & ~MSG_SENDPAGE_DECRYPTED); flags |= MSG_NO_SHARED_FRAGS; @@ -566,23 +566,6 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) return copied ? copied : err; } -static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { - .msg_flags = flags | MSG_SPLICE_PAGES, - }; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - return tcp_bpf_sendmsg(sk, &msg, size); -} - enum { TCP_BPF_IPV4, TCP_BPF_IPV6, @@ -612,7 +595,6 @@ static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS], prot[TCP_BPF_TX] = prot[TCP_BPF_BASE]; prot[TCP_BPF_TX].sendmsg = tcp_bpf_sendmsg; - prot[TCP_BPF_TX].sendpage = tcp_bpf_sendpage; prot[TCP_BPF_RX] = prot[TCP_BPF_BASE]; prot[TCP_BPF_RX].recvmsg = tcp_bpf_recvmsg_parser; @@ -647,8 +629,7 @@ static int tcp_bpf_assert_proto_ops(struct proto *ops) * indeed valid assumptions. */ return ops->recvmsg == tcp_recvmsg && - ops->sendmsg == tcp_sendmsg && - ops->sendpage == tcp_sendpage ? 0 : -ENOTSUPP; + ops->sendmsg == tcp_sendmsg ? 0 : -ENOTSUPP; } int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 84a5d557dc1a..a228cdb23831 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -3117,7 +3117,6 @@ struct proto tcp_prot = { .recvmsg = tcp_recvmsg, .sendmsg = tcp_sendmsg, .splice_eof = tcp_splice_eof, - .sendpage = tcp_sendpage, .backlog_rcv = tcp_v4_do_rcv, .release_cb = tcp_release_cb, .hash = inet_hash, diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 48fdcd3cad9c..42a96b3547c9 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1340,20 +1340,6 @@ void udp_splice_eof(struct socket *sock) } EXPORT_SYMBOL_GPL(udp_splice_eof); -int udp_sendpage(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES }; - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return udp_sendmsg(sk, &msg, size); -} - #define UDP_SKB_IS_STATELESS 0x80000000 /* all head states (dst, sk, nf conntrack) except skb extensions are @@ -2933,7 +2919,6 @@ struct proto udp_prot = { .sendmsg = udp_sendmsg, .recvmsg = udp_recvmsg, .splice_eof = udp_splice_eof, - .sendpage = udp_sendpage, .release_cb = ip4_datagram_release_cb, .hash = udp_lib_hash, .unhash = udp_lib_unhash, diff --git a/net/ipv4/udp_impl.h b/net/ipv4/udp_impl.h index 4ba7a88a1b1d..e1ff3a375996 100644 --- a/net/ipv4/udp_impl.h +++ b/net/ipv4/udp_impl.h @@ -19,8 +19,6 @@ int udp_getsockopt(struct sock *sk, int level, int optname, int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len); -int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, - int flags); void udp_destroy_sock(struct sock *sk); #ifdef CONFIG_PROC_FS diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c index 143f93a12f25..39ecdad1b50c 100644 --- a/net/ipv4/udplite.c +++ b/net/ipv4/udplite.c @@ -56,7 +56,6 @@ struct proto udplite_prot = { .getsockopt = udp_getsockopt, .sendmsg = udp_sendmsg, .recvmsg = udp_recvmsg, - .sendpage = udp_sendpage, .hash = udp_lib_hash, .unhash = udp_lib_unhash, .rehash = udp_v4_rehash, diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index b3451cf47d29..5d593ddc0347 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -696,9 +696,7 @@ const struct proto_ops inet6_stream_ops = { .mmap = tcp_mmap, #endif .splice_eof = inet_splice_eof, - .sendpage = inet_sendpage, .sendmsg_locked = tcp_sendmsg_locked, - .sendpage_locked = tcp_sendpage_locked, .splice_read = tcp_splice_read, .read_sock = tcp_read_sock, .read_skb = tcp_read_skb, @@ -729,7 +727,6 @@ const struct proto_ops inet6_dgram_ops = { .recvmsg = inet6_recvmsg, /* retpoline's sake */ .read_skb = udp_read_skb, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .set_peek_off = sk_set_peek_off, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index c9caeb5a43ed..ac1cef094c5f 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -1296,7 +1296,6 @@ const struct proto_ops inet6_sockraw_ops = { .sendmsg = inet_sendmsg, /* ok */ .recvmsg = sock_common_recvmsg, /* ok */ .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index c17c8ff94b79..40dd92a2f480 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -2151,7 +2151,6 @@ struct proto tcpv6_prot = { .recvmsg = tcp_recvmsg, .sendmsg = tcp_sendmsg, .splice_eof = tcp_splice_eof, - .sendpage = tcp_sendpage, .backlog_rcv = tcp_v6_do_rcv, .release_cb = tcp_release_cb, .hash = inet6_hash, diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index d75d775e9462..b512f9958e43 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -961,24 +961,6 @@ static void kcm_splice_eof(struct socket *sock) release_sock(sk); } -static ssize_t kcm_sendpage(struct socket *sock, struct page *page, - int offset, size_t size, int flags) - -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - if (flags & MSG_OOB) - return -EOPNOTSUPP; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return kcm_sendmsg(sock, &msg, size); -} - static int kcm_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int flags) { @@ -1767,7 +1749,6 @@ static const struct proto_ops kcm_dgram_ops = { .recvmsg = kcm_recvmsg, .mmap = sock_no_mmap, .splice_eof = kcm_splice_eof, - .sendpage = kcm_sendpage, }; static const struct proto_ops kcm_seqpacket_ops = { @@ -1789,7 +1770,6 @@ static const struct proto_ops kcm_seqpacket_ops = { .recvmsg = kcm_recvmsg, .mmap = sock_no_mmap, .splice_eof = kcm_splice_eof, - .sendpage = kcm_sendpage, .splice_read = kcm_splice_read, }; diff --git a/net/key/af_key.c b/net/key/af_key.c index 31ab12fd720a..ede3c6a60353 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -3761,7 +3761,6 @@ static const struct proto_ops pfkey_ops = { .listen = sock_no_listen, .shutdown = sock_no_shutdown, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, /* Now the operations that really occur. */ .release = pfkey_release, diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c index 2b795c1064f5..f9073bc7281f 100644 --- a/net/l2tp/l2tp_ip.c +++ b/net/l2tp/l2tp_ip.c @@ -624,7 +624,6 @@ static const struct proto_ops l2tp_ip_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct inet_protosw l2tp_ip_protosw = { diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c index 5137ea1861ce..b1623f9c4f92 100644 --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -751,7 +751,6 @@ static const struct proto_ops l2tp_ip6_ops = { .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c index 9ffbc667be6c..57c35c960b2c 100644 --- a/net/llc/af_llc.c +++ b/net/llc/af_llc.c @@ -1232,7 +1232,6 @@ static const struct proto_ops llc_ui_ops = { .sendmsg = llc_ui_sendmsg, .recvmsg = llc_ui_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const char llc_proc_err_msg[] __initconst = diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c index bb4bd0b6a4f7..f6be58b68c6f 100644 --- a/net/mctp/af_mctp.c +++ b/net/mctp/af_mctp.c @@ -485,7 +485,6 @@ static const struct proto_ops mctp_dgram_ops = { .sendmsg = mctp_sendmsg, .recvmsg = mctp_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = mctp_compat_ioctl, #endif diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 992b89c75631..e67983bd3bc7 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3891,7 +3891,6 @@ static const struct proto_ops mptcp_stream_ops = { .sendmsg = inet_sendmsg, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, }; static struct inet_protosw mptcp_protosw = { @@ -3986,7 +3985,6 @@ static const struct proto_ops mptcp_v6_stream_ops = { .sendmsg = inet6_sendmsg, .recvmsg = inet6_recvmsg, .mmap = sock_no_mmap, - .sendpage = inet_sendpage, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index cbd9aa7ee24a..39cfb778ebc5 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -2815,7 +2815,6 @@ static const struct proto_ops netlink_ops = { .sendmsg = netlink_sendmsg, .recvmsg = netlink_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const struct net_proto_family netlink_family_ops = { diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c index 5a4cb796150f..eb8ccbd58df7 100644 --- a/net/netrom/af_netrom.c +++ b/net/netrom/af_netrom.c @@ -1364,7 +1364,6 @@ static const struct proto_ops nr_proto_ops = { .sendmsg = nr_sendmsg, .recvmsg = nr_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct notifier_block nr_dev_notifier = { diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index a2dbeb264f26..85ff90a03b0c 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -4621,7 +4621,6 @@ static const struct proto_ops packet_ops_spkt = { .sendmsg = packet_sendmsg_spkt, .recvmsg = packet_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static const struct proto_ops packet_ops = { @@ -4643,7 +4642,6 @@ static const struct proto_ops packet_ops = { .sendmsg = packet_sendmsg, .recvmsg = packet_recvmsg, .mmap = packet_mmap, - .sendpage = sock_no_sendpage, }; static const struct net_proto_family packet_family_ops = { diff --git a/net/phonet/socket.c b/net/phonet/socket.c index 967f9b4dc026..1018340d89a7 100644 --- a/net/phonet/socket.c +++ b/net/phonet/socket.c @@ -441,7 +441,6 @@ const struct proto_ops phonet_dgram_ops = { .sendmsg = pn_socket_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; const struct proto_ops phonet_stream_ops = { @@ -462,7 +461,6 @@ const struct proto_ops phonet_stream_ops = { .sendmsg = pn_socket_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; EXPORT_SYMBOL(phonet_stream_ops); diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c index 76f0434d3d06..78beb74146e7 100644 --- a/net/qrtr/af_qrtr.c +++ b/net/qrtr/af_qrtr.c @@ -1244,7 +1244,6 @@ static const struct proto_ops qrtr_proto_ops = { .shutdown = sock_no_shutdown, .release = qrtr_release, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto qrtr_proto = { diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c index 3ff6995244e5..01c4cdfef45d 100644 --- a/net/rds/af_rds.c +++ b/net/rds/af_rds.c @@ -653,7 +653,6 @@ static const struct proto_ops rds_proto_ops = { .sendmsg = rds_sendmsg, .recvmsg = rds_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static void rds_sock_destruct(struct sock *sk) diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c index ca2b17f32670..49dafe9ac72f 100644 --- a/net/rose/af_rose.c +++ b/net/rose/af_rose.c @@ -1496,7 +1496,6 @@ static const struct proto_ops rose_proto_ops = { .sendmsg = rose_sendmsg, .recvmsg = rose_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct notifier_block rose_dev_notifier = { diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c index da0b3b5157d5..f2cf4aa99db2 100644 --- a/net/rxrpc/af_rxrpc.c +++ b/net/rxrpc/af_rxrpc.c @@ -954,7 +954,6 @@ static const struct proto_ops rxrpc_rpc_ops = { .sendmsg = rxrpc_sendmsg, .recvmsg = rxrpc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct proto rxrpc_proto = { diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index 664d1f2e9121..274d07bd774f 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -1133,7 +1133,6 @@ static const struct proto_ops inet_seqpacket_ops = { .sendmsg = inet_sendmsg, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; /* Registration with AF_INET family. */ diff --git a/net/socket.c b/net/socket.c index b778fc03c6e0..8c3c8b29995a 100644 --- a/net/socket.c +++ b/net/socket.c @@ -3552,54 +3552,6 @@ int kernel_getpeername(struct socket *sock, struct sockaddr *addr) } EXPORT_SYMBOL(kernel_getpeername); -/** - * kernel_sendpage - send a &page through a socket (kernel space) - * @sock: socket - * @page: page - * @offset: page offset - * @size: total size in bytes - * @flags: flags (MSG_DONTWAIT, ...) - * - * Returns the total amount sent in bytes or an error. - */ - -int kernel_sendpage(struct socket *sock, struct page *page, int offset, - size_t size, int flags) -{ - if (sock->ops->sendpage) { - /* Warn in case the improper page to zero-copy send */ - WARN_ONCE(!sendpage_ok(page), "improper page for zero-copy send"); - return sock->ops->sendpage(sock, page, offset, size, flags); - } - return sock_no_sendpage(sock, page, offset, size, flags); -} -EXPORT_SYMBOL(kernel_sendpage); - -/** - * kernel_sendpage_locked - send a &page through the locked sock (kernel space) - * @sk: sock - * @page: page - * @offset: page offset - * @size: total size in bytes - * @flags: flags (MSG_DONTWAIT, ...) - * - * Returns the total amount sent in bytes or an error. - * Caller must hold @sk. - */ - -int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - struct socket *sock = sk->sk_socket; - - if (sock->ops->sendpage_locked) - return sock->ops->sendpage_locked(sk, page, offset, size, - flags); - - return sock_no_sendpage_locked(sk, page, offset, size, flags); -} -EXPORT_SYMBOL(kernel_sendpage_locked); - /** * kernel_sock_shutdown - shut down part of a full-duplex connection (kernel space) * @sock: socket diff --git a/net/tipc/socket.c b/net/tipc/socket.c index dd73d71c02a9..ef8e5139a873 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -3375,7 +3375,6 @@ static const struct proto_ops msg_ops = { .sendmsg = tipc_sendmsg, .recvmsg = tipc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage }; static const struct proto_ops packet_ops = { @@ -3396,7 +3395,6 @@ static const struct proto_ops packet_ops = { .sendmsg = tipc_send_packet, .recvmsg = tipc_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage }; static const struct proto_ops stream_ops = { @@ -3417,7 +3415,6 @@ static const struct proto_ops stream_ops = { .sendmsg = tipc_sendstream, .recvmsg = tipc_recvstream, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage }; static const struct net_proto_family tipc_family_ops = { diff --git a/net/tls/tls.h b/net/tls/tls.h index d002c3af1966..86cef1c68e03 100644 --- a/net/tls/tls.h +++ b/net/tls/tls.h @@ -98,10 +98,6 @@ void tls_sw_strparser_arm(struct sock *sk, struct tls_context *ctx); void tls_sw_strparser_done(struct tls_context *tls_ctx); int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size); void tls_sw_splice_eof(struct socket *sock); -int tls_sw_sendpage_locked(struct sock *sk, struct page *page, - int offset, size_t size, int flags); -int tls_sw_sendpage(struct sock *sk, struct page *page, - int offset, size_t size, int flags); void tls_sw_cancel_work_tx(struct tls_context *tls_ctx); void tls_sw_release_resources_tx(struct sock *sk); void tls_sw_free_ctx_tx(struct tls_context *tls_ctx); @@ -117,8 +113,6 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos, int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size); void tls_device_splice_eof(struct socket *sock); -int tls_device_sendpage(struct sock *sk, struct page *page, - int offset, size_t size, int flags); int tls_tx_records(struct sock *sk, int flags); void tls_sw_write_space(struct sock *sk, struct tls_context *ctx); diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index 975299d7213b..840ee06f1708 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -621,23 +621,6 @@ void tls_device_splice_eof(struct socket *sock) mutex_unlock(&tls_ctx->tx_lock); } -int tls_device_sendpage(struct sock *sk, struct page *page, - int offset, size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - if (flags & MSG_OOB) - return -EOPNOTSUPP; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return tls_device_sendmsg(sk, &msg, size); -} - struct tls_record_info *tls_get_record(struct tls_offload_context_tx *context, u32 seq, u64 *p_record_sn) { diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index 7b9c83dd7de2..d5ed4d47b16e 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -958,7 +958,6 @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG] ops[TLS_SW ][TLS_BASE] = ops[TLS_BASE][TLS_BASE]; ops[TLS_SW ][TLS_BASE].splice_eof = tls_sw_splice_eof; - ops[TLS_SW ][TLS_BASE].sendpage_locked = tls_sw_sendpage_locked; ops[TLS_BASE][TLS_SW ] = ops[TLS_BASE][TLS_BASE]; ops[TLS_BASE][TLS_SW ].splice_read = tls_sw_splice_read; @@ -970,17 +969,14 @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG] #ifdef CONFIG_TLS_DEVICE ops[TLS_HW ][TLS_BASE] = ops[TLS_BASE][TLS_BASE]; - ops[TLS_HW ][TLS_BASE].sendpage_locked = NULL; ops[TLS_HW ][TLS_SW ] = ops[TLS_BASE][TLS_SW ]; - ops[TLS_HW ][TLS_SW ].sendpage_locked = NULL; ops[TLS_BASE][TLS_HW ] = ops[TLS_BASE][TLS_SW ]; ops[TLS_SW ][TLS_HW ] = ops[TLS_SW ][TLS_SW ]; ops[TLS_HW ][TLS_HW ] = ops[TLS_HW ][TLS_SW ]; - ops[TLS_HW ][TLS_HW ].sendpage_locked = NULL; #endif #ifdef CONFIG_TLS_TOE ops[TLS_HW_RECORD][TLS_HW_RECORD] = *base; @@ -1029,7 +1025,6 @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG], prot[TLS_SW][TLS_BASE] = prot[TLS_BASE][TLS_BASE]; prot[TLS_SW][TLS_BASE].sendmsg = tls_sw_sendmsg; prot[TLS_SW][TLS_BASE].splice_eof = tls_sw_splice_eof; - prot[TLS_SW][TLS_BASE].sendpage = tls_sw_sendpage; prot[TLS_BASE][TLS_SW] = prot[TLS_BASE][TLS_BASE]; prot[TLS_BASE][TLS_SW].recvmsg = tls_sw_recvmsg; @@ -1045,12 +1040,10 @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG], prot[TLS_HW][TLS_BASE] = prot[TLS_BASE][TLS_BASE]; prot[TLS_HW][TLS_BASE].sendmsg = tls_device_sendmsg; prot[TLS_HW][TLS_BASE].splice_eof = tls_device_splice_eof; - prot[TLS_HW][TLS_BASE].sendpage = tls_device_sendpage; prot[TLS_HW][TLS_SW] = prot[TLS_BASE][TLS_SW]; prot[TLS_HW][TLS_SW].sendmsg = tls_device_sendmsg; prot[TLS_HW][TLS_SW].splice_eof = tls_device_splice_eof; - prot[TLS_HW][TLS_SW].sendpage = tls_device_sendpage; prot[TLS_BASE][TLS_HW] = prot[TLS_BASE][TLS_SW]; diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 319f61590d2c..9b3aa89a4292 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1281,41 +1281,6 @@ void tls_sw_splice_eof(struct socket *sock) mutex_unlock(&tls_ctx->tx_lock); } -int tls_sw_sendpage_locked(struct sock *sk, struct page *page, - int offset, size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - - if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | - MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY | - MSG_NO_SHARED_FRAGS)) - return -EOPNOTSUPP; - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return tls_sw_sendmsg_locked(sk, &msg, size); -} - -int tls_sw_sendpage(struct sock *sk, struct page *page, - int offset, size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - - if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | - MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY)) - return -EOPNOTSUPP; - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return tls_sw_sendmsg(sk, &msg, size); -} - static int tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock, bool released) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 73c61a010b01..3953daa2e1d0 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -758,8 +758,6 @@ static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned lon static int unix_shutdown(struct socket *, int); static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t); static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int); -static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset, - size_t size, int flags); static ssize_t unix_stream_splice_read(struct socket *, loff_t *ppos, struct pipe_inode_info *, size_t size, unsigned int flags); @@ -852,7 +850,6 @@ static const struct proto_ops unix_stream_ops = { .recvmsg = unix_stream_recvmsg, .read_skb = unix_stream_read_skb, .mmap = sock_no_mmap, - .sendpage = unix_stream_sendpage, .splice_read = unix_stream_splice_read, .set_peek_off = unix_set_peek_off, .show_fdinfo = unix_show_fdinfo, @@ -878,7 +875,6 @@ static const struct proto_ops unix_dgram_ops = { .read_skb = unix_read_skb, .recvmsg = unix_dgram_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .set_peek_off = unix_set_peek_off, .show_fdinfo = unix_show_fdinfo, }; @@ -902,7 +898,6 @@ static const struct proto_ops unix_seqpacket_ops = { .sendmsg = unix_seqpacket_sendmsg, .recvmsg = unix_seqpacket_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .set_peek_off = unix_set_peek_off, .show_fdinfo = unix_show_fdinfo, }; @@ -2294,20 +2289,6 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, return sent ? : err; } -static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page, - int offset, size_t size, int flags) -{ - struct bio_vec bvec; - struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES }; - - if (flags & MSG_SENDPAGE_NOTLAST) - msg.msg_flags |= MSG_MORE; - - bvec_set_page(&bvec, page, size, offset); - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - return unix_stream_sendmsg(socket, &msg, size); -} - static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index efb8a0937a13..020cf17ab7e4 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1306,7 +1306,6 @@ static const struct proto_ops vsock_dgram_ops = { .sendmsg = vsock_dgram_sendmsg, .recvmsg = vsock_dgram_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .read_skb = vsock_read_skb, }; @@ -2234,7 +2233,6 @@ static const struct proto_ops vsock_stream_ops = { .sendmsg = vsock_connectible_sendmsg, .recvmsg = vsock_connectible_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .set_rcvlowat = vsock_set_rcvlowat, .read_skb = vsock_read_skb, }; @@ -2257,7 +2255,6 @@ static const struct proto_ops vsock_seqpacket_ops = { .sendmsg = vsock_connectible_sendmsg, .recvmsg = vsock_connectible_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, .read_skb = vsock_read_skb, }; diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c index 5c7ad301d742..0fb5143bec7a 100644 --- a/net/x25/af_x25.c +++ b/net/x25/af_x25.c @@ -1757,7 +1757,6 @@ static const struct proto_ops x25_proto_ops = { .sendmsg = x25_sendmsg, .recvmsg = x25_recvmsg, .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, }; static struct packet_type x25_packet_type __read_mostly = { diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index cc1e7f15fa73..5a8c0dd250af 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -1389,7 +1389,6 @@ static const struct proto_ops xsk_proto_ops = { .sendmsg = xsk_sendmsg, .recvmsg = xsk_recvmsg, .mmap = xsk_mmap, - .sendpage = sock_no_sendpage, }; static void xsk_destruct(struct sock *sk) From patchwork Tue Jun 20 14:53:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 110556 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3734781vqr; Tue, 20 Jun 2023 08:09:29 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7adZafbVeQVVdj/3wQtkiPVXj2rC/xztB1Ucl3skM6YHCQD4OlYzLlsy7qYnbyoCnSshU8 X-Received: by 2002:a05:6a20:429f:b0:122:24ed:f77e with SMTP id o31-20020a056a20429f00b0012224edf77emr5357768pzj.33.1687273769131; Tue, 20 Jun 2023 08:09:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687273769; cv=none; d=google.com; s=arc-20160816; b=v9Z/XZgu14U8c2yk54j9F7cyxw+rVusUegAZksg8ThPXJYDaGMiPHk+A7dF09em/BZ 3ItX+FUlvG2YdbGHhOrXbVAJwRm5F2ra9XAbUaUD8FiZIBLdxUE4DmXb/1uQ9nHPWj6u MSQDUthIXJDNDijZI7ZOGOiy7i2jTU9T3LIhUG4rQlS8MPwiuImu3DoXbTLP1XX9it8y /I6Nt6tUjDYoxxtIQNUUAkWcI8uhdsrU/CsmsF3nRjWivJSdbPTJl3ZveHi69XMC9s5A uOmq946II8MiTZZ/GviZnRcrdeLn0f+BxP3Ke5plS/Jh0nYeI1Fp+uzJ1tgUsM2w/IQO SnzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=096kfvQCNbO3jjvPJlb61wpL4gpNse4m8jbz5dQcH+U=; b=D5W7T33dkGm16pbPAWZ6n9UzPUktKQyrfTeTyHQREc8u4wge09oGAmXsH9G+UsvrJJ AaSEKzPurT1Yaiom45L7Kz2O+GKinzAvU4r3fZtjxosV7EaxLE2DrBrkD3hi2Qt4p/Yg Vr6nVttkRpC8a12+jZLN5DQQgRPXsOHrHbFzDgP0JcppYRHTb6tsb9kGlRv0qQd2uwz4 PFNciE3TLmOvmBWORMeR++bYBb2jXRUEe55FxWdb1+5GueDuquQdfXWBodosWDopdPvM s6dxZQTCmVj5su+WdolgvmS6pUvdDpI1vK5DG6+qgmW5jwgMLYhywPd8qdooYWVc+Bew KFGQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=G3DILaIj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id o10-20020a637e4a000000b0053f3b62c207si1809580pgn.767.2023.06.20.08.09.14; Tue, 20 Jun 2023 08:09:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=G3DILaIj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233450AbjFTPAT (ORCPT + 99 others); Tue, 20 Jun 2023 11:00:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33192 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233519AbjFTO7v (ORCPT ); Tue, 20 Jun 2023 10:59:51 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38F581BCB for ; Tue, 20 Jun 2023 07:58:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1687273090; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=096kfvQCNbO3jjvPJlb61wpL4gpNse4m8jbz5dQcH+U=; b=G3DILaIjYHb+Tnc5Pca2c9IlkoHV8B1zEu8QpK4LfclT2xdQPXqm6hdGjz5/GmSibD2HbD Cj5GDXG/eecoNv6SIOBebvbJk6T1JrsDXo3P51VgWtySfFRNJlTcNdEHnF3+tIF1Nntx9C 9TvgL6NI4HtEzL6gvpoiORQA79lXOII= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-246-cetdja7_MNaYUIl_PdTHIQ-1; Tue, 20 Jun 2023 10:58:01 -0400 X-MC-Unique: cetdja7_MNaYUIl_PdTHIQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D63E710665CD; Tue, 20 Jun 2023 14:54:49 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id D321E40C6CD2; Tue, 20 Jun 2023 14:54:46 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Alexander Duyck , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org, dccp@vger.kernel.org, linux-afs@lists.infradead.org, linux-arm-msm@vger.kernel.org, linux-can@vger.kernel.org, linux-crypto@vger.kernel.org, linux-doc@vger.kernel.org, linux-hams@vger.kernel.org, linux-perf-users@vger.kernel.org, linux-rdma@vger.kernel.org, linux-sctp@vger.kernel.org, linux-wpan@vger.kernel.org, linux-x25@vger.kernel.org, mptcp@lists.linux.dev, rds-devel@oss.oracle.com, tipc-discussion@lists.sourceforge.net, virtualization@lists.linux-foundation.org Subject: [PATCH net-next v3 18/18] net: Kill MSG_SENDPAGE_NOTLAST Date: Tue, 20 Jun 2023 15:53:37 +0100 Message-ID: <20230620145338.1300897-19-dhowells@redhat.com> In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com> References: <20230620145338.1300897-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769234780020202008?= X-GMAIL-MSGID: =?utf-8?q?1769234780020202008?= Now that ->sendpage() has been removed, MSG_SENDPAGE_NOTLAST can be cleaned up. Things were converted to use MSG_MORE instead, but the protocol sendpage stubs still convert MSG_SENDPAGE_NOTLAST to MSG_MORE, which is now unnecessary. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: bpf@vger.kernel.org cc: dccp@vger.kernel.org cc: linux-afs@lists.infradead.org cc: linux-arm-msm@vger.kernel.org cc: linux-can@vger.kernel.org cc: linux-crypto@vger.kernel.org cc: linux-doc@vger.kernel.org cc: linux-hams@vger.kernel.org cc: linux-perf-users@vger.kernel.org cc: linux-rdma@vger.kernel.org cc: linux-sctp@vger.kernel.org cc: linux-wpan@vger.kernel.org cc: linux-x25@vger.kernel.org cc: mptcp@lists.linux.dev cc: netdev@vger.kernel.org cc: rds-devel@oss.oracle.com cc: tipc-discussion@lists.sourceforge.net cc: virtualization@lists.linux-foundation.org --- Notes: ver #3) - tcp_bpf is now handled by an earlier patch. include/linux/socket.h | 4 +--- net/tls/tls_device.c | 3 +-- net/tls/tls_main.c | 2 +- net/tls/tls_sw.c | 2 +- tools/perf/trace/beauty/include/linux/socket.h | 1 - tools/perf/trace/beauty/msg_flags.c | 3 --- 6 files changed, 4 insertions(+), 11 deletions(-) diff --git a/include/linux/socket.h b/include/linux/socket.h index 58204700018a..39b74d83c7c4 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -319,7 +319,6 @@ struct ucred { #define MSG_MORE 0x8000 /* Sender will send more */ #define MSG_WAITFORONE 0x10000 /* recvmmsg(): block until 1+ packets avail */ #define MSG_SENDPAGE_NOPOLICY 0x10000 /* sendpage() internal : do no apply policy */ -#define MSG_SENDPAGE_NOTLAST 0x20000 /* sendpage() internal : not the last page */ #define MSG_BATCH 0x40000 /* sendmmsg(): more messages coming */ #define MSG_EOF MSG_FIN #define MSG_NO_SHARED_FRAGS 0x80000 /* sendpage() internal : page frags are not shared */ @@ -341,8 +340,7 @@ struct ucred { /* Flags to be cleared on entry by sendmsg and sendmmsg syscalls */ #define MSG_INTERNAL_SENDMSG_FLAGS \ - (MSG_SPLICE_PAGES | MSG_SENDPAGE_NOPOLICY | MSG_SENDPAGE_NOTLAST | \ - MSG_SENDPAGE_DECRYPTED) + (MSG_SPLICE_PAGES | MSG_SENDPAGE_NOPOLICY | MSG_SENDPAGE_DECRYPTED) /* Setsockoptions(2) level. Thanks to BSD these must match IPPROTO_xxx */ #define SOL_IP 0 diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index 840ee06f1708..2021fe557e50 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -441,8 +441,7 @@ static int tls_push_data(struct sock *sk, long timeo; if (flags & - ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST | - MSG_SPLICE_PAGES)) + ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SPLICE_PAGES)) return -EOPNOTSUPP; if (unlikely(sk->sk_err)) diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index d5ed4d47b16e..b6896126bb92 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -127,7 +127,7 @@ int tls_push_sg(struct sock *sk, { struct bio_vec bvec; struct msghdr msg = { - .msg_flags = MSG_SENDPAGE_NOTLAST | MSG_SPLICE_PAGES | flags, + .msg_flags = MSG_SPLICE_PAGES | flags, }; int ret = 0; struct page *p; diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 9b3aa89a4292..53f944e6d8ef 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1194,7 +1194,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_CMSG_COMPAT | MSG_SPLICE_PAGES | - MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY)) + MSG_SENDPAGE_NOPOLICY)) return -EOPNOTSUPP; ret = mutex_lock_interruptible(&tls_ctx->tx_lock); diff --git a/tools/perf/trace/beauty/include/linux/socket.h b/tools/perf/trace/beauty/include/linux/socket.h index 13c3a237b9c9..3bef212a24d7 100644 --- a/tools/perf/trace/beauty/include/linux/socket.h +++ b/tools/perf/trace/beauty/include/linux/socket.h @@ -318,7 +318,6 @@ struct ucred { #define MSG_MORE 0x8000 /* Sender will send more */ #define MSG_WAITFORONE 0x10000 /* recvmmsg(): block until 1+ packets avail */ #define MSG_SENDPAGE_NOPOLICY 0x10000 /* sendpage() internal : do no apply policy */ -#define MSG_SENDPAGE_NOTLAST 0x20000 /* sendpage() internal : not the last page */ #define MSG_BATCH 0x40000 /* sendmmsg(): more messages coming */ #define MSG_EOF MSG_FIN #define MSG_NO_SHARED_FRAGS 0x80000 /* sendpage() internal : page frags are not shared */ diff --git a/tools/perf/trace/beauty/msg_flags.c b/tools/perf/trace/beauty/msg_flags.c index ea68db08b8e7..b5b580e5a77e 100644 --- a/tools/perf/trace/beauty/msg_flags.c +++ b/tools/perf/trace/beauty/msg_flags.c @@ -8,9 +8,6 @@ #ifndef MSG_WAITFORONE #define MSG_WAITFORONE 0x10000 #endif -#ifndef MSG_SENDPAGE_NOTLAST -#define MSG_SENDPAGE_NOTLAST 0x20000 -#endif #ifndef MSG_FASTOPEN #define MSG_FASTOPEN 0x20000000 #endif