Message ID | 1420063.1690904933@warthog.procyon.org.uk |
---|---|
State | New |
Series | [net] udp: Fix __ip_append_data()'s handling of MSG_SPLICE_PAGES |
Commit Message
David Howells
Aug. 1, 2023, 3:48 p.m. UTC
__ip_append_data() can get into an infinite loop when asked to splice into
a partially-built UDP message that has more than the frag-limit data and up
to the MTU limit. Something like:

	pipe(pfd);
	sfd = socket(AF_INET, SOCK_DGRAM, 0);
	connect(sfd, ...);
	send(sfd, buffer, 8161, MSG_CONFIRM|MSG_MORE);
	write(pfd[1], buffer, 8);
	splice(pfd[0], 0, sfd, 0, 0x4ffe0ul, 0);

where the amount of data given to send() is dependent on the MTU size (in
this instance an interface with an MTU of 8192).

The problem is that the calculation of the amount to copy in
__ip_append_data() goes negative in two places, and, in the second place,
this gets subtracted from the length remaining, thereby increasing it.

This happens when pagedlen > 0 (which happens for MSG_ZEROCOPY and
MSG_SPLICE_PAGES), because the terms in:

	copy = datalen - transhdrlen - fraggap - pagedlen;

then mostly cancel when pagedlen is substituted for, leaving just -fraggap.
This causes:

	length -= copy + transhdrlen;

to increase the length to more than the amount of data in msg->msg_iter,
which causes skb_splice_from_iter() to be unable to fill the request and it
returns less than 'copied' - which means that length never gets to 0 and we
never exit the loop.

Fix this by:

 (1) Insert a note about the dodgy calculation of 'copy'.

 (2) If MSG_SPLICE_PAGES, clear copy if it is negative from the above
     equation, so that 'offset' isn't regressed and 'length' isn't
     increased, which will mean that length and thus copy should match the
     amount left in the iterator.

 (3) When handling MSG_SPLICE_PAGES, give a warning and return -EIO if
     we're asked to splice more than is in the iterator. It might be
     better to not give the warning or even just give a 'short' write.

[!] Note that this ought to also affect MSG_ZEROCOPY, but MSG_ZEROCOPY
avoids the problem by simply assuming that everything asked for got copied,
not just the amount that was in the iterator. This is a potential bug for
the future.

Fixes: 7ac7c987850c ("udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES")
Reported-by: syzbot+f527b971b4bdc8e79f9e@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/000000000000881d0606004541d1@google.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: David Ahern <dsahern@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: netdev@vger.kernel.org
---
net/ipv4/ip_output.c | 9 +++++++++
1 file changed, 9 insertions(+)
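
As a reading aid, fix items (2) and (3) from the commit message can be modelled in user space. This is only a sketch under stated assumptions: clamp_copy(), remaining_after() and splice_guard() are hypothetical names that do not exist in the kernel, the MSG_SPLICE_PAGES flag is reduced to a boolean, and the warning raised by WARN_ON_ONCE() is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of fix (2): for a splice, a negative 'copy' is
 * cleared so it cannot regress 'offset' or increase 'length'. */
static int clamp_copy(int copy, bool splice_pages)
{
	if (splice_pages && copy < 0)
		return 0;
	return copy;
}

/* Model of "length -= copy + transhdrlen" once the clamp has run. */
static int remaining_after(int length, int copy, int transhdrlen)
{
	assert(copy >= 0);	/* invariant the clamp is meant to establish */
	return length - (copy + transhdrlen);
}

/* Hypothetical model of fix (3): refuse to splice more than the
 * iterator holds; otherwise pass the request through unchanged. */
static long splice_guard(size_t copy, size_t iter_count)
{
	if (copy > iter_count)
		return -EIO;
	return (long)copy;
}
```

With copy = -1, length = 8 and transhdrlen = 0, the clamp turns copy into 0 and the remaining length stays at 8, while a request for 9 bytes against an iterator holding 8 is rejected with -EIO.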
Comments
David Howells wrote:
> __ip_append_data() can get into an infinite loop when asked to splice into
> a partially-built UDP message that has more than the frag-limit data and up
> to the MTU limit. Something like:
>
> [...]

Thanks for limiting this to MSG_SPLICE_PAGES.

__ip6_append_data probably needs the same.

I see your point that the

	if (copy > 0) {
	} else {
		copy = 0;
	}

might apply to MSG_ZEROCOPY too. I'll take a look at that. For now
this is a clear fix to a specific MSG_SPLICE_PAGES commit.

copy is recomputed on each iteration in the loop. The only fields it
directly affects below this new line are offset and length. offset is
only used in copy paths: "offset into linear skb".

So this changes length, the number of bytes still to be written.

copy -= -fraggap definitely seems off. You point out that it even can
turn length negative?

The WARN_ON_ONCE, if it can be reached, will be user triggerable.
Usually for those cases and when there is a viable return with error
path, that is preferable. But if you prefer to taunt syzbot, ok. We
can always remove this later.

> ---
>  net/ipv4/ip_output.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 6e70839257f7..91715603cf6e 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1158,10 +1158,15 @@ static int __ip_append_data(struct sock *sk,
>  			}
>
>  			copy = datalen - transhdrlen - fraggap - pagedlen;
> +			/* [!] NOTE: copy will be negative if pagedlen>0
> +			 * because then the equation reduces to -fraggap.
> +			 */
>  			if (copy > 0 && getfrag(from, data + transhdrlen, offset, copy, fraggap, skb) < 0) {
>  				err = -EFAULT;
>  				kfree_skb(skb);
>  				goto error;
> +			} else if (flags & MSG_SPLICE_PAGES) {
> +				copy = 0;
>  			}
>
>  			offset += copy;
> @@ -1209,6 +1214,10 @@ static int __ip_append_data(struct sock *sk,
>  		} else if (flags & MSG_SPLICE_PAGES) {
>  			struct msghdr *msg = from;
>
> +			err = -EIO;
> +			if (WARN_ON_ONCE(copy > msg->msg_iter.count))
> +				goto error;
> +
>  			err = skb_splice_from_iter(skb, &msg->msg_iter, copy,
>  						   sk->sk_allocation);
>  			if (err < 0)
Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:

> copy -= -fraggap definitely seems off. You point out that it even can
> turn length negative?

Yes. See the logging I posted:

	==>splice_to_socket() 6630
	udp_sendmsg(8,8)
	__ip_append_data(copy=-1,len=8, mtu=8192 skblen=8189 maxfl=8188)
	pagedlen 9 = 9 - 0
	copy -1 = 9 - 0 - 1 - 9
	length 8 -= -1 + 0

Since datalen and transhdrlen cancel, and fraggap is unsigned, if fraggap is
non-zero, copy will be negative.

> The WARN_ON_ONCE, if it can be reached, will be user triggerable.
> Usually for those cases and when there is a viable return with error
> path, that is preferable. But if you prefer to taunt syzbot, ok. We
> can always remove this later.

It shouldn't be possible for length to exceed msg->msg_iter.count (assuming
there is a msg) coming from userspace; further, userspace can't directly
specify MSG_SPLICE_PAGES.

> __ip6_append_data probably needs the same.

Good point. The arrangement of the code is a bit different, but I think
it's substantially the same in this regard.

David
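
The numbers in the log can be reproduced with a small user-space model of the arithmetic. This is a sketch, not kernel code: struct step, struct step_result and model_step() are hypothetical names, and the pagedlen assignment is inferred from the "pagedlen 9 = 9 - 0" log line rather than copied from the source.

```c
#include <assert.h>

/* Hypothetical inputs to one pass through the loop; the field names
 * mirror the kernel locals named in the thread. */
struct step {
	int datalen;		/* data length computed for the new skb */
	int transhdrlen;	/* transport header length */
	int fraggap;		/* bytes carried over from the previous skb */
	int length;		/* bytes still to be appended */
};

struct step_result {
	int pagedlen;
	int copy;
	int length;
};

static struct step_result model_step(struct step s)
{
	struct step_result r;

	assert(s.fraggap >= 0);
	/* Assumed from the log: all data past the transport header is paged. */
	r.pagedlen = s.datalen - s.transhdrlen;
	/* datalen - transhdrlen cancels against pagedlen, leaving -fraggap. */
	r.copy = s.datalen - s.transhdrlen - s.fraggap - r.pagedlen;
	/* Subtracting a negative copy makes the remaining length grow. */
	r.length = s.length - (r.copy + s.transhdrlen);
	return r;
}
```

Called with datalen = 9, transhdrlen = 0, fraggap = 1 and length = 8, the model gives pagedlen = 9 and copy = -1, and the remaining length rises from 8 to 9 instead of falling.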
Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> __ip6_append_data probably needs the same.
Now that's interesting. __ip6_append_data() has a check for this and returns
-EINVAL in this case:
	copy = datalen - transhdrlen - fraggap - pagedlen;
	if (copy < 0) {
		err = -EINVAL;
		goto error;
	}
but should I bypass that check for MSG_SPLICE_PAGES? It hits the check when
it should be able to get past it. The code seems to go back to prehistoric
times, so I'm not sure why it's there.
For an 8192 MTU, the breaking point is at a send of 8137 bytes. The attached
test program iterates through different send sizes until it hits the point.
David
---
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/ip6.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/mman.h>
#include <sys/uio.h>
#define OSERROR(R, S) do { if ((long)(R) == -1L) { perror((S)); exit(1); } } while(0)
int main(void)
{
	struct sockaddr_in6 sin6;
	void *buffer;
	int pfd[2], sfd;
	int i;

	OSERROR(pipe(pfd), "pipe");

	sfd = socket(AF_INET6, SOCK_DGRAM, 0);
	OSERROR(sfd, "socket/2");

	memset(&sin6, 0, sizeof(sin6));
	sin6.sin6_family = AF_INET6;
	sin6.sin6_port = htons(7);
#warning set dest IPv6 address below
	sin6.sin6_addr.s6_addr32[0] = htonl(0x01020304);
	sin6.sin6_addr.s6_addr32[1] = htonl(0x05060708);
	sin6.sin6_addr.s6_addr32[2] = htonl(0x00000000);
	sin6.sin6_addr.s6_addr32[3] = htonl(0x00000001);
	OSERROR(connect(sfd, (struct sockaddr *)&sin6, sizeof(sin6)), "connect");

	/* MAP_FAILED is (void *)-1, which OSERROR() also catches. */
	buffer = mmap(NULL, 1024*1024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
	OSERROR(buffer, "mmap");

	/* For each size i: send i bytes with MSG_MORE, then splice 8 more
	 * bytes from the pipe into the socket. The last size printed before
	 * an error is the breaking point.
	 */
	for (i = 1000; i < 65535; i++) {
		printf("%d\n", i);
		OSERROR(send(sfd, buffer, i, MSG_MORE), "send");
		OSERROR(write(pfd[1], buffer, 8), "write");
		OSERROR(splice(pfd[0], 0, sfd, 0, 0x4ffe0ul, 0), "splice");
	}
	return 0;
}
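
For comparison, the IPv6 check quoted in the message above can be modelled in the same spirit. Again this is a sketch under assumptions: ip6_copy_or_error() is a hypothetical name, and the four inputs are passed in directly rather than derived the way the kernel derives them.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the quoted __ip6_append_data() check: a negative
 * 'copy' is reported as -EINVAL instead of being used further. */
static int ip6_copy_or_error(int datalen, int transhdrlen,
			     int fraggap, int pagedlen)
{
	int copy;

	assert(fraggap >= 0 && pagedlen >= 0);
	copy = datalen - transhdrlen - fraggap - pagedlen;
	if (copy < 0)
		return -EINVAL;
	return copy;
}
```

Fed the values from the IPv4 log (datalen 9, transhdrlen 0, fraggap 1, pagedlen 9), the model returns -EINVAL, consistent with the observation that the IPv6 path hits the check rather than carrying a negative copy forward.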
David Howells wrote:
> Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
>
> > __ip6_append_data probably needs the same.
>
> Now that's interesting. __ip6_append_data() has a check for this and returns
> -EINVAL in this case:
>
> 	copy = datalen - transhdrlen - fraggap - pagedlen;
> 	if (copy < 0) {
> 		err = -EINVAL;
> 		goto error;
> 	}
>
> but should I bypass that check for MSG_SPLICE_PAGES? It hits the check when
> it should be able to get past it. The code seems to go back to prehistoric
> times, so I'm not sure why it's there.

Argh, saved by inconsistency between the two stacks.

I don't immediately understand the race that caused this code to move,
in commit 232cd35d0804 ("ipv6: fix out of bound writes in
__ip6_append_data()"). Maybe a race with a mtu update?

Technically there is no Fixes tag to apply, so this would not be a fix
for net.

If we want equivalent behavior, a patch removing this branch is probably
best sent to net-next, in a way that works from the start.
David Howells wrote:
> __ip_append_data() can get into an infinite loop when asked to splice into
> a partially-built UDP message that has more than the frag-limit data and up
> to the MTU limit. Something like:
>
> [...]

Reviewed-by: Willem de Bruijn <willemb@google.com>

I noticed that this is still open in patchwork, no need to resend.
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 01 Aug 2023 16:48:53 +0100 you wrote:
> __ip_append_data() can get into an infinite loop when asked to splice into
> a partially-built UDP message that has more than the frag-limit data and up
> to the MTU limit. Something like:
>
>         pipe(pfd);
>         sfd = socket(AF_INET, SOCK_DGRAM, 0);
>         connect(sfd, ...);
>         send(sfd, buffer, 8161, MSG_CONFIRM|MSG_MORE);
>         write(pfd[1], buffer, 8);
>         splice(pfd[0], 0, sfd, 0, 0x4ffe0ul, 0);
>
> [...]

Here is the summary with links:
  - [net] udp: Fix __ip_append_data()'s handling of MSG_SPLICE_PAGES
    https://git.kernel.org/netdev/net/c/0f71c9caf267

You are awesome, thank you!
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 6e70839257f7..91715603cf6e 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1158,10 +1158,15 @@ static int __ip_append_data(struct sock *sk,
 			}
 
 			copy = datalen - transhdrlen - fraggap - pagedlen;
+			/* [!] NOTE: copy will be negative if pagedlen>0
+			 * because then the equation reduces to -fraggap.
+			 */
 			if (copy > 0 && getfrag(from, data + transhdrlen, offset, copy, fraggap, skb) < 0) {
 				err = -EFAULT;
 				kfree_skb(skb);
 				goto error;
+			} else if (flags & MSG_SPLICE_PAGES) {
+				copy = 0;
 			}
 
 			offset += copy;
@@ -1209,6 +1214,10 @@ static int __ip_append_data(struct sock *sk,
 		} else if (flags & MSG_SPLICE_PAGES) {
 			struct msghdr *msg = from;
 
+			err = -EIO;
+			if (WARN_ON_ONCE(copy > msg->msg_iter.count))
+				goto error;
+
 			err = skb_splice_from_iter(skb, &msg->msg_iter, copy,
 						   sk->sk_allocation);
 			if (err < 0)