Message ID | 20230517124201.441634-3-imagedong@tencent.com |
---|---|
State | New |
Headers |
From: menglong8.dong@gmail.com
X-Google-Original-From: imagedong@tencent.com
To: kuba@kernel.org
Cc: davem@davemloft.net, edumazet@google.com, pabeni@redhat.com, dsahern@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Menglong Dong <imagedong@tencent.com>
Subject: [PATCH net-next 2/3] net: tcp: send zero-window when no memory
Date: Wed, 17 May 2023 20:42:00 +0800
Message-Id: <20230517124201.441634-3-imagedong@tencent.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230517124201.441634-1-imagedong@tencent.com>
References: <20230517124201.441634-1-imagedong@tencent.com>
List-ID: <linux-kernel.vger.kernel.org> |
Series | net: tcp: add support of window shrink |
Commit Message
Menglong Dong
May 17, 2023, 12:42 p.m. UTC
From: Menglong Dong <imagedong@tencent.com>

For now, an skb is dropped when there is no memory, which makes the
client keep retransmitting until timeout. This is not friendly to users.

Therefore, we now force the current socket to receive one packet when
protocol memory exceeds its limit. The socket then stays in the
'no mem' state until protocol memory becomes available.

When a socket is in the 'no mem' state, its receive window becomes 0,
which means a window shrink happens. The sender needs to handle such a
window shrink properly, which is done in the next commit.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/net/sock.h    |  1 +
 net/ipv4/tcp_input.c  | 12 ++++++++++++
 net/ipv4/tcp_output.c |  7 +++++++
 3 files changed, 20 insertions(+)
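The receive-side behaviour described in the commit message can be modelled in user space. What follows is only a minimal sketch under stated assumptions, not kernel code: `struct model_sock`, `model_data_queue()` and `model_select_window()` are hypothetical names, the two counters merely stand in for `sk_memory_allocated()` and `sk_prot_mem_limits(sk, 2)` (which are protocol-wide in the kernel), and the real `tcp_try_rmem_schedule()` does considerably more than one comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_sock {
	long mem_allocated;     /* stands in for sk_memory_allocated(sk) */
	long mem_hard_limit;    /* stands in for sk_prot_mem_limits(sk, 2) */
	unsigned int queue_len; /* packets sitting in the receive queue */
	bool no_mem;            /* stands in for the proposed SOCK_NO_MEM flag */
};

enum rx_result { RX_QUEUED, RX_QUEUED_FORCED, RX_REJECTED };

/* Models the patched branch in tcp_data_queue(). */
static enum rx_result model_data_queue(struct model_sock *sk, long truesize)
{
	assert(truesize > 0);

	bool over_limit = sk->mem_allocated + truesize > sk->mem_hard_limit;

	if (sk->queue_len != 0 && over_limit) {
		if (sk->no_mem)
			return RX_REJECTED; /* patch: goto out_of_window */
		sk->no_mem = true;          /* accept one last packet */
	}

	/* Charge the memory and queue the packet, forced or not. */
	sk->mem_allocated += truesize;
	sk->queue_len++;
	return over_limit ? RX_QUEUED_FORCED : RX_QUEUED;
}

/* Models the hunk added to tcp_select_window(). */
static uint16_t model_select_window(struct model_sock *sk, uint16_t new_win)
{
	if (sk->no_mem) {
		if (sk->mem_allocated < sk->mem_hard_limit)
			sk->no_mem = false; /* memory is available again */
		else
			new_win = 0;        /* advertise a zero window */
	}
	return new_win;
}
```

In this model, a socket that is pushed over the limit queues one more packet, then advertises a zero window and rejects further data until the allocation drops below the limit again.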
Comments
On Wed, May 17, 2023 at 2:42 PM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>
>
> For now, skb will be dropped when no memory, which makes client keep
> retrans until timeout and it's not friendly to the users.

Yes, networking needs memory. Trying to deny it is a recipe for OOM.

>
> Therefore, now we force to receive one packet on current socket when
> the protocol memory is out of the limitation. Then, this socket will
> stay in 'no mem' status, until protocol memory is available.
>

I think you missed one old patch.

commit ba3bb0e76ccd464bb66665a1941fabe55dadb3ba
tcp: fix SO_RCVLOWAT possible hangs under high mem pressure

> When a socket is in 'no mem' status, its receive window will become
> 0, which means window shrink happens. And the sender need to handle
> such window shrink properly, which is done in the next commit.
>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
>  include/net/sock.h    |  1 +
>  net/ipv4/tcp_input.c  | 12 ++++++++++++
>  net/ipv4/tcp_output.c |  7 +++++++
>  3 files changed, 20 insertions(+)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 5edf0038867c..90db8a1d7f31 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -957,6 +957,7 @@ enum sock_flags {
>         SOCK_XDP, /* XDP is attached */
>         SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
>         SOCK_RCVMARK, /* Receive SO_MARK ancillary data with packet */
> +       SOCK_NO_MEM, /* protocol memory limitation happened */
>  };
>
>  #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index a057330d6f59..56e395cb4554 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5047,10 +5047,22 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
>                 if (skb_queue_len(&sk->sk_receive_queue) == 0)
>                         sk_forced_mem_schedule(sk, skb->truesize);

I think you missed this part: we accept at least one packet,
regardless of memory pressure, if the queue is empty.

So your changelog is misleading.

>                 else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
> +                       if (sysctl_tcp_wnd_shrink)

We no longer add global sysctls for TCP. All new sysctls must be per net-ns.

> +                               goto do_wnd_shrink;
> +
>                         reason = SKB_DROP_REASON_PROTO_MEM;
>                         NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
>                         sk->sk_data_ready(sk);
>                         goto drop;
> +do_wnd_shrink:
> +                       if (sock_flag(sk, SOCK_NO_MEM)) {
> +                               NET_INC_STATS(sock_net(sk),
> +                                             LINUX_MIB_TCPRCVQDROP);
> +                               sk->sk_data_ready(sk);
> +                               goto out_of_window;
> +                       }
> +                       sk_forced_mem_schedule(sk, skb->truesize);

So now we would accept two packets per TCP socket, and yet EPOLLIN
will not be sent in time?

Packets can consume about 45*4K each; I do not think it is wise to
double receive queue sizes.

What you want instead is simply to send EPOLLIN sooner (when the first
packet is queued instead of when the second packet is dropped)
by changing sk_forced_mem_schedule() a bit.

This might matter for applications using SO_RCVLOWAT, but not for
other applications.
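For readers less familiar with SO_RCVLOWAT, which the review above refers to: it is the per-socket low-water mark that tells the kernel how many bytes must be queued before the socket is reported readable. A minimal sketch of how an application sets it with the standard sockets API follows; the helper name `set_rcvlowat()` is hypothetical, and the value the kernel reports back may be capped on some systems.

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Hypothetical helper: set the receive low-water mark on fd and return
 * the value the kernel reports afterwards, or -1 on error.
 */
static int set_rcvlowat(int fd, int bytes)
{
	int got = 0;
	socklen_t len = sizeof(got);

	assert(bytes > 0);

	if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &bytes, sizeof(bytes)) < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &got, &len) < 0)
		return -1;
	return got;
}
```

An application that sets a large low-water mark is not woken until that much data is queued, which is why the timing of the readiness notification matters under memory pressure.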
On Wed, May 17, 2023 at 10:45 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, May 17, 2023 at 2:42 PM <menglong8.dong@gmail.com> wrote:
[...]
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index a057330d6f59..56e395cb4554 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -5047,10 +5047,22 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
> >                 if (skb_queue_len(&sk->sk_receive_queue) == 0)
> >                         sk_forced_mem_schedule(sk, skb->truesize);
>
> I think you missed this part : We accept at least one packet,
> regardless of memory pressure,
> if the queue is empty.
>
> So your changelog is misleading.

Sorry that I didn't describe the problem clearly enough. The problem
shows up in two cases.

Case 1:

The tcp_mem[2] limit causes packet drops. In some cases, applications
may not read the data in the socket receive queue quickly enough. In my
case, the application calls recv() every 5 minutes, and there are a lot
of such sockets. The tcp_mem[2] limit can be hit easily in such a case.
Once this happens, the skb is dropped (the receive queue is not empty),
and the sender retransmits the skb until timeout and the connection
breaks.

Case 2:

The sender keeps sending small packets and makes the rec_buf full.
Meanwhile, the window is not zero, so the sender keeps retransmitting
until timeout, as the skb is dropped by the receiver.

> >                 else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
> > +                       if (sysctl_tcp_wnd_shrink)
>
> We no longer add global sysctls for TCP. All new sysctls must per net-ns.
>
> > +                               goto do_wnd_shrink;
> > +
> >                         reason = SKB_DROP_REASON_PROTO_MEM;
> >                         NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
> >                         sk->sk_data_ready(sk);
> >                         goto drop;
> > +do_wnd_shrink:
> > +                       if (sock_flag(sk, SOCK_NO_MEM)) {
> > +                               NET_INC_STATS(sock_net(sk),
> > +                                             LINUX_MIB_TCPRCVQDROP);
> > +                               sk->sk_data_ready(sk);
> > +                               goto out_of_window;
> > +                       }
> > +                       sk_forced_mem_schedule(sk, skb->truesize);
>
> So now we would accept two packets per TCP socket, and yet EPOLLIN
> will not be sent in time ?
>
> packets can consume about 45*4K each, I do not think it is wise to
> double receive queue sizes.
>

What we want to do here is to send an ACK with a zero window. It may
not be necessary to force the reception of new data here, but to stay
consistent with the logic of 'tcp_may_update_window()', only a newer
'ack' in an ACK packet can shrink the window. If we don't receive new
data and send a zero-window ACK directly here, it will be weird, as the
previous ACK with the same 'seq' and 'ack' has a non-zero window.

Thanks!
Menglong Dong

> What you want instead is simply to send EPOLLIN sooner (when the first
> packet is queued instead when the second packet is dropped)
> by changing sk_forced_mem_schedule() a bit.
>
> This might matter for applications using SO_RCVLOWAT, but not for
> other applications.
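The argument about 'tcp_may_update_window()' in the reply above can be illustrated with a small user-space model. This is a sketch based on one reading of that function in net/ipv4/tcp_input.c around this kernel version, so treat it as an assumption and check the tree; `struct model_snd`, `seq_after()`, `model_may_update_window()` and `model_example()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a is after b" comparison for 32-bit sequence numbers. */
static bool seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Hypothetical stand-in for the sender-side fields of struct tcp_sock. */
struct model_snd {
	uint32_t snd_una; /* oldest unacknowledged sequence number */
	uint32_t snd_wl1; /* sequence number of the last window update */
	uint32_t snd_wnd; /* window the peer last advertised */
};

/*
 * The sender accepts a new window if the ACK acknowledges new data, if
 * the segment carrying it is newer, or if it only enlarges the window.
 */
static bool model_may_update_window(const struct model_snd *tp, uint32_t ack,
				    uint32_t ack_seq, uint32_t nwin)
{
	return seq_after(ack, tp->snd_una) ||
	       seq_after(ack_seq, tp->snd_wl1) ||
	       (ack_seq == tp->snd_wl1 && nwin > tp->snd_wnd);
}

/* Usage: the two kinds of ACK discussed in the thread. */
void model_example(void)
{
	struct model_snd tp = { .snd_una = 1000, .snd_wl1 = 500, .snd_wnd = 4096 };

	/* Same 'seq' and 'ack' as before, window 0: the sender ignores it. */
	assert(!model_may_update_window(&tp, 1000, 500, 0));
	/* ACK that acknowledges new data: the zero window is accepted. */
	assert(model_may_update_window(&tp, 1001, 500, 0));
}
```

Under this model, a zero-window ACK that repeats the previous 'seq' and 'ack' would not shrink the sender's window, which matches the stated reason for accepting one more packet before advertising the zero window.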
On Wed, May 17, 2023 at 10:45 PM Eric Dumazet <edumazet@google.com> wrote:
[...]
> What you want instead is simply to send EPOLLIN sooner (when the first
> packet is queued instead when the second packet is dropped)
> by changing sk_forced_mem_schedule() a bit.
>
> This might matter for applications using SO_RCVLOWAT, but not for
> other applications.

To be clearer, what I am talking about here is not sending EPOLLIN
sooner, but making a TCP connection that has a "hung" receiver and is
under TCP protocol memory pressure enter the 0-probe (zero-window
probe) state. This commit is the first step: make the receiver shrink
the window by sending a zero-window ACK.

Thanks!
Menglong Dong
diff --git a/include/net/sock.h b/include/net/sock.h
index 5edf0038867c..90db8a1d7f31 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -957,6 +957,7 @@ enum sock_flags {
 	SOCK_XDP, /* XDP is attached */
 	SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
 	SOCK_RCVMARK, /* Receive SO_MARK ancillary data with packet */
+	SOCK_NO_MEM, /* protocol memory limitation happened */
 };
 
 #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a057330d6f59..56e395cb4554 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5047,10 +5047,22 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 		if (skb_queue_len(&sk->sk_receive_queue) == 0)
 			sk_forced_mem_schedule(sk, skb->truesize);
 		else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
+			if (sysctl_tcp_wnd_shrink)
+				goto do_wnd_shrink;
+
 			reason = SKB_DROP_REASON_PROTO_MEM;
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
 			sk->sk_data_ready(sk);
 			goto drop;
+do_wnd_shrink:
+			if (sock_flag(sk, SOCK_NO_MEM)) {
+				NET_INC_STATS(sock_net(sk),
+					      LINUX_MIB_TCPRCVQDROP);
+				sk->sk_data_ready(sk);
+				goto out_of_window;
+			}
+			sk_forced_mem_schedule(sk, skb->truesize);
+			sock_set_flag(sk, SOCK_NO_MEM);
 		}
 
 		eaten = tcp_queue_rcv(sk, skb, &fragstolen);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index cfe128b81a01..21dc4f7e0a12 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -300,6 +300,13 @@ static u16 tcp_select_window(struct sock *sk)
 		NET_INC_STATS(sock_net(sk),
 			      LINUX_MIB_TCPFROMZEROWINDOWADV);
 	}
 
+	if (sock_flag(sk, SOCK_NO_MEM)) {
+		if (sk_memory_allocated(sk) < sk_prot_mem_limits(sk, 2))
+			sock_reset_flag(sk, SOCK_NO_MEM);
+		else
+			new_win = 0;
+	}
+
 	return new_win;
 }