Message ID | 79b1009812b753c3a82d09271c4d655d644d37a6.1669036433.git.bcodding@redhat.com |
---|---|
State | New |
Headers |
From: Benjamin Coddington <bcodding@redhat.com>
To: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Subject: [PATCH v1 3/3] net: simplify sk_page_frag
Date: Mon, 21 Nov 2022 08:35:19 -0500
Message-Id: <79b1009812b753c3a82d09271c4d655d644d37a6.1669036433.git.bcodding@redhat.com>
In-Reply-To: <cover.1669036433.git.bcodding@redhat.com>
References: <cover.1669036433.git.bcodding@redhat.com> |
Series |
Stop corrupting socket's task_frag
|
|
Commit Message
Benjamin Coddington
Nov. 21, 2022, 1:35 p.m. UTC
Now that in-kernel socket users that may recurse during reclaim have been
converted to sk_use_task_frag = false, we can have sk_page_frag() simply
check that value.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
include/net/sock.h | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
Comments
On Mon, 2022-11-21 at 08:35 -0500, Benjamin Coddington wrote:
> Now that in-kernel socket users that may recurse during reclaim have been
> converted to sk_use_task_frag = false, we can have sk_page_frag() simply
> check that value.
>
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
> include/net/sock.h | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index ffba9e95470d..fac24c6ee30d 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -2539,19 +2539,14 @@ static inline void sk_stream_moderate_sndbuf(struct sock *sk)
>   * Both direct reclaim and page faults can nest inside other
>   * socket operations and end up recursing into sk_page_frag()
>   * while it's already in use: explicitly avoid task page_frag
> - * usage if the caller is potentially doing any of them.
> - * This assumes that page fault handlers use the GFP_NOFS flags or
> - * explicitly disable sk_use_task_frag.
> + * when users disable sk_use_task_frag.
>   *
>   * Return: a per task page_frag if context allows that,
>   * otherwise a per socket one.
>   */
>  static inline struct page_frag *sk_page_frag(struct sock *sk)
>  {
> -	if (sk->sk_use_task_frag &&
> -	    (sk->sk_allocation & (__GFP_DIRECT_RECLAIM | __GFP_MEMALLOC |
> -				  __GFP_FS)) ==
> -	    (__GFP_DIRECT_RECLAIM | __GFP_FS))
> +	if (sk->sk_use_task_frag)
>  		return &current->task_frag;
>
>  	return &sk->sk_frag;

To make the above as safe as possible, I think we should double-check the in-kernel users that explicitly set sk_allocation to GFP_ATOMIC, as that has the side effect of disabling the task_frag usage, too.

Patch 2/3 already catches some of these users, and we can safely leave a few others alone (specifically l2tp, fou and inet_ctl_sock_create()). Even wireguard and tls look safe IMHO. So the only left-over should be espintcp; I suggest updating patch 2/3 to clear sk_use_task_frag in espintcp_init_sk() as well.

Other than that LGTM.

Cheers,

Paolo
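The GFP_ATOMIC concern raised in the review can be illustrated with a small userspace sketch. It is not kernel code: struct mock_sock, the MOCK_GFP_* bit values, and both helper names are hypothetical stand-ins invented for this example, and only the way the masks are composed is meant to mirror the kernel (GFP_ATOMIC carrying no direct-reclaim bit, GFP_NOFS lacking the FS bit). Under those assumptions, the old predicate rejects the task frag for a GFP_ATOMIC socket even when sk_use_task_frag is set, while the simplified predicate accepts it unless the flag has been cleared.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock flag bits: the numeric values are made up for this sketch and do
 * not match the kernel's; only the composition of the masks matters. */
#define MOCK_GFP_DIRECT_RECLAIM 0x01u
#define MOCK_GFP_KSWAPD_RECLAIM 0x02u
#define MOCK_GFP_IO             0x04u
#define MOCK_GFP_FS             0x08u
#define MOCK_GFP_MEMALLOC       0x10u
#define MOCK_GFP_HIGH           0x20u

#define MOCK_GFP_KERNEL (MOCK_GFP_DIRECT_RECLAIM | MOCK_GFP_KSWAPD_RECLAIM | \
                         MOCK_GFP_IO | MOCK_GFP_FS)
#define MOCK_GFP_NOFS   (MOCK_GFP_DIRECT_RECLAIM | MOCK_GFP_KSWAPD_RECLAIM | \
                         MOCK_GFP_IO)
#define MOCK_GFP_ATOMIC (MOCK_GFP_HIGH | MOCK_GFP_KSWAPD_RECLAIM)

/* Hypothetical stand-in for the two struct sock fields the patch reads. */
struct mock_sock {
    unsigned int sk_allocation;
    bool sk_use_task_frag;
};

/* Pre-patch condition: the flag must be set AND the allocation mask must
 * allow direct reclaim and FS while not carrying MEMALLOC. */
bool old_picks_task_frag(const struct mock_sock *sk)
{
    const unsigned int mask = MOCK_GFP_DIRECT_RECLAIM | MOCK_GFP_MEMALLOC |
                              MOCK_GFP_FS;

    return sk->sk_use_task_frag &&
           (sk->sk_allocation & mask) ==
           (MOCK_GFP_DIRECT_RECLAIM | MOCK_GFP_FS);
}

/* Post-patch condition: only the explicit flag is consulted. */
bool new_picks_task_frag(const struct mock_sock *sk)
{
    return sk->sk_use_task_frag;
}

/* Returns true when the two predicates disagree for a given socket,
 * i.e. when the patch would change which page_frag is selected. */
bool patch_changes_choice(const struct mock_sock *sk)
{
    return old_picks_task_frag(sk) != new_picks_task_frag(sk);
}
```

In this model, a socket with sk_allocation set to MOCK_GFP_ATOMIC or MOCK_GFP_NOFS and sk_use_task_frag left at true is exactly the case where patch_changes_choice() returns true, which is why the review asks for such users to clear the flag explicitly.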
diff --git a/include/net/sock.h b/include/net/sock.h
index ffba9e95470d..fac24c6ee30d 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2539,19 +2539,14 @@ static inline void sk_stream_moderate_sndbuf(struct sock *sk)
  * Both direct reclaim and page faults can nest inside other
  * socket operations and end up recursing into sk_page_frag()
  * while it's already in use: explicitly avoid task page_frag
- * usage if the caller is potentially doing any of them.
- * This assumes that page fault handlers use the GFP_NOFS flags or
- * explicitly disable sk_use_task_frag.
+ * when users disable sk_use_task_frag.
  *
  * Return: a per task page_frag if context allows that,
  * otherwise a per socket one.
  */
 static inline struct page_frag *sk_page_frag(struct sock *sk)
 {
-	if (sk->sk_use_task_frag &&
-	    (sk->sk_allocation & (__GFP_DIRECT_RECLAIM | __GFP_MEMALLOC |
-				  __GFP_FS)) ==
-	    (__GFP_DIRECT_RECLAIM | __GFP_FS))
+	if (sk->sk_use_task_frag)
 		return &current->task_frag;
 
 	return &sk->sk_frag;