From patchwork Tue May 23 02:59:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Paniakin
X-Patchwork-Id: 97711
From: Andrew Paniakin
CC: Florian Westphal, Pablo Neira Ayuso, Andrew Paniakin, Jozsef Kadlecsik, "David S. Miller"
Subject: [PATCH 4.14] netfilter: nf_tables: fix register ordering
Date: Mon, 22 May 2023 19:59:41 -0700
Message-ID: <20230523025941.1695616-1-apanyaki@amazon.com>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Florian Westphal

commit d209df3e7f7002d9099fdb0f6df0f972b4386a63 upstream

[ We hit the trace described in commit message with the
  kselftest/nft_trans_stress.sh.
  This patch diverges from the upstream one since kernel 4.14 does not
  have the following symbols:
  nft_chain_filter_init, nf_tables_flowtable_notifier ]

We must register nfnetlink ops last, as that exposes nf_tables to
userspace.  Without this, we could theoretically get nfnetlink request
before net->nft state has been initialized.

Fixes: 99633ab29b213 ("netfilter: nf_tables: complete net namespace support")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
[apanyaki: backport to v4.14-stable]
Signed-off-by: Andrew Paniakin
---
[  163.471426] Call Trace:
[  163.474901]  netlink_dump+0x125/0x2d0
[  163.479081]  __netlink_dump_start+0x16a/0x1c0
[  163.483589]  nf_tables_gettable+0x151/0x180 [nf_tables]
[  163.488561]  ? nf_tables_gettable+0x180/0x180 [nf_tables]
[  163.493658]  nfnetlink_rcv_msg+0x222/0x250 [nfnetlink]
[  163.498608]  ? __skb_try_recv_datagram+0x114/0x180
[  163.503359]  ? nfnetlink_net_exit_batch+0x60/0x60 [nfnetlink]
[  163.508590]  netlink_rcv_skb+0x4d/0x130
[  163.512832]  nfnetlink_rcv+0x92/0x780 [nfnetlink]
[  163.517465]  ? netlink_recvmsg+0x202/0x3e0
[  163.521801]  ? __kmalloc_node_track_caller+0x31/0x290
[  163.526635]  ? copy_msghdr_from_user+0xd5/0x150
[  163.531216]  ? __netlink_lookup+0xd0/0x130
[  163.535536]  netlink_unicast+0x196/0x240
[  163.539759]  netlink_sendmsg+0x2da/0x400
[  163.544010]  sock_sendmsg+0x36/0x40
[  163.548030]  SYSC_sendto+0x10e/0x140
[  163.552119]  ? __audit_syscall_entry+0xbc/0x110
[  163.556741]  ? syscall_trace_enter+0x1df/0x2e0
[  163.561315]  ? __audit_syscall_exit+0x231/0x2b0
[  163.565857]  do_syscall_64+0x67/0x110
[  163.569930]  entry_SYSCALL_64_after_hwframe+0x59/0xbe

A reproduction with debug logs clearly shows the nft initialization
issue, exactly as described in the ported patch:

[   22.600051] nft load start
[   22.600858] nf_tables: (c) 2007-2009 Patrick McHardy
[   22.601241] nf_tables_gettable start: ffff888527c10000
[   22.601271] register_pernet_subsys ffffffffa02ba0c0
[   22.601274] netns ops_init ffffffffa02ba0c0 ffffffff821aeec0
[   22.602506] nf_tables_dump_tables: ffff888527c10000
[   22.603187] af_info list init done: ffffffff821aeec0
[   22.604064] nf_tables_dump_tables: afi:           (null)
[   22.604077] BUG: unable to handle kernel
[   22.604820] netns ops_init end ffffffffa02ba0c0 ffffffff821aeec0
[   22.605698] NULL pointer dereference
[   22.606354] netns ops_init ffffffffa02ba0c0 ffff888527c10000

(gdb) p &init_net
$2 = (struct net *) 0xffffffff821aeec0

ffff888527c10000 is the testns1 namespace. The log shows
nf_tables_dump_tables running for testns1 before the netns ops_init for
that namespace has been called.

To reproduce this problem and test the fix, I scripted the following
steps:
- start a QEMU VM
- run the nft_trans_stress.sh test
- check the dmesg logs for a NULL pointer dereference
- reboot via QMP and repeat

I also tested the fix with our kernel regression tests (including
kselftest).
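
For readers less familiar with the init path, the ordering argument can
be sketched in user space. This is only an illustration, not kernel
code: the three steps stand in for register_pernet_subsys(),
nf_tables_core_module_init() and nfnetlink_subsys_register(), and every
name below (do_register, simulate_request, demo_module_init, and so on)
is hypothetical. The assumption it encodes is the one from the commit
message: a request may arrive as soon as the userspace-facing step is
registered, so that step has to come last and be unwound first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three registration steps in the patch. */
enum step { STEP_PERNET, STEP_CORE, STEP_NFNETLINK, STEP_COUNT };

static bool registered[STEP_COUNT];

/* Register one step; returns -1 if this is the step chosen to fail. */
int do_register(enum step s, int fail_at)
{
	if ((int)s == fail_at)
		return -1;
	registered[s] = true;
	return 0;
}

void do_unregister(enum step s)
{
	registered[s] = false;
}

/* Number of steps currently registered. */
int registered_count(void)
{
	int i, n = 0;

	for (i = 0; i < STEP_COUNT; i++)
		n += registered[i] ? 1 : 0;
	return n;
}

/*
 * A request can only arrive once the userspace-facing step is registered.
 * When it does, the per-namespace state must already exist.
 */
void simulate_request(void)
{
	if (registered[STEP_NFNETLINK])
		assert(registered[STEP_PERNET]);
}

/* Same order as the patched init: pernet state, core, netlink front end. */
int demo_module_init(int fail_at)
{
	int err;

	err = do_register(STEP_PERNET, fail_at);
	if (err < 0)
		goto err1;
	simulate_request();

	err = do_register(STEP_CORE, fail_at);
	if (err < 0)
		goto err2;
	simulate_request();

	/* must be last: this is what exposes the module to userspace */
	err = do_register(STEP_NFNETLINK, fail_at);
	if (err < 0)
		goto err3;
	simulate_request();

	return 0;
err3:
	do_unregister(STEP_CORE);
err2:
	do_unregister(STEP_PERNET);
err1:
	return err;
}

/* Tear down in reverse order, closing the userspace-facing step first. */
void demo_module_exit(void)
{
	do_unregister(STEP_NFNETLINK);
	do_unregister(STEP_CORE);
	do_unregister(STEP_PERNET);
}
```

Calling demo_module_init(-1) registers all three steps, while passing
STEP_CORE or STEP_NFNETLINK as fail_at makes that step fail and should
leave nothing registered afterwards, which is the property the err2/err3/err4
labels in the patch are meant to preserve.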
 net/netfilter/nf_tables_api.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index c683a45b8ae53..65495b528290b 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -6032,18 +6032,25 @@ static int __init nf_tables_module_init(void)
 		goto err1;
 	}
 
-	err = nf_tables_core_module_init();
+	err = register_pernet_subsys(&nf_tables_net_ops);
 	if (err < 0)
 		goto err2;
 
-	err = nfnetlink_subsys_register(&nf_tables_subsys);
+	err = nf_tables_core_module_init();
 	if (err < 0)
 		goto err3;
 
+	/* must be last */
+	err = nfnetlink_subsys_register(&nf_tables_subsys);
+	if (err < 0)
+		goto err4;
+
 	pr_info("nf_tables: (c) 2007-2009 Patrick McHardy \n");
-	return register_pernet_subsys(&nf_tables_net_ops);
-err3:
+	return err;
+err4:
 	nf_tables_core_module_exit();
+err3:
+	unregister_pernet_subsys(&nf_tables_net_ops);
 err2:
 	kfree(info);
 err1: