From patchwork Tue May 23 02:35:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Paniakin
X-Patchwork-Id: 97702
From: Andrew Paniakin
CC: Florian Westphal, Pablo Neira Ayuso, Andrew Paniakin, Jozsef Kadlecsik, "David S. Miller"
Subject: [PATCH] netfilter: nf_tables: fix register ordering
Date: Mon, 22 May 2023 19:35:14 -0700
Message-ID: <20230523023514.1672418-1-apanyaki@amazon.com>
X-Mailer: git-send-email 2.25.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Florian Westphal

commit d209df3e7f7002d9099fdb0f6df0f972b4386a63 upstream

[ We hit the trace
described in the commit message with kselftest/nft_trans_stress.sh.
This patch diverges from the upstream one since kernel 4.14 does not
have the following symbols:
nft_chain_filter_init, nf_tables_flowtable_notifier ]

We must register nfnetlink ops last, as that exposes nf_tables to
userspace. Without this, we could theoretically get an nfnetlink request
before net->nft state has been initialized.

Fixes: 99633ab29b213 ("netfilter: nf_tables: complete net namespace support")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
[apanyaki: backport to v4.14-stable]
Signed-off-by: Andrew Paniakin
---
[ 163.471426] Call Trace:
[ 163.474901] netlink_dump+0x125/0x2d0
[ 163.479081] __netlink_dump_start+0x16a/0x1c0
[ 163.483589] nf_tables_gettable+0x151/0x180 [nf_tables]
[ 163.488561] ? nf_tables_gettable+0x180/0x180 [nf_tables]
[ 163.493658] nfnetlink_rcv_msg+0x222/0x250 [nfnetlink]
[ 163.498608] ? __skb_try_recv_datagram+0x114/0x180
[ 163.503359] ? nfnetlink_net_exit_batch+0x60/0x60 [nfnetlink]
[ 163.508590] netlink_rcv_skb+0x4d/0x130
[ 163.512832] nfnetlink_rcv+0x92/0x780 [nfnetlink]
[ 163.517465] ? netlink_recvmsg+0x202/0x3e0
[ 163.521801] ? __kmalloc_node_track_caller+0x31/0x290
[ 163.526635] ? copy_msghdr_from_user+0xd5/0x150
[ 163.531216] ? __netlink_lookup+0xd0/0x130
[ 163.535536] netlink_unicast+0x196/0x240
[ 163.539759] netlink_sendmsg+0x2da/0x400
[ 163.544010] sock_sendmsg+0x36/0x40
[ 163.548030] SYSC_sendto+0x10e/0x140
[ 163.552119] ? __audit_syscall_entry+0xbc/0x110
[ 163.556741] ? syscall_trace_enter+0x1df/0x2e0
[ 163.561315] ? __audit_syscall_exit+0x231/0x2b0
[ 163.565857] do_syscall_64+0x67/0x110
[ 163.569930] entry_SYSCALL_64_after_hwframe+0x59/0xbe

A reproduction with debug logs clearly shows the nft initialization
issue, exactly as described in the ported patch:

[ 22.600051] nft load start
[ 22.600858] nf_tables: (c) 2007-2009 Patrick McHardy
[ 22.601241] nf_tables_gettable start: ffff888527c10000
[ 22.601271] register_pernet_subsys ffffffffa02ba0c0
[ 22.601274] netns ops_init ffffffffa02ba0c0 ffffffff821aeec0
[ 22.602506] nf_tables_dump_tables: ffff888527c10000
[ 22.603187] af_info list init done: ffffffff821aeec0
[ 22.604064] nf_tables_dump_tables: afi: (null)
[ 22.604077] BUG: unable to handle kernel
[ 22.604820] netns ops_init end ffffffffa02ba0c0 ffffffff821aeec0
[ 22.605698] NULL pointer dereference
[ 22.606354] netns ops_init ffffffffa02ba0c0 ffff888527c10000

(gdb) p &init_net
$2 = (struct net *) 0xffffffff821aeec0

ffff888527c10000 is the testns1 namespace. The dump for that namespace
runs before its ops_init call, so its afi pointer is still NULL.

To reproduce this problem and test the fix, I scripted the following
steps:
- start a QEMU VM
- run the nft_trans_stress.sh test
- check the dmesg logs for a NULL pointer dereference
- reboot via QMP and repeat

I also tested the fix with our kernel regression tests (including
kselftest).

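For illustration only (not part of the patch): a minimal user-space
sketch of the ordering rule described above. Per-netns state is set up
first, the userspace-facing step is registered last, and the error path
unwinds in reverse. All names here (init_state, do_register,
do_unregister, module_init_sketch and the STEP_* values) are
hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three registration steps in the patch. */
enum step { STEP_PERNET, STEP_CORE, STEP_NFNETLINK, STEP_COUNT };

struct init_state {
	int registered[STEP_COUNT]; /* 1 while the step is registered */
	int fail_at;                /* step that should fail, or -1 for none */
	int exposed_early;          /* set if userspace was exposed too soon */
};

static int do_register(struct init_state *s, enum step st)
{
	if (s->fail_at == (int)st)
		return -1;
	/* Once the nfnetlink step is done, requests can arrive; the
	 * per-netns state has to exist by then. */
	if (st == STEP_NFNETLINK && !s->registered[STEP_PERNET])
		s->exposed_early = 1;
	s->registered[st] = 1;
	return 0;
}

static void do_unregister(struct init_state *s, enum step st)
{
	/* The unwind path must only undo steps that actually succeeded. */
	assert(s->registered[st]);
	s->registered[st] = 0;
}

static int module_init_sketch(struct init_state *s)
{
	int err;

	err = do_register(s, STEP_PERNET);
	if (err < 0)
		goto err1;

	err = do_register(s, STEP_CORE);
	if (err < 0)
		goto err2;

	/* must be last */
	err = do_register(s, STEP_NFNETLINK);
	if (err < 0)
		goto err3;

	return 0;
err3:
	do_unregister(s, STEP_CORE);
err2:
	do_unregister(s, STEP_PERNET);
err1:
	return err;
}
```

Calling module_init_sketch() with fail_at set to -1 leaves all three
steps registered and exposed_early clear. With fail_at set to
STEP_NFNETLINK it returns -1 with nothing left registered, which is the
property the reordered err3/err4 labels in the patch are meant to give.
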
 net/netfilter/nf_tables_api.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index c683a45b8ae53..65495b528290b 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -6032,18 +6032,25 @@ static int __init nf_tables_module_init(void)
 		goto err1;
 	}
 
-	err = nf_tables_core_module_init();
+	err = register_pernet_subsys(&nf_tables_net_ops);
 	if (err < 0)
 		goto err2;
 
-	err = nfnetlink_subsys_register(&nf_tables_subsys);
+	err = nf_tables_core_module_init();
 	if (err < 0)
 		goto err3;
 
+	/* must be last */
+	err = nfnetlink_subsys_register(&nf_tables_subsys);
+	if (err < 0)
+		goto err4;
+
 	pr_info("nf_tables: (c) 2007-2009 Patrick McHardy \n");
-	return register_pernet_subsys(&nf_tables_net_ops);
-err3:
+	return err;
+err4:
 	nf_tables_core_module_exit();
+err3:
+	unregister_pernet_subsys(&nf_tables_net_ops);
 err2:
 	kfree(info);
 err1: