Message ID | 20221129025249.463833-1-yin31149@gmail.com |
---|---|
State | New |
Headers |
From: Hawkins Jiawei <yin31149@gmail.com> To: yin31149@gmail.com, Jamal Hadi Salim <jhs@mojatatu.com>, Cong Wang <xiyou.wangcong@gmail.com>, Jiri Pirko <jiri@resnulli.us>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com> Cc: 18801353760@163.com, syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com, syzkaller-bugs@googlegroups.com, Cong Wang <cong.wang@bytedance.com>, Dmitry Vyukov <dvyukov@google.com>, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v3] net: sched: fix memory leak in tcindex_set_parms Date: Tue, 29 Nov 2022 10:52:49 +0800 Message-Id: <20221129025249.463833-1-yin31149@gmail.com> X-Mailer: git-send-email 2.34.1 List-ID: <linux-kernel.vger.kernel.org> |
Series | [v3] net: sched: fix memory leak in tcindex_set_parms |
Commit Message
Hawkins Jiawei
Nov. 29, 2022, 2:52 a.m. UTC
Syzkaller reports a memory leak as follows:
====================================
BUG: memory leak
unreferenced object 0xffff88810c287f00 (size 256):
  comm "syz-executor105", pid 3600, jiffies 4294943292 (age 12.990s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff814cf9f0>] kmalloc_trace+0x20/0x90 mm/slab_common.c:1046
    [<ffffffff839c9e07>] kmalloc include/linux/slab.h:576 [inline]
    [<ffffffff839c9e07>] kmalloc_array include/linux/slab.h:627 [inline]
    [<ffffffff839c9e07>] kcalloc include/linux/slab.h:659 [inline]
    [<ffffffff839c9e07>] tcf_exts_init include/net/pkt_cls.h:250 [inline]
    [<ffffffff839c9e07>] tcindex_set_parms+0xa7/0xbe0 net/sched/cls_tcindex.c:342
    [<ffffffff839caa1f>] tcindex_change+0xdf/0x120 net/sched/cls_tcindex.c:553
    [<ffffffff8394db62>] tc_new_tfilter+0x4f2/0x1100 net/sched/cls_api.c:2147
    [<ffffffff8389e91c>] rtnetlink_rcv_msg+0x4dc/0x5d0 net/core/rtnetlink.c:6082
    [<ffffffff839eba67>] netlink_rcv_skb+0x87/0x1d0 net/netlink/af_netlink.c:2540
    [<ffffffff839eab87>] netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
    [<ffffffff839eab87>] netlink_unicast+0x397/0x4c0 net/netlink/af_netlink.c:1345
    [<ffffffff839eb046>] netlink_sendmsg+0x396/0x710 net/netlink/af_netlink.c:1921
    [<ffffffff8383e796>] sock_sendmsg_nosec net/socket.c:714 [inline]
    [<ffffffff8383e796>] sock_sendmsg+0x56/0x80 net/socket.c:734
    [<ffffffff8383eb08>] ____sys_sendmsg+0x178/0x410 net/socket.c:2482
    [<ffffffff83843678>] ___sys_sendmsg+0xa8/0x110 net/socket.c:2536
    [<ffffffff838439c5>] __sys_sendmmsg+0x105/0x330 net/socket.c:2622
    [<ffffffff83843c14>] __do_sys_sendmmsg net/socket.c:2651 [inline]
    [<ffffffff83843c14>] __se_sys_sendmmsg net/socket.c:2648 [inline]
    [<ffffffff83843c14>] __x64_sys_sendmmsg+0x24/0x30 net/socket.c:2648
    [<ffffffff84605fd5>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<ffffffff84605fd5>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
    [<ffffffff84800087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
====================================

The kernel uses tcindex_change() to change the properties of an
existing filter. While doing so, it uses tcindex_alloc_perfect_hash()
to allocate new filter results and tcindex_filter_result_init()
to clear the old filter result.

The problem is that the kernel clears the old filter result
without destroying its tcf_exts structure, which triggers the
memory leak above.

Considering that there already exists a tc_filter_wq workqueue
that destroys the old tcindex_data via tcindex_partial_destroy_work()
at the end of tcindex_set_parms(), this patch fixes the memory
leak by removing the code that clears the old filter result
and delegating that work to the tc_filter_wq workqueue.

[Thanks to the suggestions from Jakub Kicinski, Cong Wang, Paolo Abeni
and Dmitry Vyukov]

Fixes: b9a24bb76bf6 ("net_sched: properly handle failure case of tcf_exts_init()")
Link: https://lore.kernel.org/all/0000000000001de5c505ebc9ec59@google.com/
Reported-by: syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com
Tested-by: syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com
Cc: Cong Wang <cong.wang@bytedance.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
---
v3:
  - refactor the commit message
  - delegate the tcf_exts_destroy() to tc_filter_wq workqueue,
    suggested by Paolo Abeni and Dmitry Vyukov

v2: https://lore.kernel.org/all/20221113170507.8205-1-yin31149@gmail.com/
v1: https://lore.kernel.org/all/20221031060835.11722-1-yin31149@gmail.com/

 net/sched/cls_tcindex.c | 8 --------
 1 file changed, 8 deletions(-)
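The leak pattern described in the commit message can be sketched in userspace C. This is only a model, not kernel code: `model_exts`, `model_result`, `live_allocs` and every function below are hypothetical stand-ins, and the sketch assumes that the init helper wipes the structure and then allocates a fresh actions array, which is how the commit message describes the clearing step.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for tcf_exts and tcindex_filter_result. */
struct model_exts {
	void *actions;		/* heap buffer owned by this structure */
};

struct model_result {
	struct model_exts exts;
	int classid;
};

static int live_allocs;		/* buffers allocated and not yet freed */

static int model_exts_init(struct model_exts *e)
{
	e->actions = calloc(32, sizeof(void *));
	if (!e->actions)
		return -1;
	live_allocs++;
	return 0;
}

static void model_exts_destroy(struct model_exts *e)
{
	if (e->actions) {
		free(e->actions);
		e->actions = NULL;
		live_allocs--;
	}
	assert(live_allocs >= 0);
}

/* Assumed shape of the problematic step: wipe, then re-init.
 * The previous e->actions pointer is overwritten without being freed. */
static int model_result_init_leaky(struct model_result *r)
{
	memset(r, 0, sizeof(*r));
	return model_exts_init(&r->exts);
}

/* Same operation, but the old buffer is released first. */
static int model_result_init_safe(struct model_result *r)
{
	model_exts_destroy(&r->exts);
	memset(r, 0, sizeof(*r));
	return model_exts_init(&r->exts);
}

/* Returns how many buffers became unreachable, or -1 on allocation failure. */
static int count_leaked(int use_safe_variant)
{
	struct model_result r = { { NULL }, 0 };
	int err;

	live_allocs = 0;
	if (model_exts_init(&r.exts) < 0)
		return -1;
	err = use_safe_variant ? model_result_init_safe(&r)
			       : model_result_init_leaky(&r);
	if (err < 0)
		return -1;
	model_exts_destroy(&r.exts);	/* frees only the newest buffer */
	return live_allocs;
}
```

Calling `count_leaked(0)` leaves one buffer unaccounted for, while `count_leaked(1)` leaves none. The patch itself takes a different route from the "safe" variant here: it drops the clearing step entirely and relies on the deferred destroy work.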
Comments
On Tue, 2022-11-29 at 10:52 +0800, Hawkins Jiawei wrote:
> [...]

The patch looks correct to me, but we are very late in this release
cycle, and I fear there is a chance of introducing some regression. The
issue addressed here has been present for quite some time, so I suggest
postponing this fix to the beginning of the next release cycle.

Please repost this patch after 6.1 is released, thanks! (And feel
free to add my Acked-by).

Paolo
On Thu, 1 Dec 2022 at 18:24, Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Tue, 2022-11-29 at 10:52 +0800, Hawkins Jiawei wrote:
> > [...]
>
> The patch looks correct to me, but we are very late in this release
> cycle, and I fear there is a chance of introducing some regression. The
> issue addressed here has been present for quite some time, so I suggest
> postponing this fix to the beginning of the next release cycle.
>
> Please repost this patch after 6.1 is released, thanks! (And feel
> free to add my Acked-by).

Thanks for your review. I will retest this patch after 6.1 is released
and repost it if it still works fine.

>
> Paolo
>
On Tue, Nov 29, 2022 at 10:52:49AM +0800, Hawkins Jiawei wrote:
> The kernel uses tcindex_change() to change the properties of an
> existing filter. While doing so, it uses tcindex_alloc_perfect_hash()
> to allocate new filter results and tcindex_filter_result_init()
> to clear the old filter result.
>
> The problem is that the kernel clears the old filter result
> without destroying its tcf_exts structure, which triggers the
> memory leak above.
>
> Considering that there already exists a tc_filter_wq workqueue
> that destroys the old tcindex_data via tcindex_partial_destroy_work()
> at the end of tcindex_set_parms(), this patch fixes the memory
> leak by removing the code that clears the old filter result
> and delegating that work to the tc_filter_wq workqueue.

Hmm?? tcindex_partial_destroy_work() destroys 'oldp', which is
different from 'old_r'. I mean, you seem to be assuming that struct
tcindex_filter_result always comes from struct tcindex_data, which is
not true. Check the following tcindex_lookup(), which can also retrieve
the tcindex_filter_result from struct tcindex_filter.

static struct tcindex_filter_result *tcindex_lookup(struct tcindex_data *p,
						    u16 key)
{
	if (p->perfect) {
		struct tcindex_filter_result *f = p->perfect + key;

		return tcindex_filter_is_set(f) ? f : NULL;
	} else if (p->h) {
		struct tcindex_filter __rcu **fp;
		struct tcindex_filter *f;

		fp = &p->h[key % p->hash];
		for (f = rcu_dereference_bh_rtnl(*fp);
		     f;
		     fp = &f->next, f = rcu_dereference_bh_rtnl(*fp))
			if (f->key == key)
				return &f->result;
	}

	return NULL;
}

> diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c
> index 1c9eeb98d826..3f4e7a6cdd96 100644
> --- a/net/sched/cls_tcindex.c
> +++ b/net/sched/cls_tcindex.c
> @@ -478,14 +478,6 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
>  		tcf_bind_filter(tp, &cr, base);
>  	}
>
> -	if (old_r && old_r != r) {
> -		err = tcindex_filter_result_init(old_r, cp, net);
> -		if (err < 0) {
> -			kfree(f);
> -			goto errout_alloc;
> -		}
> -	}
> -

Even if your analysis above is correct, 'old_r' now becomes unused (set
but not used), so I think you should get a compiler warning.

Thanks.
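The point about the two possible sources of a filter result can be illustrated with a simplified userspace model of the lookup quoted above. This is a sketch under stated assumptions: the `m_` types are hypothetical, the `set` flag stands in for tcindex_filter_is_set(), RCU is omitted, and the caller is assumed to guarantee `key < hash` in the perfect case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified versions of the kernel structures. */
struct m_result {
	int set;			/* stand-in for tcindex_filter_is_set() */
};

struct m_filter {
	unsigned short key;
	struct m_result result;		/* result embedded in the hash node */
	struct m_filter *next;
};

struct m_data {
	struct m_result *perfect;	/* array indexed by key, or NULL */
	struct m_filter **h;		/* hash buckets, or NULL */
	unsigned int hash;		/* number of entries or buckets */
};

/* Returns a pointer into p->perfect, or into a node hanging off p->h. */
static struct m_result *m_lookup(struct m_data *p, unsigned short key)
{
	if (p->perfect) {
		struct m_result *f;

		assert(key < p->hash);	/* model-only bounds check */
		f = p->perfect + key;
		return f->set ? f : NULL;
	} else if (p->h) {
		struct m_filter *f;

		for (f = p->h[key % p->hash]; f; f = f->next)
			if (f->key == key)
				return &f->result;
	}

	return NULL;
}
```

In the first branch the returned pointer lives inside the `perfect` array owned by the data structure; in the second it lives inside a separately allocated node, so freeing the array alone would not touch it.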
On Sun, 4 Dec 2022 at 04:19, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Tue, Nov 29, 2022 at 10:52:49AM +0800, Hawkins Jiawei wrote:
> > The kernel uses tcindex_change() to change the properties of an
> > existing filter. While doing so, it uses tcindex_alloc_perfect_hash()
> > to allocate new filter results and tcindex_filter_result_init()
> > to clear the old filter result.
> >
> > The problem is that the kernel clears the old filter result
> > without destroying its tcf_exts structure, which triggers the
> > memory leak above.
> >
> > Considering that there already exists a tc_filter_wq workqueue
> > that destroys the old tcindex_data via tcindex_partial_destroy_work()
> > at the end of tcindex_set_parms(), this patch fixes the memory
> > leak by removing the code that clears the old filter result
> > and delegating that work to the tc_filter_wq workqueue.
>
> Hmm?? tcindex_partial_destroy_work() destroys 'oldp', which is
> different from 'old_r'. I mean, you seem to be assuming that struct
> tcindex_filter_result always comes from struct tcindex_data, which is
> not true. Check the following tcindex_lookup(), which can also retrieve
> the tcindex_filter_result from struct tcindex_filter.
>
> static struct tcindex_filter_result *tcindex_lookup(struct tcindex_data *p,
>						      u16 key)
> {
>	if (p->perfect) {
>		struct tcindex_filter_result *f = p->perfect + key;
>
>		return tcindex_filter_is_set(f) ? f : NULL;
>	} else if (p->h) {
>		struct tcindex_filter __rcu **fp;
>		struct tcindex_filter *f;
>
>		fp = &p->h[key % p->hash];
>		for (f = rcu_dereference_bh_rtnl(*fp);
>		     f;
>		     fp = &f->next, f = rcu_dereference_bh_rtnl(*fp))
>			if (f->key == key)
>				return &f->result;
>	}
>
>	return NULL;
> }

Oh, thanks for correcting me!

You are right, I wrongly assumed that struct tcindex_filter_result
always comes from the `perfect` field of struct tcindex_data.

But after reviewing tcindex_set_parms(), I think this patch can still
fix the problem, because the code can reach the part deleted by this
patch only when the `tcindex_filter_result` comes from
`struct tcindex_data`.

To be more specific, the simplified logic of the original
tcindex_set_parms() is as below:

static int
tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
		  u32 handle, struct tcindex_data *p,
		  struct tcindex_filter_result *r, struct nlattr **tb,
		  struct nlattr *est, u32 flags, struct netlink_ext_ack *extack)
{
	...
	if (p->perfect) {
		int i;

		if (tcindex_alloc_perfect_hash(net, cp) < 0)
			goto errout;
		cp->alloc_hash = cp->hash;
		for (i = 0; i < min(cp->hash, p->hash); i++)
			cp->perfect[i].res = p->perfect[i].res;
		balloc = 1;
	}
	cp->h = p->h;

	...

	if (cp->perfect)
		r = cp->perfect + handle;
	else
		r = tcindex_lookup(cp, handle) ? : &new_filter_result;

	if (old_r && old_r != r) {
		err = tcindex_filter_result_init(old_r, cp, net);
		if (err < 0) {
			kfree(f);
			goto errout_alloc;
		}
	}
	...
}

- cp's h field is directly copied from p's h field

- if `old_r` is retrieved from struct tcindex_filter, in other words
from p's h field, then `r` should get the same value
from `tcindex_lookup(cp, handle)`.

- so `old_r == r` is true, and the code will never use
tcindex_filter_result_init() to clear old_r in such a case.

So I think this patch can still fix the memory leak caused by
tcindex_filter_result_init(), but maybe I need to improve my
commit message.

Please correct me if I am wrong.

> > diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c
> > index 1c9eeb98d826..3f4e7a6cdd96 100644
> > --- a/net/sched/cls_tcindex.c
> > +++ b/net/sched/cls_tcindex.c
> > @@ -478,14 +478,6 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
> >  		tcf_bind_filter(tp, &cr, base);
> >  	}
> >
> > -	if (old_r && old_r != r) {
> > -		err = tcindex_filter_result_init(old_r, cp, net);
> > -		if (err < 0) {
> > -			kfree(f);
> > -			goto errout_alloc;
> > -		}
> > -	}
> > -
>
> Even if your analysis above is correct, 'old_r' now becomes unused (set
> but not used), so I think you should get a compiler warning.

Oh, it actually didn't trigger any compiler warning, because
`old_r` is still used in one place, as below:

static int
tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
		  u32 handle, struct tcindex_data *p,
		  struct tcindex_filter_result *r, struct nlattr **tb,
		  struct nlattr *est, u32 flags, struct netlink_ext_ack *extack)
{
	struct tcindex_filter_result new_filter_result, *old_r = r;
	...
	err = tcindex_filter_result_init(&new_filter_result, cp, net);
	if (err < 0)
		goto errout_alloc;
	if (old_r)
		cr = r->res;
	...
}

But `old_r` and `r` have the same value here, so we can just replace
`old_r` with `r` and delete `old_r`, as you suggested.

Thanks for your suggestion!

>
> Thanks.
On Mon, Dec 05, 2022 at 11:19:56PM +0800, Hawkins Jiawei wrote:
> To be more specific, the simplified logic of the original
> tcindex_set_parms() is as below:
>
> static int
> tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
>		    u32 handle, struct tcindex_data *p,
>		    struct tcindex_filter_result *r, struct nlattr **tb,
>		    struct nlattr *est, u32 flags, struct netlink_ext_ack *extack)
> {
>	...
>	if (p->perfect) {
>		int i;
>
>		if (tcindex_alloc_perfect_hash(net, cp) < 0)
>			goto errout;
>		cp->alloc_hash = cp->hash;
>		for (i = 0; i < min(cp->hash, p->hash); i++)
>			cp->perfect[i].res = p->perfect[i].res;
>		balloc = 1;
>	}
>	cp->h = p->h;
>
>	...
>
>	if (cp->perfect)
>		r = cp->perfect + handle;

We can reach here if p->perfect is non-NULL.

>	else
>		r = tcindex_lookup(cp, handle) ? : &new_filter_result;
>
>	if (old_r && old_r != r) {
>		err = tcindex_filter_result_init(old_r, cp, net);
>		if (err < 0) {
>			kfree(f);
>			goto errout_alloc;
>		}
>	}
>	...
> }
>
> - cp's h field is directly copied from p's h field
>
> - if `old_r` is retrieved from struct tcindex_filter, in other words
> from p's h field, then `r` should get the same value
> from `tcindex_lookup(cp, handle)`.

See above, 'r' can be 'cp->perfect + handle', which is newly allocated
and hence different from 'old_r'.

>
> - so `old_r == r` is true, and the code will never use
> tcindex_filter_result_init() to clear old_r in such a case.

Not always.

>
> So I think this patch can still fix the memory leak caused by
> tcindex_filter_result_init(), but maybe I need to improve my
> commit message.
>

I think your patch may introduce other memory leaks, and 'old_r' may
be left obsolete too.

Thanks.
On Sun, 11 Dec 2022 at 05:29, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Mon, Dec 05, 2022 at 11:19:56PM +0800, Hawkins Jiawei wrote:
> > To be more specific, the simplified logic of the original
> > tcindex_set_parms() is as below:
> >
> > static int
> > tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
> >		    u32 handle, struct tcindex_data *p,
> >		    struct tcindex_filter_result *r, struct nlattr **tb,
> >		    struct nlattr *est, u32 flags, struct netlink_ext_ack *extack)
> > {
> >	...
> >	if (p->perfect) {
> >		int i;
> >
> >		if (tcindex_alloc_perfect_hash(net, cp) < 0)
> >			goto errout;
> >		cp->alloc_hash = cp->hash;
> >		for (i = 0; i < min(cp->hash, p->hash); i++)
> >			cp->perfect[i].res = p->perfect[i].res;
> >		balloc = 1;
> >	}
> >	cp->h = p->h;
> >
> >	...
> >
> >	if (cp->perfect)
> >		r = cp->perfect + handle;
>
> We can reach here if p->perfect is non-NULL.
>
> >	else
> >		r = tcindex_lookup(cp, handle) ? : &new_filter_result;
> >
> >	if (old_r && old_r != r) {
> >		err = tcindex_filter_result_init(old_r, cp, net);
> >		if (err < 0) {
> >			kfree(f);
> >			goto errout_alloc;
> >		}
> >	}
> >	...
> > }
> >
> > - cp's h field is directly copied from p's h field
> >
> > - if `old_r` is retrieved from struct tcindex_filter, in other words
> > from p's h field, then `r` should get the same value
> > from `tcindex_lookup(cp, handle)`.
>
> See above, 'r' can be 'cp->perfect + handle', which is newly allocated
> and hence different from 'old_r'.

But if `r` is `cp->perfect + handle`, this means `cp->perfect` is not
NULL. So `p->perfect` should not be NULL either, which means `old_r`
should be `p->perfect + handle`, according to tcindex_lookup(). This
contradicts the assumption that `old_r` is retrieved from p's h field.

>
> >
> > - so `old_r == r` is true, and the code will never use
> > tcindex_filter_result_init() to clear old_r in such a case.
>
> Not always.
>
> >
> > So I think this patch can still fix the memory leak caused by
> > tcindex_filter_result_init(), but maybe I need to improve my
> > commit message.
> >
>
> I think your patch may introduce other memory leaks, and 'old_r' may
> be left obsolete too.

I still think this patch should not introduce any memory leaks.

* If `old_r` is not NULL, it can have only two sources according to
tcindex_lookup(): it is retrieved either from `p->perfect` or from
`p->h`. And if `old_r` is retrieved from `p->h`, this means
`p->perfect` is NULL.

* If `old_r` is retrieved from `p->perfect`, the kernel uses
tcindex_alloc_perfect_hash() to allocate new filter results, and `r`
should be `cp->perfect + handle`, which is newly allocated. So
`r != old_r` in this situation, but the kernel will clean up `old_r`
on the tc_filter_wq workqueue in tcindex_partial_destroy_work(), which
destroys p->perfect. So the kernel doesn't need
tcindex_filter_result_init() to clear the old filter result here, and
there is no memory leak.

* If `old_r` is retrieved from `p->h`, then `p->perfect` is NULL, as
discussed above. Considering that `cp->h` is directly copied from
`p->h`, `r` should get the same value as `old_r` from
tcindex_lookup(). So `r == old_r`, and the code skips the part where
the kernel uses tcindex_filter_result_init() to clear the old filter
result. Removing that part should therefore have no effect in this
situation.

It seems that whether `old_r` is retrieved from `p->h` or from
`p->perfect`, it is okay to directly delete the part where the kernel
uses tcindex_filter_result_init() to clear the old filter result,
without introducing any memory leak. And doing so fixes the memory leak
caused by tcindex_filter_result_init().

As for `old_r` being left obsolete, do you mean that `old_r` becomes
unused (set but not used)? I think we can simply remove `old_r`.

>
> Thanks.
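The case analysis above can be checked against a small userspace model of the pointer comparison. This is only a sketch under stated assumptions: the `c_` types and helpers are hypothetical, the copy step duplicates whole entries rather than just the `res` field, and `old_r` is assumed to come from a lookup on the old data. It models only whether the removed `old_r && old_r != r` branch would be taken, not the kernel's object lifetimes or deferred destruction.

```c
#include <assert.h>
#include <stdlib.h>

struct c_result { int set; };
struct c_filter {
	unsigned short key;
	struct c_result result;
	struct c_filter *next;
};
struct c_data {
	struct c_result *perfect;	/* array, or NULL */
	struct c_filter **h;		/* hash buckets, or NULL */
	unsigned int hash;
};

static struct c_result *c_lookup(struct c_data *p, unsigned short key)
{
	struct c_filter *f;

	if (p->perfect)
		return p->perfect[key].set ? p->perfect + key : NULL;
	if (!p->h)
		return NULL;
	for (f = p->h[key % p->hash]; f; f = f->next)
		if (f->key == key)
			return &f->result;
	return NULL;
}

/* Model of the copy step: fresh perfect array if the old data had one,
 * while the hash buckets are shared by pointer. */
static int c_copy(struct c_data *cp, const struct c_data *p)
{
	unsigned int i;

	cp->hash = p->hash;
	cp->h = p->h;
	cp->perfect = NULL;
	if (p->perfect) {
		cp->perfect = calloc(cp->hash, sizeof(*cp->perfect));
		if (!cp->perfect)
			return -1;
		for (i = 0; i < cp->hash; i++)
			cp->perfect[i] = p->perfect[i];
	}
	return 0;
}

/* Returns 1 if the removed branch would run, 0 if not, -1 on failure. */
static int c_branch_taken(struct c_data *p, unsigned short key)
{
	struct c_data cp;
	struct c_result fallback = { 0 };
	struct c_result *old_r = c_lookup(p, key);
	struct c_result *r;
	int taken;

	if (c_copy(&cp, p) < 0)
		return -1;
	if (cp.perfect) {
		assert(key < cp.hash);	/* model-only bounds check */
		r = cp.perfect + key;
	} else {
		r = c_lookup(&cp, key);
		if (!r)
			r = &fallback;
	}
	taken = old_r && old_r != r;
	free(cp.perfect);
	return taken;
}
```

In this model the branch is taken only when the old result lives in the old `perfect` array, which is the case the reply argues is already covered by destroying `p->perfect`; for a result found through the shared hash buckets, `r` and `old_r` are the same pointer and the branch is skipped.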
diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c
index 1c9eeb98d826..3f4e7a6cdd96 100644
--- a/net/sched/cls_tcindex.c
+++ b/net/sched/cls_tcindex.c
@@ -478,14 +478,6 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
 		tcf_bind_filter(tp, &cr, base);
 	}
 
-	if (old_r && old_r != r) {
-		err = tcindex_filter_result_init(old_r, cp, net);
-		if (err < 0) {
-			kfree(f);
-			goto errout_alloc;
-		}
-	}
-
 	oldp = p;
 	r->res = cr;
 	tcf_exts_change(&r->exts, &e);