Message ID | 20240215113905.96817-1-aleksander.lobakin@intel.com
---|---
State | New |
Headers |
From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>, Lorenzo Bianconi <lorenzo@kernel.org>, Toke Høiland-Jørgensen <toke@redhat.com>, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next] page_pool: disable direct recycling based on pool->cpuid on destroy
Date: Thu, 15 Feb 2024 12:39:05 +0100
Message-ID: <20240215113905.96817-1-aleksander.lobakin@intel.com>
Series | [net-next] page_pool: disable direct recycling based on pool->cpuid on destroy
Commit Message
Alexander Lobakin
Feb. 15, 2024, 11:39 a.m. UTC
Now that direct recycling is performed based on pool->cpuid when set,
memory leaks are possible:
1. A pool is destroyed.
2. Alloc cache is emptied (it's done only once).
3. pool->cpuid is still set.
4. napi_pp_put_page() does direct recycling based on pool->cpuid.
5. Now alloc cache is not empty, but it won't ever be freed.
In order to avoid that, rewrite pool->cpuid to -1 when unlinking NAPI to
make sure no direct recycling will be possible after emptying the cache.
This involves a bit of overhead as pool->cpuid now must be accessed
via READ_ONCE() to avoid partial reads.
Rename page_pool_unlink_napi() -> page_pool_disable_direct_recycling()
to reflect what it actually does and unexport it.
Fixes: 2b0cfa6e4956 ("net: add generic percpu page_pool allocator")
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
include/net/page_pool/types.h | 5 -----
net/core/page_pool.c | 10 +++++++---
net/core/skbuff.c | 2 +-
3 files changed, 8 insertions(+), 9 deletions(-)
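
As an illustration of where such a leak could bite, below is a sketch of a hypothetical per-queue driver path built on the real API declared in include/net/page_pool/types.h (page_pool_create_percpu() and page_pool_destroy()); the demo_* names, the context struct, and the parameter values are made up for the example and are not part of the patch.

```c
#include <linux/err.h>
#include <linux/numa.h>
#include <net/page_pool/types.h>

/* Hypothetical per-RX-queue context; not part of the patch. */
struct demo_rxq {
	struct page_pool *pool;
};

/* A pool created with page_pool_create_percpu() keeps pool->cpuid >= 0,
 * which is the field napi_pp_put_page() checks to allow direct recycling.
 */
static int demo_rxq_open(struct demo_rxq *rxq, int cpu)
{
	struct page_pool_params pp_params = {
		.order		= 0,
		.pool_size	= 256,
		.nid		= NUMA_NO_NODE,
	};

	rxq->pool = page_pool_create_percpu(&pp_params, cpu);

	return PTR_ERR_OR_ZERO(rxq->pool);
}

static void demo_rxq_close(struct demo_rxq *rxq)
{
	/* Pages from this pool may still sit in skbs in flight through the
	 * stack. Before the fix, page_pool_destroy() emptied the alloc cache
	 * only once while pool->cpuid stayed set, so a later free on that
	 * CPU could refill the cache of the dying pool and nothing would
	 * ever drain it again (steps 1-5 above). With the fix, destroy
	 * resets cpuid to -1 first, so late frees take the normal path.
	 */
	page_pool_destroy(rxq->pool);
}
```

The close path is exactly where the patch matters: once the destroy path resets cpuid to -1, late frees of in-flight pages can no longer be steered into the dying pool's alloc cache.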
Comments
> Now that direct recycling is performed based on pool->cpuid when set,
> memory leaks are possible:
>
> 1. A pool is destroyed.
> 2. Alloc cache is emptied (it's done only once).
> 3. pool->cpuid is still set.
> 4. napi_pp_put_page() does direct recycling based on pool->cpuid.
> 5. Now alloc cache is not empty, but it won't ever be freed.
>
> In order to avoid that, rewrite pool->cpuid to -1 when unlinking NAPI to
> make sure no direct recycling will be possible after emptying the cache.
> This involves a bit of overhead as pool->cpuid now must be accessed
> via READ_ONCE() to avoid partial reads.
> Rename page_pool_unlink_napi() -> page_pool_disable_direct_recycling()
> to reflect what it actually does and unexport it.

Hi Alexander,

IIUC the reported issue, it requires the page_pool to be destroyed (correct?),
but the system page_pool (the only one with cpuid not set to -1) will never be
destroyed at runtime (or at least we should avoid that). Am I missing something?

Regards,
Lorenzo

>
> Fixes: 2b0cfa6e4956 ("net: add generic percpu page_pool allocator")
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> ---
>  include/net/page_pool/types.h |  5 -----
>  net/core/page_pool.c          | 10 +++++++---
>  net/core/skbuff.c             |  2 +-
>  3 files changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index 3828396ae60c..3590fbe6e3f1 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -210,17 +210,12 @@ struct page_pool *page_pool_create_percpu(const struct page_pool_params *params,
>  struct xdp_mem_info;
>
>  #ifdef CONFIG_PAGE_POOL
> -void page_pool_unlink_napi(struct page_pool *pool);
>  void page_pool_destroy(struct page_pool *pool);
>  void page_pool_use_xdp_mem(struct page_pool *pool, void (*disconnect)(void *),
>  			   struct xdp_mem_info *mem);
>  void page_pool_put_page_bulk(struct page_pool *pool, void **data,
>  			     int count);
>  #else
> -static inline void page_pool_unlink_napi(struct page_pool *pool)
> -{
> -}
> -
>  static inline void page_pool_destroy(struct page_pool *pool)
>  {
>  }
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 89c835fcf094..e8b9399d8e32 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -949,8 +949,13 @@ void page_pool_use_xdp_mem(struct page_pool *pool, void (*disconnect)(void *),
>  	pool->xdp_mem_id = mem->id;
>  }
>
> -void page_pool_unlink_napi(struct page_pool *pool)
> +static void page_pool_disable_direct_recycling(struct page_pool *pool)
>  {
> +	/* Disable direct recycling based on pool->cpuid.
> +	 * Paired with READ_ONCE() in napi_pp_put_page().
> +	 */
> +	WRITE_ONCE(pool->cpuid, -1);
> +
>  	if (!pool->p.napi)
>  		return;
>
> @@ -962,7 +967,6 @@ void page_pool_unlink_napi(struct page_pool *pool)
>
>  	WRITE_ONCE(pool->p.napi, NULL);
>  }
> -EXPORT_SYMBOL(page_pool_unlink_napi);
>
>  void page_pool_destroy(struct page_pool *pool)
>  {
> @@ -972,7 +976,7 @@ void page_pool_destroy(struct page_pool *pool)
>  	if (!page_pool_put(pool))
>  		return;
>
> -	page_pool_unlink_napi(pool);
> +	page_pool_disable_direct_recycling(pool);
>  	page_pool_free_frag(pool);
>
>  	if (!page_pool_release(pool))
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 0d9a489e6ae1..b41856585c24 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -1018,7 +1018,7 @@ bool napi_pp_put_page(struct page *page, bool napi_safe)
>  		unsigned int cpuid = smp_processor_id();
>
>  		allow_direct = napi && READ_ONCE(napi->list_owner) == cpuid;
> -		allow_direct |= (pp->cpuid == cpuid);
> +		allow_direct |= READ_ONCE(pp->cpuid) == cpuid;
>  	}
>
>  	/* Driver set this to memory recycling info. Reset it on recycle.
> --
> 2.43.0
>
Alexander Lobakin <aleksander.lobakin@intel.com> writes:

> Now that direct recycling is performed based on pool->cpuid when set,
> memory leaks are possible:
>
> 1. A pool is destroyed.
> 2. Alloc cache is emptied (it's done only once).
> 3. pool->cpuid is still set.
> 4. napi_pp_put_page() does direct recycling based on pool->cpuid.
> 5. Now alloc cache is not empty, but it won't ever be freed.

Did you actually manage to trigger this? pool->cpuid is only set for the
system page pool instance which is never destroyed; so this seems a very
theoretical concern?

I guess we could still do this in case we find other uses for setting
the cpuid; I don't think the addition of the READ_ONCE() will have any
measurable overhead on the common arches?

-Toke
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 15 Feb 2024 13:05:30 +0100

> Alexander Lobakin <aleksander.lobakin@intel.com> writes:
>
>> Now that direct recycling is performed based on pool->cpuid when set,
>> memory leaks are possible:
>>
>> 1. A pool is destroyed.
>> 2. Alloc cache is emptied (it's done only once).
>> 3. pool->cpuid is still set.
>> 4. napi_pp_put_page() does direct recycling based on pool->cpuid.
>> 5. Now alloc cache is not empty, but it won't ever be freed.
>
> Did you actually manage to trigger this? pool->cpuid is only set for the
> system page pool instance which is never destroyed; so this seems a very
> theoretical concern?

To both Lorenzo and Toke:

Yes, system page pools are never destroyed, but we might later use
cpuid in non-persistent PPs. Then there will be memory leaks.
I was able to trigger this by creating bpf/test_run page_pools with the
cpuid set to test direct recycling of live frames.

> I guess we could still do this in case we find other uses for setting
> the cpuid; I don't think the addition of the READ_ONCE() will have any
> measurable overhead on the common arches?

READ_ONCE() is cheap, but I thought it's worth mentioning in the
commitmsg anyway :)

> -Toke

Thanks,
Olek
Alexander Lobakin <aleksander.lobakin@intel.com> writes:

> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Thu, 15 Feb 2024 13:05:30 +0100
>
>> Alexander Lobakin <aleksander.lobakin@intel.com> writes:
>>
>>> Now that direct recycling is performed based on pool->cpuid when set,
>>> memory leaks are possible:
>>>
>>> 1. A pool is destroyed.
>>> 2. Alloc cache is emptied (it's done only once).
>>> 3. pool->cpuid is still set.
>>> 4. napi_pp_put_page() does direct recycling based on pool->cpuid.
>>> 5. Now alloc cache is not empty, but it won't ever be freed.
>>
>> Did you actually manage to trigger this? pool->cpuid is only set for the
>> system page pool instance which is never destroyed; so this seems a very
>> theoretical concern?
>
> To both Lorenzo and Toke:
>
> Yes, system page pools are never destroyed, but we might later use
> cpuid in non-persistent PPs. Then there will be memory leaks.
> I was able to trigger this by creating bpf/test_run page_pools with the
> cpuid set to test direct recycling of live frames.
>
>> I guess we could still do this in case we find other uses for setting
>> the cpuid; I don't think the addition of the READ_ONCE() will have any
>> measurable overhead on the common arches?
>
> READ_ONCE() is cheap, but I thought it's worth mentioning in the
> commitmsg anyway :)

Right. I'm OK with changing this as a form of future-proofing if we end
up finding other uses for setting the cpuid field, so:

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Thu, 15 Feb 2024 13:05:30 +0100
>
> > Alexander Lobakin <aleksander.lobakin@intel.com> writes:
> >
> >> Now that direct recycling is performed based on pool->cpuid when set,
> >> memory leaks are possible:
> >>
> >> 1. A pool is destroyed.
> >> 2. Alloc cache is emptied (it's done only once).
> >> 3. pool->cpuid is still set.
> >> 4. napi_pp_put_page() does direct recycling based on pool->cpuid.
> >> 5. Now alloc cache is not empty, but it won't ever be freed.
> >
> > Did you actually manage to trigger this? pool->cpuid is only set for the
> > system page pool instance which is never destroyed; so this seems a very
> > theoretical concern?
>
> To both Lorenzo and Toke:
>
> Yes, system page pools are never destroyed, but we might later use
> cpuid in non-persistent PPs. Then there will be memory leaks.
> I was able to trigger this by creating bpf/test_run page_pools with the
> cpuid set to test direct recycling of live frames.

what about avoiding the page_pool being destroyed in this case? I do not like the
idea of overwriting the cpuid field for it.

Regards,
Lorenzo

>
> >
> > I guess we could still do this in case we find other uses for setting
> > the cpuid; I don't think the addition of the READ_ONCE() will have any
> > measurable overhead on the common arches?
>
> READ_ONCE() is cheap, but I thought it's worth mentioning in the
> commitmsg anyway :)
>
> >
> > -Toke
> >
>
> Thanks,
> Olek
From: Lorenzo Bianconi <lorenzo@kernel.org>
Date: Thu, 15 Feb 2024 14:37:10 +0100

>> From: Toke Høiland-Jørgensen <toke@redhat.com>
>> Date: Thu, 15 Feb 2024 13:05:30 +0100
>>
>>> Alexander Lobakin <aleksander.lobakin@intel.com> writes:
>>>
>>>> Now that direct recycling is performed based on pool->cpuid when set,
>>>> memory leaks are possible:
>>>>
>>>> 1. A pool is destroyed.
>>>> 2. Alloc cache is emptied (it's done only once).
>>>> 3. pool->cpuid is still set.
>>>> 4. napi_pp_put_page() does direct recycling based on pool->cpuid.
>>>> 5. Now alloc cache is not empty, but it won't ever be freed.
>>>
>>> Did you actually manage to trigger this? pool->cpuid is only set for the
>>> system page pool instance which is never destroyed; so this seems a very
>>> theoretical concern?
>>
>> To both Lorenzo and Toke:
>>
>> Yes, system page pools are never destroyed, but we might later use
>> cpuid in non-persistent PPs. Then there will be memory leaks.
>> I was able to trigger this by creating bpf/test_run page_pools with the
>> cpuid set to test direct recycling of live frames.
>
> what about avoiding the page_pool being destroyed in this case? I do not like the

I think I didn't get what you wanted to say here :s

Rewriting cpuid doesn't introduce any new checks on hotpath. Destroying
the pool is slowpath and we shouldn't hurt hotpath to handle it.

> idea of overwriting the cpuid field for it.

We also overwrite pp->p.napi field a couple lines below. It happens only
when destroying the pool, we don't care about the fields at this point.

>
> Regards,
> Lorenzo
>
>>
>>>
>>> I guess we could still do this in case we find other uses for setting
>>> the cpuid; I don't think the addition of the READ_ONCE() will have any
>>> measurable overhead on the common arches?
>>
>> READ_ONCE() is cheap, but I thought it's worth mentioning in the
>> commitmsg anyway :)
>>
>>>
>>> -Toke

Thanks,
Olek
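
To make the hot-path vs slow-path argument above concrete, here is a minimal model of the two accesses the patch pairs up; the demo_* helpers are illustrative stand-ins only, while the real code is napi_pp_put_page() in net/core/skbuff.c and page_pool_disable_direct_recycling() in net/core/page_pool.c (see the diff at the bottom of the page).

```c
#include <linux/compiler.h>	/* READ_ONCE() / WRITE_ONCE() */
#include <linux/smp.h>		/* smp_processor_id() */
#include <net/page_pool/types.h>

/* Hot path: models the cpuid check in napi_pp_put_page(). Direct
 * recycling is allowed only when running on the pool's owner CPU.
 */
static bool demo_allow_direct(const struct page_pool *pp)
{
	/* READ_ONCE() pairs with the WRITE_ONCE() below, so a concurrent
	 * reset to -1 is observed either fully before or fully after,
	 * never as a torn value.
	 */
	return READ_ONCE(pp->cpuid) == smp_processor_id();
}

/* Slow path: models page_pool_disable_direct_recycling(), called once
 * from page_pool_destroy() before the alloc cache is flushed for the
 * last time.
 */
static void demo_disable_direct_recycling(struct page_pool *pp)
{
	WRITE_ONCE(pp->cpuid, -1);
	/* From here on, demo_allow_direct() can no longer return true
	 * based on cpuid, so no page is put into a cache that will never
	 * be drained again.
	 */
}
```

This is also why the change adds no new branches on the hot path: the existing comparison simply becomes a READ_ONCE() of the same field.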
> From: Lorenzo Bianconi <lorenzo@kernel.org>
> Date: Thu, 15 Feb 2024 14:37:10 +0100
>
> >> From: Toke Høiland-Jørgensen <toke@redhat.com>
> >> Date: Thu, 15 Feb 2024 13:05:30 +0100
> >>
> >>> Alexander Lobakin <aleksander.lobakin@intel.com> writes:
> >>>
> >>>> Now that direct recycling is performed based on pool->cpuid when set,
> >>>> memory leaks are possible:
> >>>>
> >>>> 1. A pool is destroyed.
> >>>> 2. Alloc cache is emptied (it's done only once).
> >>>> 3. pool->cpuid is still set.
> >>>> 4. napi_pp_put_page() does direct recycling based on pool->cpuid.
> >>>> 5. Now alloc cache is not empty, but it won't ever be freed.
> >>>
> >>> Did you actually manage to trigger this? pool->cpuid is only set for the
> >>> system page pool instance which is never destroyed; so this seems a very
> >>> theoretical concern?
> >>
> >> To both Lorenzo and Toke:
> >>
> >> Yes, system page pools are never destroyed, but we might later use
> >> cpuid in non-persistent PPs. Then there will be memory leaks.
> >> I was able to trigger this by creating bpf/test_run page_pools with the
> >> cpuid set to test direct recycling of live frames.
> >
> > what about avoiding the page_pool being destroyed in this case? I do not like the
>
> I think I didn't get what you wanted to say here :s

My assumption here was that cpuid will be set just for the system page_pool,
so it is just a matter of not running page_pool_destroy for them.
Anyway, in the future we could allow setting cpuid even for a non-system
page_pool if the pool is linked to a given rx-queue and the queue is pinned
to a given CPU.

Regards,
Lorenzo

>
> Rewriting cpuid doesn't introduce any new checks on hotpath. Destroying
> the pool is slowpath and we shouldn't hurt hotpath to handle it.
>
> > idea of overwriting the cpuid field for it.
>
> We also overwrite pp->p.napi field a couple lines below. It happens only
> when destroying the pool, we don't care about the fields at this point.
>
> >
> > Regards,
> > Lorenzo
> >
> >>
> >>>
> >>> I guess we could still do this in case we find other uses for setting
> >>> the cpuid; I don't think the addition of the READ_ONCE() will have any
> >>> measurable overhead on the common arches?
> >>
> >> READ_ONCE() is cheap, but I thought it's worth mentioning in the
> >> commitmsg anyway :)
> >>
> >>>
> >>> -Toke
>
> Thanks,
> Olek
From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Thu, 15 Feb 2024 12:39:05 +0100

> Now that direct recycling is performed based on pool->cpuid when set,
> memory leaks are possible:
>
> 1. A pool is destroyed.
> 2. Alloc cache is emptied (it's done only once).
> 3. pool->cpuid is still set.
> 4. napi_pp_put_page() does direct recycling based on pool->cpuid.
> 5. Now alloc cache is not empty, but it won't ever be freed.
>
> In order to avoid that, rewrite pool->cpuid to -1 when unlinking NAPI to
> make sure no direct recycling will be possible after emptying the cache.
> This involves a bit of overhead as pool->cpuid now must be accessed
> via READ_ONCE() to avoid partial reads.
> Rename page_pool_unlink_napi() -> page_pool_disable_direct_recycling()
> to reflect what it actually does and unexport it.

PW says "Changes requested", but I don't see any in the thread, did I
miss something? :D

Thanks,
Olek
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 15 Feb 2024 12:39:05 +0100 you wrote:
> Now that direct recycling is performed based on pool->cpuid when set,
> memory leaks are possible:
>
> 1. A pool is destroyed.
> 2. Alloc cache is emptied (it's done only once).
> 3. pool->cpuid is still set.
> 4. napi_pp_put_page() does direct recycling based on pool->cpuid.
> 5. Now alloc cache is not empty, but it won't ever be freed.
>
> [...]

Here is the summary with links:
  - [net-next] page_pool: disable direct recycling based on pool->cpuid on destroy
    https://git.kernel.org/netdev/net-next/c/56ef27e3abe6

You are awesome, thank you!
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index 3828396ae60c..3590fbe6e3f1 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -210,17 +210,12 @@ struct page_pool *page_pool_create_percpu(const struct page_pool_params *params,
 struct xdp_mem_info;
 
 #ifdef CONFIG_PAGE_POOL
-void page_pool_unlink_napi(struct page_pool *pool);
 void page_pool_destroy(struct page_pool *pool);
 void page_pool_use_xdp_mem(struct page_pool *pool, void (*disconnect)(void *),
 			   struct xdp_mem_info *mem);
 void page_pool_put_page_bulk(struct page_pool *pool, void **data,
 			     int count);
 #else
-static inline void page_pool_unlink_napi(struct page_pool *pool)
-{
-}
-
 static inline void page_pool_destroy(struct page_pool *pool)
 {
 }
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 89c835fcf094..e8b9399d8e32 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -949,8 +949,13 @@ void page_pool_use_xdp_mem(struct page_pool *pool, void (*disconnect)(void *),
 	pool->xdp_mem_id = mem->id;
 }
 
-void page_pool_unlink_napi(struct page_pool *pool)
+static void page_pool_disable_direct_recycling(struct page_pool *pool)
 {
+	/* Disable direct recycling based on pool->cpuid.
+	 * Paired with READ_ONCE() in napi_pp_put_page().
+	 */
+	WRITE_ONCE(pool->cpuid, -1);
+
 	if (!pool->p.napi)
 		return;
 
@@ -962,7 +967,6 @@ void page_pool_unlink_napi(struct page_pool *pool)
 
 	WRITE_ONCE(pool->p.napi, NULL);
 }
-EXPORT_SYMBOL(page_pool_unlink_napi);
 
 void page_pool_destroy(struct page_pool *pool)
 {
@@ -972,7 +976,7 @@ void page_pool_destroy(struct page_pool *pool)
 	if (!page_pool_put(pool))
 		return;
 
-	page_pool_unlink_napi(pool);
+	page_pool_disable_direct_recycling(pool);
 	page_pool_free_frag(pool);
 
 	if (!page_pool_release(pool))
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0d9a489e6ae1..b41856585c24 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1018,7 +1018,7 @@ bool napi_pp_put_page(struct page *page, bool napi_safe)
 		unsigned int cpuid = smp_processor_id();
 
 		allow_direct = napi && READ_ONCE(napi->list_owner) == cpuid;
-		allow_direct |= (pp->cpuid == cpuid);
+		allow_direct |= READ_ONCE(pp->cpuid) == cpuid;
 	}
 
 	/* Driver set this to memory recycling info. Reset it on recycle.