[bpf-next,v3,3/4] xdp: recycle Page Pool backed skbs built from XDP frames

Message ID 20230313215553.1045175-4-aleksander.lobakin@intel.com
State New
Series xdp: recycle Page Pool backed skbs built from XDP frames

Commit Message

Alexander Lobakin March 13, 2023, 9:55 p.m. UTC
  __xdp_build_skb_from_frame() state(d):

/* Until page_pool get SKB return path, release DMA here */

Page Pool gained skb page recycling in April 2021, but this function
was missed.

xdp_release_frame() is relevant only for Page Pool backed frames and it
detaches the page from the corresponding page_pool in order to make it
freeable via page_frag_free(). It can instead just mark the output skb
as eligible for recycling if the frame is backed by a pp. No change for
other memory model types (the same condition check as before).
cpumap redirect and veth on Page Pool drivers now become zero-alloc (or
almost).

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
 net/core/xdp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
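
For reference, skb_mark_for_recycle() only flags the skb; the actual page
return to the pool happens later, on the skb free path, once the pages' pp
pointer identifies their owning page_pool. A rough sketch of the helper and
of the new call site (simplified, not the exact upstream code):

/* Roughly what the helper does (simplified sketch, not verbatim kernel
 * code): set a single bit that the skb free path consumes, handing
 * page_pool backed pages back to their pool instead of the page allocator.
 */
static inline void skb_mark_for_recycle(struct sk_buff *skb)
{
	skb->pp_recycle = 1;
}

/* New call site in __xdp_build_skb_from_frame(): the frame's memory model
 * tells us whether the underlying page came from a page_pool.
 */
if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
	skb_mark_for_recycle(skb);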
  

Comments

Jesper Dangaard Brouer March 15, 2023, 2:55 p.m. UTC | #1
On 13/03/2023 22.55, Alexander Lobakin wrote:
> __xdp_build_skb_from_frame() state(d):
> 
> /* Until page_pool get SKB return path, release DMA here */
> 
> Page Pool gained skb page recycling in April 2021, but this function
> was missed.
> 
> xdp_release_frame() is relevant only for Page Pool backed frames and it
> detaches the page from the corresponding page_pool in order to make it
> freeable via page_frag_free(). It can instead just mark the output skb
> as eligible for recycling if the frame is backed by a pp. No change for
> other memory model types (the same condition check as before).
> cpumap redirect and veth on Page Pool drivers now become zero-alloc (or
> almost).
> 
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> ---
>   net/core/xdp.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 8c92fc553317..a2237cfca8e9 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -658,8 +658,8 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
>   	 * - RX ring dev queue index	(skb_record_rx_queue)
>   	 */
>   
> -	/* Until page_pool get SKB return path, release DMA here */
> -	xdp_release_frame(xdpf);
> +	if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
> +		skb_mark_for_recycle(skb);

I hope this is safe ;-) ... Meaning hopefully drivers do the correct
thing when XDP_REDIRECT'ing page_pool pages.

Looking for drivers doing weird refcnt tricks and XDP_REDIRECT'ing, I
noticed the driver aquantia/atlantic (in aq_get_rxpages_xdp), but I now
see it is not using page_pool, so it should not be affected by this
(though I worry that the atlantic driver has a potential race condition
in its refcnt scheme).

>   
>   	/* Allow SKB to reuse area used by xdp_frame */
>   	xdp_scrub_frame(xdpf);
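
For completeness, "doing the correct thing" here means the driver registered
its Rx queue memory model as MEM_TYPE_PAGE_POOL, so every xdp_frame
redirected out of that queue carries mem.type == MEM_TYPE_PAGE_POOL and its
pages stay owned by the pool. A minimal sketch of such a setup, with
hypothetical driver names (dev, netdev, rxq, napi_id) and error handling
omitted:

/* Hypothetical driver Rx setup sketch: create a page_pool for the ring and
 * register it as the queue's memory model, so redirected frames are
 * page_pool backed and need no extra refcount tricks.
 */
struct page_pool_params pp_params = {
	.flags		= PP_FLAG_DMA_MAP,	/* pool handles DMA mapping */
	.order		= 0,
	.pool_size	= 256,			/* ring size, driver specific */
	.nid		= NUMA_NO_NODE,
	.dev		= dev,			/* device used for DMA mapping */
	.dma_dir	= DMA_BIDIRECTIONAL,	/* supports XDP_TX/REDIRECT */
};
struct page_pool *pool = page_pool_create(&pp_params);

xdp_rxq_info_reg(&rxq->xdp_rxq, netdev, rxq->index, napi_id);
xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq, MEM_TYPE_PAGE_POOL, pool);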
  
Alexander Lobakin March 15, 2023, 2:58 p.m. UTC | #2
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Wed, 15 Mar 2023 15:55:44 +0100

> 
> On 13/03/2023 22.55, Alexander Lobakin wrote:
>> __xdp_build_skb_from_frame() state(d):
>>
>> /* Until page_pool get SKB return path, release DMA here */
>>
>> Page Pool gained skb page recycling in April 2021, but this function
>> was missed.
>>
>> xdp_release_frame() is relevant only for Page Pool backed frames and it
>> detaches the page from the corresponding page_pool in order to make it
>> freeable via page_frag_free(). It can instead just mark the output skb
>> as eligible for recycling if the frame is backed by a pp. No change for
>> other memory model types (the same condition check as before).
>> cpumap redirect and veth on Page Pool drivers now become zero-alloc (or
>> almost).
>>
>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>> ---
>>   net/core/xdp.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>> index 8c92fc553317..a2237cfca8e9 100644
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -658,8 +658,8 @@ struct sk_buff *__xdp_build_skb_from_frame(struct
>> xdp_frame *xdpf,
>>        * - RX ring dev queue index    (skb_record_rx_queue)
>>        */
>>   -    /* Until page_pool get SKB return path, release DMA here */
>> -    xdp_release_frame(xdpf);
>> +    if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
>> +        skb_mark_for_recycle(skb);
> 
> I hope this is safe ;-) ... Meaning hopefully drivers do the correct
> thing when XDP_REDIRECT'ing page_pool pages.

Safe when it's done by the book. For now, I'm observing only one syzbot
issue with test_run, because it assumes yet another bunch of things I
wouldn't rely on :D (separate subthread)

> 
> Looking for drivers doing weird refcnt tricks and XDP_REDIRECT'ing, I
> noticed the driver aquantia/atlantic (in aq_get_rxpages_xdp), but I now
> see it is not using page_pool, so it should not be affected by this
> (though I worry that the atlantic driver has a potential race condition
> in its refcnt scheme).

If we encounter some driver using Page Pool, but mangling refcounts on
redirect, we'll fix it ;)

> 
>>         /* Allow SKB to reuse area used by xdp_frame */
>>       xdp_scrub_frame(xdpf);
> 

Thanks,
Olek
  
Jesper Dangaard Brouer March 16, 2023, 5:10 p.m. UTC | #3
On 15/03/2023 15.58, Alexander Lobakin wrote:
> From: Jesper Dangaard Brouer <jbrouer@redhat.com>
> Date: Wed, 15 Mar 2023 15:55:44 +0100
> 
>> On 13/03/2023 22.55, Alexander Lobakin wrote:
[...]
>>>
>>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>>> index 8c92fc553317..a2237cfca8e9 100644
>>> --- a/net/core/xdp.c
>>> +++ b/net/core/xdp.c
>>> @@ -658,8 +658,8 @@ struct sk_buff *__xdp_build_skb_from_frame(struct
>>> xdp_frame *xdpf,
>>>         * - RX ring dev queue index    (skb_record_rx_queue)
>>>         */
>>>    -    /* Until page_pool get SKB return path, release DMA here */
>>> -    xdp_release_frame(xdpf);
>>> +    if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
>>> +        skb_mark_for_recycle(skb);
>>
>> I hope this is safe ;-) ... Meaning hopefully drivers do the correct
>> thing when XDP_REDIRECT'ing page_pool pages.
> 
> Safe when it's done by the book. For now, I'm observing only one syzbot
> issue with test_run, because it assumes yet another bunch of things I
> wouldn't rely on :D (separate subthread)
> 
>>
>> Looking for drivers doing weird refcnt tricks and XDP_REDIRECT'ing, I
>> noticed the driver aquantia/atlantic (in aq_get_rxpages_xdp), but I now
>> see it is not using page_pool, so it should not be affected by this
>> (though I worry that the atlantic driver has a potential race condition
>> in its refcnt scheme).
> 
> If we encounter some driver using Page Pool, but mangling refcounts on
> redirect, we'll fix it ;)
> 

Thanks for signing up to fix these issues down the road :-)

For what it's worth, I've rebased my testlab to include this patchset.

For now, I've tested mlx5 with cpumap redirect and net stack processing;
everything seems to be working nicely. When disabling GRO, the cpumap
gets the same and sometimes better TCP throughput performance, even
though checksums have to be done in software. (Hopefully we can soon
close the missing HW checksum gap with XDP-hints.)

--Jesper
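
For reference, the cpumap redirect path exercised above is driven by an XDP
program along these lines. This is a minimal sketch with hypothetical
program and map names; a real program would normally pick the target CPU
from a flow hash, and the per-entry queue sizes are configured from user
space when populating the map:

// SPDX-License-Identifier: GPL-2.0
/* Minimal cpumap redirect sketch (hypothetical names). Frames redirected
 * into the cpumap are rebuilt into skbs on the target CPU via
 * __xdp_build_skb_from_frame(), the function this patch touches.
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_CPUMAP);
	__uint(max_entries, 64);
	__type(key, __u32);
	__type(value, struct bpf_cpumap_val);
} cpu_map SEC(".maps");

SEC("xdp")
int xdp_redirect_cpu(struct xdp_md *ctx)
{
	__u32 cpu = 0;	/* fixed CPU for the sketch; hash the flow in practice */

	return bpf_redirect_map(&cpu_map, cpu, 0);
}

char _license[] SEC("license") = "GPL";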
  
Alexander Lobakin March 17, 2023, 1:36 p.m. UTC | #4
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Thu, 16 Mar 2023 18:10:26 +0100

> 
> On 15/03/2023 15.58, Alexander Lobakin wrote:
>> From: Jesper Dangaard Brouer <jbrouer@redhat.com>
>> Date: Wed, 15 Mar 2023 15:55:44 +0100

[...]

> Thanks for signing up to fix these issues down the road :-)

At some point, I wasn't sure which commits to point the Fixes: tags at.
From one PoV, it's not my patch that introduced the issues. On the other
hand, before this switch there was no way for 0x42 to get overwritten in
the metadata during the selftest, and no one could even have predicted
it (I didn't expect XDP_PASS frames from test_run to reach neigh xmit at
all), so the original code isn't really buggy by itself either ._.

> 
> For what it's worth, I've rebased my testlab to include this patchset.
> 
> For now, I've tested mlx5 with cpumap redirect and net stack processing;
> everything seems to be working nicely. When disabling GRO, the cpumap
> gets the same and sometimes better TCP throughput performance, even
> though checksums have to be done in software. (Hopefully we can soon
> close the missing HW checksum gap with XDP-hints.)

Yeah, I'm also looking forward to having some hints passed to
cpumap/veth, so that __xdp_build_skb_from_frame() could consume them.
Then I could bring back a bunch of patches from my RFC to finally
switch cpumap to GRO :D

> 
> --Jesper
> 

Thanks,
Olek
  

Patch

diff --git a/net/core/xdp.c b/net/core/xdp.c
index 8c92fc553317..a2237cfca8e9 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -658,8 +658,8 @@  struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 	 * - RX ring dev queue index	(skb_record_rx_queue)
 	 */
 
-	/* Until page_pool get SKB return path, release DMA here */
-	xdp_release_frame(xdpf);
+	if (xdpf->mem.type == MEM_TYPE_PAGE_POOL)
+		skb_mark_for_recycle(skb);
 
 	/* Allow SKB to reuse area used by xdp_frame */
 	xdp_scrub_frame(xdpf);