[net-next,v2,4/4] net: lan966x: Use page_pool API
Commit Message
Use the page_pool API for allocation, freeing and DMA handling instead
of dev_alloc_pages, __free_pages and dma_map_page.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
---
.../net/ethernet/microchip/lan966x/Kconfig | 1 +
.../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
.../ethernet/microchip/lan966x/lan966x_main.h | 3 +
3 files changed, 43 insertions(+), 33 deletions(-)
Comments
From: Horatiu Vultur <horatiu.vultur@microchip.com>
Date: Sun, 6 Nov 2022 22:11:54 +0100
> Use the page_pool API for allocation, freeing and DMA handling instead
> of dev_alloc_pages, __free_pages and dma_map_page.
>
> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
> ---
> .../net/ethernet/microchip/lan966x/Kconfig | 1 +
> .../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
> .../ethernet/microchip/lan966x/lan966x_main.h | 3 +
> 3 files changed, 43 insertions(+), 33 deletions(-)
[...]
> @@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
> rx->last_entry = dcb;
> }
>
> +static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
> +{
> + struct lan966x *lan966x = rx->lan966x;
> + struct page_pool_params pp_params = {
> + .order = rx->page_order,
> + .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> + .pool_size = FDMA_DCB_MAX,
> + .nid = NUMA_NO_NODE,
> + .dev = lan966x->dev,
> + .dma_dir = DMA_FROM_DEVICE,
> + .offset = 0,
> + .max_len = PAGE_SIZE << rx->page_order,
::max_len's primary purpose is to save time on DMA syncs.
First of all, you can subtract
`SKB_DATA_ALIGN(sizeof(struct skb_shared_info))`, since your HW never
writes to those last couple hundred bytes.
But I suggest calculating ::max_len based on your current MTU
value. Let's say you have 16k pages and an MTU of 1500: that is a huge
difference (unless your DMA is always coherent, but I assume that's
not the case).
In lan966x_fdma_change_mtu() you do:
max_mtu = lan966x_fdma_get_max_mtu(lan966x);
max_mtu += IFH_LEN_BYTES;
max_mtu += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
max_mtu += VLAN_HLEN * 2;
`lan966x_fdma_get_max_mtu(lan966x) + IFH_LEN_BYTES + VLAN_HLEN * 2`
(i.e. 1536 for an MTU of 1500) is actually your max_len value, given
that you don't reserve any headroom (which is unfortunate, but I
guess you're working on this already, since XDP requires
%XDP_PACKET_HEADROOM).
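The arithmetic above can be sketched in plain userspace C. This is only an illustration, not driver code: the helper names `rx_max_len()` and `rx_full_page_len()` are hypothetical, `IFH_LEN_BYTES` is taken as 28 (the value implied by 1536 - 1500 - 2 * 4), and `VLAN_HLEN` is taken as 4.

```c
/* Values assumed for illustration; the driver headers are authoritative. */
#define IFH_LEN_BYTES 28u /* implied by 1536 - 1500 - 2 * VLAN_HLEN */
#define VLAN_HLEN 4u

/* Hypothetical helper: bytes the hardware can write for one frame
 * when no headroom is reserved (offset == 0).
 */
static unsigned int rx_max_len(unsigned int max_mtu)
{
	return max_mtu + IFH_LEN_BYTES + VLAN_HLEN * 2;
}

/* Hypothetical helper: bytes covered when max_len spans the whole
 * buffer, as in the patch (PAGE_SIZE << page_order).
 */
static unsigned int rx_full_page_len(unsigned int page_size,
				     unsigned int page_order)
{
	return page_size << page_order;
}
```

With a 4k page size and order 2 (a 16k buffer), the second helper gives 16384 bytes against 1536 from the first, which is the gap a non-coherent DMA sync would otherwise have to cover.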
> + };
> +
> + rx->page_pool = page_pool_create(&pp_params);
> + if (IS_ERR(rx->page_pool))
> + return PTR_ERR(rx->page_pool);
> +
> + return 0;
return PTR_ERR_OR_ZERO(rx->page_pool);
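To show why the one-liner is equivalent, here is a minimal userspace model of the kernel's error-pointer helpers. It is a sketch under the assumption that error codes live in the top 4095 values of the address space, as in include/linux/err.h; the functions `long_form()` and `short_form()` are hypothetical and exist only for comparison.

```c
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel helpers of the same names. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the last MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* The shape used in the patch. */
static int long_form(const void *pool)
{
	if (IS_ERR(pool))
		return (int)PTR_ERR(pool);

	return 0;
}

/* The suggested shape. */
static int short_form(const void *pool)
{
	return PTR_ERR_OR_ZERO(pool);
}
```

Both forms return the negative errno for an error pointer and 0 for a valid one, so the change is purely cosmetic.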
> +}
> +
> static int lan966x_fdma_rx_alloc(struct lan966x_rx *rx)
> {
> struct lan966x *lan966x = rx->lan966x;
[...]
> --
> 2.38.0
Thanks,
Olek
The 11/07/2022 17:40, Alexander Lobakin wrote:
Hi Olek,
>
> From: Horatiu Vultur <horatiu.vultur@microchip.com>
> Date: Sun, 6 Nov 2022 22:11:54 +0100
>
> > Use the page_pool API for allocation, freeing and DMA handling instead
> > of dev_alloc_pages, __free_pages and dma_map_page.
> >
> > Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
> > ---
> > .../net/ethernet/microchip/lan966x/Kconfig | 1 +
> > .../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
> > .../ethernet/microchip/lan966x/lan966x_main.h | 3 +
> > 3 files changed, 43 insertions(+), 33 deletions(-)
>
> [...]
>
> > @@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
> > rx->last_entry = dcb;
> > }
> >
> > +static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
> > +{
> > + struct lan966x *lan966x = rx->lan966x;
> > + struct page_pool_params pp_params = {
> > + .order = rx->page_order,
> > + .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > + .pool_size = FDMA_DCB_MAX,
> > + .nid = NUMA_NO_NODE,
> > + .dev = lan966x->dev,
> > + .dma_dir = DMA_FROM_DEVICE,
> > + .offset = 0,
> > + .max_len = PAGE_SIZE << rx->page_order,
>
> ::max_len's primary purpose is to save time on DMA syncs.
> First of all, you can subtract
> `SKB_DATA_ALIGN(sizeof(struct skb_shared_info))`, since your HW never
> writes to those last couple hundred bytes.
> But I suggest calculating ::max_len based on your current MTU
> value. Let's say you have 16k pages and an MTU of 1500: that is a huge
> difference (unless your DMA is always coherent, but I assume that's
> not the case).
>
> In lan966x_fdma_change_mtu() you do:
>
> max_mtu = lan966x_fdma_get_max_mtu(lan966x);
> max_mtu += IFH_LEN_BYTES;
> max_mtu += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> max_mtu += VLAN_HLEN * 2;
>
> `lan966x_fdma_get_max_mtu(lan966x) + IFH_LEN_BYTES + VLAN_HLEN * 2`
> (i.e. 1536 for an MTU of 1500) is actually your max_len value, given
> that you don't reserve any headroom (which is unfortunate, but I
> guess you're working on this already, since XDP requires
> %XDP_PACKET_HEADROOM).
Thanks for the suggestion. I will try it.
Regarding XDP_PACKET_HEADROOM, I didn't see a need for it with XDP_DROP.
Once support for XDP_TX or XDP_REDIRECT is added, then yes, I will
also need to reserve the headroom.
>
> > + };
> > +
> > + rx->page_pool = page_pool_create(&pp_params);
> > + if (IS_ERR(rx->page_pool))
> > + return PTR_ERR(rx->page_pool);
> > +
> > + return 0;
>
> return PTR_ERR_OR_ZERO(rx->page_pool);
Yes, I will use this.
>
> > +}
> > +
> > static int lan966x_fdma_rx_alloc(struct lan966x_rx *rx)
> > {
> > struct lan966x *lan966x = rx->lan966x;
>
> [...]
>
> > --
> > 2.38.0
>
> Thanks,
> Olek
From: Horatiu Vultur <horatiu.vultur@microchip.com>
Date: Mon, 7 Nov 2022 22:35:21 +0100
> The 11/07/2022 17:40, Alexander Lobakin wrote:
>
> Hi Olek,
>
> >
> > From: Horatiu Vultur <horatiu.vultur@microchip.com>
> > Date: Sun, 6 Nov 2022 22:11:54 +0100
> >
> > > Use the page_pool API for allocation, freeing and DMA handling instead
> > > of dev_alloc_pages, __free_pages and dma_map_page.
> > >
> > > Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
> > > ---
> > > .../net/ethernet/microchip/lan966x/Kconfig | 1 +
> > > .../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
> > > .../ethernet/microchip/lan966x/lan966x_main.h | 3 +
> > > 3 files changed, 43 insertions(+), 33 deletions(-)
> >
> > [...]
> >
> > > @@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
> > > rx->last_entry = dcb;
> > > }
> > >
> > > +static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
> > > +{
> > > + struct lan966x *lan966x = rx->lan966x;
> > > + struct page_pool_params pp_params = {
> > > + .order = rx->page_order,
> > > + .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > > + .pool_size = FDMA_DCB_MAX,
> > > + .nid = NUMA_NO_NODE,
> > > + .dev = lan966x->dev,
> > > + .dma_dir = DMA_FROM_DEVICE,
> > > + .offset = 0,
> > > + .max_len = PAGE_SIZE << rx->page_order,
> >
> > ::max_len's primary purpose is to save time on DMA syncs.
> > First of all, you can subtract
> > `SKB_DATA_ALIGN(sizeof(struct skb_shared_info))`, since your HW never
> > writes to those last couple hundred bytes.
> > But I suggest calculating ::max_len based on your current MTU
> > value. Let's say you have 16k pages and an MTU of 1500: that is a huge
> > difference (unless your DMA is always coherent, but I assume that's
> > not the case).
> >
> > In lan966x_fdma_change_mtu() you do:
> >
> > max_mtu = lan966x_fdma_get_max_mtu(lan966x);
> > max_mtu += IFH_LEN_BYTES;
> > max_mtu += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > max_mtu += VLAN_HLEN * 2;
> >
> > `lan966x_fdma_get_max_mtu(lan966x) + IFH_LEN_BYTES + VLAN_HLEN * 2`
> > (i.e. 1536 for an MTU of 1500) is actually your max_len value, given
> > that you don't reserve any headroom (which is unfortunate, but I
> > guess you're working on this already, since XDP requires
> > %XDP_PACKET_HEADROOM).
>
> Thanks for the suggestion. I will try it.
> Regarding XDP_PACKET_HEADROOM, I didn't see a need for it with XDP_DROP.
> Once support for XDP_TX or XDP_REDIRECT is added, then yes, I will
> also need to reserve the headroom.
Correct, since you're disabling metadata support in
xdp_prepare_buff(), headroom is not needed for pass and drop
actions.
It's always good to have at least %NET_SKB_PAD headroom inside an
skb, so that the networking stack won't perform excessive reallocations,
and your code currently misses that -- if I understand correctly,
after converting the hardware-specific header to an Ethernet header you
have 28 - 14 = 14 bytes of headroom, which is sometimes not enough,
for example in forwarding cases. It's not related to XDP,
but I would do that as a prerequisite patch for Tx/redirect, since
you'll be adding headroom support anyway :)
>
> >
> > > + };
> > > +
> > > + rx->page_pool = page_pool_create(&pp_params);
> > > + if (IS_ERR(rx->page_pool))
> > > + return PTR_ERR(rx->page_pool);
[...]
> > > --
> > > 2.38.0
> >
> > Thanks,
> > Olek
>
> --
> /Horatiu
Thanks,
Olek
The 11/08/2022 12:33, Alexander Lobakin wrote:
>
> From: Horatiu Vultur <horatiu.vultur@microchip.com>
> Date: Mon, 7 Nov 2022 22:35:21 +0100
>
> > The 11/07/2022 17:40, Alexander Lobakin wrote:
> >
> > Hi Olek,
> >
> > >
> > > From: Horatiu Vultur <horatiu.vultur@microchip.com>
> > > Date: Sun, 6 Nov 2022 22:11:54 +0100
> > >
> > > > Use the page_pool API for allocation, freeing and DMA handling instead
> > > > of dev_alloc_pages, __free_pages and dma_map_page.
> > > >
> > > > Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
> > > > ---
> > > > .../net/ethernet/microchip/lan966x/Kconfig | 1 +
> > > > .../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
> > > > .../ethernet/microchip/lan966x/lan966x_main.h | 3 +
> > > > 3 files changed, 43 insertions(+), 33 deletions(-)
> > >
> > > [...]
> > >
> > > > @@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
> > > > rx->last_entry = dcb;
> > > > }
> > > >
> > > > +static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
> > > > +{
> > > > + struct lan966x *lan966x = rx->lan966x;
> > > > + struct page_pool_params pp_params = {
> > > > + .order = rx->page_order,
> > > > + .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > > > + .pool_size = FDMA_DCB_MAX,
> > > > + .nid = NUMA_NO_NODE,
> > > > + .dev = lan966x->dev,
> > > > + .dma_dir = DMA_FROM_DEVICE,
> > > > + .offset = 0,
> > > > + .max_len = PAGE_SIZE << rx->page_order,
> > >
> > > ::max_len's primary purpose is to save time on DMA syncs.
> > > First of all, you can subtract
> > > `SKB_DATA_ALIGN(sizeof(struct skb_shared_info))`, since your HW never
> > > writes to those last couple hundred bytes.
> > > But I suggest calculating ::max_len based on your current MTU
> > > value. Let's say you have 16k pages and an MTU of 1500: that is a huge
> > > difference (unless your DMA is always coherent, but I assume that's
> > > not the case).
> > >
> > > In lan966x_fdma_change_mtu() you do:
> > >
> > > max_mtu = lan966x_fdma_get_max_mtu(lan966x);
> > > max_mtu += IFH_LEN_BYTES;
> > > max_mtu += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > > max_mtu += VLAN_HLEN * 2;
> > >
> > > `lan966x_fdma_get_max_mtu(lan966x) + IFH_LEN_BYTES + VLAN_HLEN * 2`
> > > (i.e. 1536 for an MTU of 1500) is actually your max_len value, given
> > > that you don't reserve any headroom (which is unfortunate, but I
> > > guess you're working on this already, since XDP requires
> > > %XDP_PACKET_HEADROOM).
> >
> > Thanks for the suggestion. I will try it.
> > Regarding XDP_PACKET_HEADROOM, I didn't see a need for it with XDP_DROP.
> > Once support for XDP_TX or XDP_REDIRECT is added, then yes, I will
> > also need to reserve the headroom.
>
> Correct, since you're disabling metadata support in
> xdp_prepare_buff(), headroom is not needed for pass and drop
> actions.
>
> It's always good to have at least %NET_SKB_PAD headroom inside an
> skb, so that the networking stack won't perform excessive reallocations,
> and your code currently misses that -- if I understand correctly,
> after converting the hardware-specific header to an Ethernet header you
> have 28 - 14 = 14 bytes of headroom, which is sometimes not enough,
> for example in forwarding cases. It's not related to XDP,
> but I would do that as a prerequisite patch for Tx/redirect, since
> you'll be adding headroom support anyway :)
Just a small comment here. There is no need to convert the hardware-specific
header, because the Ethernet header comes right after it. So I
would have 28 bytes left for headroom, but that is still less than
NET_SKB_PAD.
But I got the idea. When I add Tx/redirect, one of those
patches will make sure we have enough headroom.
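As a rough illustration of the shortfall, the sketch below computes how many extra bytes of headroom would be needed in each case. The helper `extra_headroom_needed()` is hypothetical. `ASSUMED_NET_SKB_PAD` uses 32, the smallest value the kernel definition `max(32, L1_CACHE_BYTES)` can take, so the real figure may be larger on a given platform. `XDP_PACKET_HEADROOM` is taken as 256.

```c
/* Values assumed for illustration only. */
#define IFH_LEN_BYTES 28u        /* headroom left once the IFH is consumed */
#define ASSUMED_NET_SKB_PAD 32u  /* lower bound of max(32, L1_CACHE_BYTES) */
#define XDP_PACKET_HEADROOM 256u

/* Hypothetical helper: bytes to add in front of the frame so that
 * at least `want` bytes of headroom are available.
 */
static unsigned int extra_headroom_needed(unsigned int have,
					  unsigned int want)
{
	return want > have ? want - have : 0;
}
```

Under these assumptions, 28 bytes falls 4 bytes short of the smallest NET_SKB_PAD and 228 bytes short of XDP_PACKET_HEADROOM.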
>
> >
> > >
> > > > + };
> > > > +
> > > > + rx->page_pool = page_pool_create(&pp_params);
> > > > + if (IS_ERR(rx->page_pool))
> > > > + return PTR_ERR(rx->page_pool);
>
> [...]
>
> > > > --
> > > > 2.38.0
> > >
> > > Thanks,
> > > Olek
> >
> > --
> > /Horatiu
>
> Thanks,
> Olek
@@ -7,5 +7,6 @@ config LAN966X_SWITCH
depends on BRIDGE || BRIDGE=n
select PHYLINK
select PACKING
+ select PAGE_POOL
help
This driver supports the Lan966x network switch device.
@@ -10,47 +10,25 @@ static int lan966x_fdma_channel_active(struct lan966x *lan966x)
static struct page *lan966x_fdma_rx_alloc_page(struct lan966x_rx *rx,
struct lan966x_db *db)
{
- struct lan966x *lan966x = rx->lan966x;
- dma_addr_t dma_addr;
struct page *page;
- page = dev_alloc_pages(rx->page_order);
+ page = page_pool_dev_alloc_pages(rx->page_pool);
if (unlikely(!page))
return NULL;
- dma_addr = dma_map_page(lan966x->dev, page, 0,
- PAGE_SIZE << rx->page_order,
- DMA_FROM_DEVICE);
- if (unlikely(dma_mapping_error(lan966x->dev, dma_addr)))
- goto free_page;
-
- db->dataptr = dma_addr;
+ db->dataptr = page_pool_get_dma_addr(page);
return page;
-
-free_page:
- __free_pages(page, rx->page_order);
- return NULL;
}
static void lan966x_fdma_rx_free_pages(struct lan966x_rx *rx)
{
- struct lan966x *lan966x = rx->lan966x;
- struct lan966x_rx_dcb *dcb;
- struct lan966x_db *db;
int i, j;
for (i = 0; i < FDMA_DCB_MAX; ++i) {
- dcb = &rx->dcbs[i];
-
- for (j = 0; j < FDMA_RX_DCB_MAX_DBS; ++j) {
- db = &dcb->db[j];
- dma_unmap_single(lan966x->dev,
- (dma_addr_t)db->dataptr,
- PAGE_SIZE << rx->page_order,
- DMA_FROM_DEVICE);
- __free_pages(rx->page[i][j], rx->page_order);
- }
+ for (j = 0; j < FDMA_RX_DCB_MAX_DBS; ++j)
+ page_pool_put_full_page(rx->page_pool,
+ rx->page[i][j], false);
}
}
@@ -62,7 +40,7 @@ static void lan966x_fdma_rx_free_page(struct lan966x_rx *rx)
if (unlikely(!page))
return;
- __free_pages(page, rx->page_order);
+ page_pool_recycle_direct(rx->page_pool, page);
}
static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
@@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
rx->last_entry = dcb;
}
+static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
+{
+ struct lan966x *lan966x = rx->lan966x;
+ struct page_pool_params pp_params = {
+ .order = rx->page_order,
+ .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
+ .pool_size = FDMA_DCB_MAX,
+ .nid = NUMA_NO_NODE,
+ .dev = lan966x->dev,
+ .dma_dir = DMA_FROM_DEVICE,
+ .offset = 0,
+ .max_len = PAGE_SIZE << rx->page_order,
+ };
+
+ rx->page_pool = page_pool_create(&pp_params);
+ if (IS_ERR(rx->page_pool))
+ return PTR_ERR(rx->page_pool);
+
+ return 0;
+}
+
static int lan966x_fdma_rx_alloc(struct lan966x_rx *rx)
{
struct lan966x *lan966x = rx->lan966x;
@@ -93,6 +92,9 @@ static int lan966x_fdma_rx_alloc(struct lan966x_rx *rx)
int i, j;
int size;
+ if (lan966x_fdma_rx_alloc_page_pool(rx))
+ return PTR_ERR(rx->page_pool);
+
/* calculate how many pages are needed to allocate the dcbs */
size = sizeof(struct lan966x_rx_dcb) * FDMA_DCB_MAX;
size = ALIGN(size, PAGE_SIZE);
@@ -436,10 +438,6 @@ static int lan966x_fdma_rx_check_frame(struct lan966x_rx *rx, u64 *src_port)
FDMA_DCB_STATUS_BLOCKL(db->status),
DMA_FROM_DEVICE);
- dma_unmap_single_attrs(lan966x->dev, (dma_addr_t)db->dataptr,
- PAGE_SIZE << rx->page_order, DMA_FROM_DEVICE,
- DMA_ATTR_SKIP_CPU_SYNC);
-
lan966x_ifh_get_src_port(page_address(page), src_port);
if (WARN_ON(*src_port >= lan966x->num_phys_ports))
return FDMA_ERROR;
@@ -468,6 +466,8 @@ static struct sk_buff *lan966x_fdma_rx_get_frame(struct lan966x_rx *rx,
if (unlikely(!skb))
goto free_page;
+ skb_mark_for_recycle(skb);
+
skb_put(skb, FDMA_DCB_STATUS_BLOCKL(db->status));
lan966x_ifh_get_timestamp(skb->data, &timestamp);
@@ -495,7 +495,7 @@ static struct sk_buff *lan966x_fdma_rx_get_frame(struct lan966x_rx *rx,
return skb;
free_page:
- __free_pages(page, rx->page_order);
+ page_pool_recycle_direct(rx->page_pool, page);
return NULL;
}
@@ -740,6 +740,7 @@ static int lan966x_qsys_sw_status(struct lan966x *lan966x)
static int lan966x_fdma_reload(struct lan966x *lan966x, int new_mtu)
{
+ struct page_pool *page_pool;
dma_addr_t rx_dma;
void *rx_dcbs;
u32 size;
@@ -748,6 +749,7 @@ static int lan966x_fdma_reload(struct lan966x *lan966x, int new_mtu)
/* Store these for later to free them */
rx_dma = lan966x->rx.dma;
rx_dcbs = lan966x->rx.dcbs;
+ page_pool = lan966x->rx.page_pool;
napi_synchronize(&lan966x->napi);
napi_disable(&lan966x->napi);
@@ -765,11 +767,14 @@ static int lan966x_fdma_reload(struct lan966x *lan966x, int new_mtu)
size = ALIGN(size, PAGE_SIZE);
dma_free_coherent(lan966x->dev, size, rx_dcbs, rx_dma);
+ page_pool_destroy(page_pool);
+
lan966x_fdma_wakeup_netdev(lan966x);
napi_enable(&lan966x->napi);
return err;
restore:
+ lan966x->rx.page_pool = page_pool;
lan966x->rx.dma = rx_dma;
lan966x->rx.dcbs = rx_dcbs;
lan966x_fdma_rx_start(&lan966x->rx);
@@ -876,5 +881,6 @@ void lan966x_fdma_deinit(struct lan966x *lan966x)
lan966x_fdma_rx_free_pages(&lan966x->rx);
lan966x_fdma_rx_free(&lan966x->rx);
+ page_pool_destroy(lan966x->rx.page_pool);
lan966x_fdma_tx_free(&lan966x->tx);
}
@@ -9,6 +9,7 @@
#include <linux/phy.h>
#include <linux/phylink.h>
#include <linux/ptp_clock_kernel.h>
+#include <net/page_pool.h>
#include <net/pkt_cls.h>
#include <net/pkt_sched.h>
#include <net/switchdev.h>
@@ -162,6 +163,8 @@ struct lan966x_rx {
u8 page_order;
u8 channel_id;
+
+ struct page_pool *page_pool;
};
struct lan966x_tx_dcb_buf {