diff mbox series

答复: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition

Message ID bb3c3846334744d7bbe83b1a29eaa762@baidu.com
State Changes Requested
Delegated to: David Miller
Headers show
Series 答复: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition | expand

Commit Message

Li RongQing Dec. 10, 2019, 9:39 a.m. UTC
> 
> static int mvneta_create_page_pool(struct mvneta_port *pp,
> 				   struct mvneta_rx_queue *rxq, int size) {
> 	struct bpf_prog *xdp_prog = READ_ONCE(pp->xdp_prog);
> 	struct page_pool_params pp_params = {
> 		.order = 0,
> 		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> 		.pool_size = size,
> 		.nid = cpu_to_node(0),

This kind of device should only be installed to vendor's platform which did not support numa

But as you say , Saeed advice maybe cause that recycle always fail, if nid is configured like upper, and different from running NAPI node id

And maybe we can catch this case by the below


-Li

> 		.dev = pp->dev->dev.parent,
> 		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
> 		.offset = pp->rx_offset_correction,
> 		.max_len = MVNETA_MAX_RX_BUF_SIZE,
> 	};
> 
> the pool->p.nid is not NUMA_NO_NODE, then the node of page allocated for rx
> may not be numa_mem_id() when running in the NAPI polling, because
> pool->p.nid is not the same as the node of cpu running in the NAPI polling.
> 
> Does the page pool support recycling for above case?
> 
> Or we "fix' the above case by setting pool->p.nid to
> NUMA_NO_NODE/dev_to_node(), or by calling pool_update_nid() in NAPI
> polling as mlx5 does?
> 
> 
> >
> > Thanks,
> > Saeed.
> >
> >
> >
> >
> >
> >
> >

Comments

Ilias Apalodimas Dec. 10, 2019, 2:52 p.m. UTC | #1
Hi, 

On Tue, Dec 10, 2019 at 09:39:14AM +0000, Li,Rongqing wrote:
> > 
> > static int mvneta_create_page_pool(struct mvneta_port *pp,
> > 				   struct mvneta_rx_queue *rxq, int size) {
> > 	struct bpf_prog *xdp_prog = READ_ONCE(pp->xdp_prog);
> > 	struct page_pool_params pp_params = {
> > 		.order = 0,
> > 		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > 		.pool_size = size,
> > 		.nid = cpu_to_node(0),
> 
> This kind of device should only be installed to vendor's platform which did not support numa
> 
> But as you say , Saeed advice maybe cause that recycle always fail, if nid is configured like upper, and different from running NAPI node id
> 
> And maybe we can catch this case by the below
> 
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 3c8b51ccd1c1..973235c09487 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -328,6 +328,11 @@ static bool pool_page_reusable(struct page_pool *pool, struct page *page)
>  void __page_pool_put_page(struct page_pool *pool, struct page *page,
>                           unsigned int dma_sync_size, bool allow_direct)
>  {
> +       allow_direct = allow_direct && in_serving_softirq();
> +
> +       if (allow_direct)
> +               WARN_ON_ONCE((pool->p.nid != NUMA_NO_NODE) &&
> +                                    (pool->p.nid != numa_mem_id()));

Isn't this a duplicate of what Saeed proposed?
Instead of marking the page as recyclable we just throw a warning here.

>         /* This allocator is optimized for the XDP mode that uses
>          * one-frame-per-page, but have fallbacks that act like the
>          * regular page allocator APIs.
> @@ -342,7 +347,7 @@ void __page_pool_put_page(struct page_pool *pool, struct page *page,
>                         page_pool_dma_sync_for_device(pool, page,
>                                                       dma_sync_size);
>  

Thanks
/Ilias
Saeed Mahameed Dec. 10, 2019, 7:56 p.m. UTC | #2
On Tue, 2019-12-10 at 09:39 +0000, Li,Rongqing wrote:
> > static int mvneta_create_page_pool(struct mvneta_port *pp,
> > 				   struct mvneta_rx_queue *rxq, int
> > size) {
> > 	struct bpf_prog *xdp_prog = READ_ONCE(pp->xdp_prog);
> > 	struct page_pool_params pp_params = {
> > 		.order = 0,
> > 		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > 		.pool_size = size,
> > 		.nid = cpu_to_node(0),
> 
> This kind of device should only be installed to vendor's platform
> which did not support numa
> 
> But as you say , Saeed advice maybe cause that recycle always fail,
> if nid is configured like upper, and different from running NAPI node
> id
> 

I don't see an issue here, By definition recycling must fail in such
case :).

> And maybe we can catch this case by the below
> 
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 3c8b51ccd1c1..973235c09487 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -328,6 +328,11 @@ static bool pool_page_reusable(struct page_pool
> *pool, struct page *page)
>  void __page_pool_put_page(struct page_pool *pool, struct page *page,
>                           unsigned int dma_sync_size, bool
> allow_direct)
>  {
> +       allow_direct = allow_direct && in_serving_softirq();
> +
> +       if (allow_direct)
> +               WARN_ON_ONCE((pool->p.nid != NUMA_NO_NODE) &&
> +                                    (pool->p.nid != numa_mem_id()));
>         /* This allocator is optimized for the XDP mode that uses
>          * one-frame-per-page, but have fallbacks that act like the
>          * regular page allocator APIs.
> @@ -342,7 +347,7 @@ void __page_pool_put_page(struct page_pool *pool,
> struct page *page,
>                         page_pool_dma_sync_for_device(pool, page,
>                                                       dma_sync_size);
>  
> -               if (allow_direct && in_serving_softirq())
> +               if (allow_direct)
>                         if (__page_pool_recycle_direct(page, pool))
>                                 return;
> 
> 

too much data path complications for my taste, we need to establish
some assumptions.
1) user know what he is doing, MSIX/NAPI affinity should be as hinted
by driver, pool nid should correspond to the affinity hint.
2) if necessary mitigate NAPI numa migration via page_pool_update_nid()
diff mbox series

Patch

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 3c8b51ccd1c1..973235c09487 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -328,6 +328,11 @@  static bool pool_page_reusable(struct page_pool *pool, struct page *page)
 void __page_pool_put_page(struct page_pool *pool, struct page *page,
                          unsigned int dma_sync_size, bool allow_direct)
 {
+       allow_direct = allow_direct && in_serving_softirq();
+
+       if (allow_direct)
+               WARN_ON_ONCE((pool->p.nid != NUMA_NO_NODE) &&
+                                    (pool->p.nid != numa_mem_id()));
        /* This allocator is optimized for the XDP mode that uses
         * one-frame-per-page, but have fallbacks that act like the
         * regular page allocator APIs.
@@ -342,7 +347,7 @@  void __page_pool_put_page(struct page_pool *pool, struct page *page,
                        page_pool_dma_sync_for_device(pool, page,
                                                      dma_sync_size);
 
-               if (allow_direct && in_serving_softirq())
+               if (allow_direct)
                        if (__page_pool_recycle_direct(page, pool))
                                return;