Message ID | 1371680432.1956.116.camel@bwh-desktop.uk.level5networks.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, 2013-06-19 at 23:20 +0100, Ben Hutchings wrote: > When we pre-allocate skbs, the sk_buff structures themselves will be > cache-hot (having just been filled in) but the header area will be > cache-cold. The skb cache is small and we're likely to use each skb > quite soon, so prefetch the start of the header area, where we will > copy the headers, and the skb_shared_info structure, where we will > store page fragments. > > Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> > --- > drivers/net/ethernet/sfc/rx.c | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c > index a28f2fe..0dc4a80 100644 > --- a/drivers/net/ethernet/sfc/rx.c > +++ b/drivers/net/ethernet/sfc/rx.c > @@ -106,11 +106,17 @@ void efx_rx_config_page_split(struct efx_nic *efx) > > static void efx_refill_skb_cache(struct efx_rx_queue *rx_queue) > { > + struct sk_buff *skb; > int i; > + > for (i = 0; i < SKB_CACHE_SIZE; ++i) { > - rx_queue->skb_cache[i] = > - netdev_alloc_skb(rx_queue->efx->net_dev, > - EFX_SKB_HEADERS + EFX_PAGE_SKB_ALIGN); > + skb = netdev_alloc_skb(rx_queue->efx->net_dev, > + EFX_SKB_HEADERS + EFX_PAGE_SKB_ALIGN); > + rx_queue->skb_cache[i] = skb; > + if (skb) { > + prefetch(skb->data); This is really suspect. This part will be dirtied anyway by the NIC once frame is received. > + prefetch(skb_shinfo(skb)); > + } > } > rx_queue->skb_cache_next_unused = 0; > } > > Quite frankly this cache sounds suspect to me. Why is it needed only for sfc ? Have you tried build_skb() instead ? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-06-19 at 15:36 -0700, Eric Dumazet wrote: > On Wed, 2013-06-19 at 23:20 +0100, Ben Hutchings wrote: > > When we pre-allocate skbs, the sk_buff structures themselves will be > > cache-hot (having just been filled in) but the header area will be > > cache-cold. The skb cache is small and we're likely to use each skb > > quite soon, so prefetch the start of the header area, where we will > > copy the headers, and the skb_shared_info structure, where we will > > store page fragments. > > > > Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> > > --- > > drivers/net/ethernet/sfc/rx.c | 12 +++++++++--- > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c > > index a28f2fe..0dc4a80 100644 > > --- a/drivers/net/ethernet/sfc/rx.c > > +++ b/drivers/net/ethernet/sfc/rx.c > > @@ -106,11 +106,17 @@ void efx_rx_config_page_split(struct efx_nic *efx) > > > > static void efx_refill_skb_cache(struct efx_rx_queue *rx_queue) > > { > > + struct sk_buff *skb; > > int i; > > + > > for (i = 0; i < SKB_CACHE_SIZE; ++i) { > > - rx_queue->skb_cache[i] = > > - netdev_alloc_skb(rx_queue->efx->net_dev, > > - EFX_SKB_HEADERS + EFX_PAGE_SKB_ALIGN); > > + skb = netdev_alloc_skb(rx_queue->efx->net_dev, > > + EFX_SKB_HEADERS + EFX_PAGE_SKB_ALIGN); > > + rx_queue->skb_cache[i] = skb; > > + if (skb) { > > + prefetch(skb->data); > > This is really suspect. > > This part will be dirtied anyway by the NIC once frame is received. This is not used for DMA. > > + prefetch(skb_shinfo(skb)); > > + } > > } > > rx_queue->skb_cache_next_unused = 0; > > } > > > > > > Quite frankly this cache sounds suspect to me. > > Why is it needed only for sfc ? > > Have you tried build_skb() instead ? You can't use build_skb() if you put multiple RX buffers in a page. See <1365805319.2791.18.camel@bwh-desktop.uk.solarflarecom.com>. Ben.
On Wed, 2013-06-19 at 23:39 +0100, Ben Hutchings wrote: > You can't use build_skb() if you put multiple RX buffers in a page. See > <1365805319.2791.18.camel@bwh-desktop.uk.solarflarecom.com>. Well, you use small skb (used to have 64 bytes headroom and you changed it to 128) and attach a frag on it, so anyway build_skb() brings nothing. I suspect you have a win on workloads with flood of rx, but on moderate load sk_buff will be cold, so the cache adds latencies. My feeling is that this kind of tradeoff should not be per driver. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-06-19 at 15:56 -0700, Eric Dumazet wrote: > On Wed, 2013-06-19 at 23:39 +0100, Ben Hutchings wrote: > > > You can't use build_skb() if you put multiple RX buffers in a page. See > > <1365805319.2791.18.camel@bwh-desktop.uk.solarflarecom.com>. > > Well, you use small skb (used to have 64 bytes headroom and you changed > it to 128) and attach a frag on it, so anyway build_skb() brings > nothing. > > I suspect you have a win on workloads with flood of rx, but on moderate > load sk_buff will be cold, so the cache adds latencies. > > My feeling is that this kind of tradeoff should not be per driver. Jon's testing showed a significant latency improvement. In Linux 3.10 sfc has reduced UDP performance (both latency and throughput) as we removed the support for skb pre-allocation in preparation for the implementation for DMA scatter and recycling of IOMMU-mapped buffers. The RX changes in this series are intended to recover that, and our benchmark results suggest that we succeeded. Ben.
diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c index a28f2fe..0dc4a80 100644 --- a/drivers/net/ethernet/sfc/rx.c +++ b/drivers/net/ethernet/sfc/rx.c @@ -106,11 +106,17 @@ void efx_rx_config_page_split(struct efx_nic *efx) static void efx_refill_skb_cache(struct efx_rx_queue *rx_queue) { + struct sk_buff *skb; int i; + for (i = 0; i < SKB_CACHE_SIZE; ++i) { - rx_queue->skb_cache[i] = - netdev_alloc_skb(rx_queue->efx->net_dev, - EFX_SKB_HEADERS + EFX_PAGE_SKB_ALIGN); + skb = netdev_alloc_skb(rx_queue->efx->net_dev, + EFX_SKB_HEADERS + EFX_PAGE_SKB_ALIGN); + rx_queue->skb_cache[i] = skb; + if (skb) { + prefetch(skb->data); + prefetch(skb_shinfo(skb)); + } } rx_queue->skb_cache_next_unused = 0; }
When we pre-allocate skbs, the sk_buff structures themselves will be cache-hot (having just been filled in) but the header area will be cache-cold. The skb cache is small and we're likely to use each skb quite soon, so prefetch the start of the header area, where we will copy the headers, and the skb_shared_info structure, where we will store page fragments. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> --- drivers/net/ethernet/sfc/rx.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)