Message ID | 5592D105.7080006@certsign.ro |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Le 30/06/2015 19:25, Nicolae Rosia a écrit : > > Signed-off-by: Nicolae Rosia <nicolae.rosia@certsign.ro> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Thanks, bye. > --- > drivers/net/ethernet/cadence/macb.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c > index caeb395..dbb5160 100644 > --- a/drivers/net/ethernet/cadence/macb.c > +++ b/drivers/net/ethernet/cadence/macb.c > @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev) > while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) { > p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ; > pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl); > - skb = netdev_alloc_skb(dev, pktlen + 2); > + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN); > if (skb) { > - skb_reserve(skb, 2); > + skb_reserve(skb, NET_IP_ALIGN); > memcpy(skb_put(skb, pktlen), p_recv, pktlen); > > skb->protocol = eth_type_trans(skb, dev); >
On Tue, 2015-06-30 at 20:25 +0300, Nicolae Rosia wrote: > Signed-off-by: Nicolae Rosia <nicolae.rosia@certsign.ro> > --- > drivers/net/ethernet/cadence/macb.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c > index caeb395..dbb5160 100644 > --- a/drivers/net/ethernet/cadence/macb.c > +++ b/drivers/net/ethernet/cadence/macb.c > @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev) > while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) { > p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ; > pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl); > - skb = netdev_alloc_skb(dev, pktlen + 2); > + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN); > if (skb) { > - skb_reserve(skb, 2); > + skb_reserve(skb, NET_IP_ALIGN); > memcpy(skb_put(skb, pktlen), p_recv, pktlen); > > skb->protocol = eth_type_trans(skb, dev); Then please use netdev_alloc_skb_ip_align(), so that you get rid of skb_reserve() -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/01/2015 01:56 PM, Eric Dumazet wrote: > Then please use netdev_alloc_skb_ip_align(), so that you get rid of > skb_reserve() Thank you for the suggestion. I can do that. A related question, should I also replace netdev_alloc with napi_alloc_skb in places where I have a napi struct? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-07-01 at 16:29 +0300, Nicolae Rosia wrote: > On 07/01/2015 01:56 PM, Eric Dumazet wrote: > > Then please use netdev_alloc_skb_ip_align(), so that you get rid of > > skb_reserve() > Thank you for the suggestion. > I can do that. > A related question, should I also replace netdev_alloc with > napi_alloc_skb in places where I have a napi struct? I really doubt this adapter can process millions of packets per second ? I would rather enable GRO, it would be more useful. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/01/2015 04:44 PM, Eric Dumazet wrote: > I really doubt this adapter can process millions of packets per second ? I was suggesting this since I was taking into consideration the comment from skbuff.c, "we can save several CPU cycles by avoiding having to disable and re-enable IRQs." Is there a downside to this? > > I would rather enable GRO, it would be more useful. I had no idea what GRO is, so I have read about it [0] and looked at a couple of drivers which use it. They all seem to replace netif_receive_skb with napi_gro_receive and when there are no more packets in napi_pool they call napi_gro_flush. Is it that simple? Regards [0] https://lwn.net/Articles/358910/ -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-07-01 at 17:29 +0300, Nicolae Rosia wrote: > On 07/01/2015 04:44 PM, Eric Dumazet wrote: > > I really doubt this adapter can process millions of packets per second ? > I was suggesting this since I was taking into consideration the comment > from skbuff.c, "we can save several CPU cycles by avoiding having to > disable and re-enable IRQs." > Is there a downside to this? This only matters in terms of few nano seconds per packet, so for 10Gb+ NIC anyway. Absolute noise for most NIC. > > > > > I would rather enable GRO, it would be more useful. > I had no idea what GRO is, so I have read about it [0] and looked at a > couple of drivers which use it. They all seem to > replace netif_receive_skb with napi_gro_receive and when there are no > more packets in napi_pool they call napi_gro_flush. > Is it that simple? Yes, but main question is : Do you have the hardware to test your changes ? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 1, 2015 at 6:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote: [...] > This only matters in terms of few nano seconds per packet, so for 10Gb+ > NIC anyway. Absolute noise for most NIC. > I'll give it a try and benchmark. I achieved a huge speedup by moving TX into napi [0], but my hardware doesn't support multiple TX queues and I can't test that situation. > Yes, but main question is : Do you have the hardware to test your > changes ? Yes, I have a Xilinx ZC706 board with a Zynq7 XC7Z045 processor [1] [0] https://patchwork.ozlabs.org/patch/488949/ [1] http://www.xilinx.com/products/boards-and-kits/ek-z7-zc706-g.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-07-01 at 12:56 +0200, Eric Dumazet wrote: > On Tue, 2015-06-30 at 20:25 +0300, Nicolae Rosia wrote: > > diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c [] > > @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev) > > while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) { > > p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ; > > pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl); > > - skb = netdev_alloc_skb(dev, pktlen + 2); > > + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN); > > if (skb) { > > - skb_reserve(skb, 2); > > + skb_reserve(skb, NET_IP_ALIGN); > > memcpy(skb_put(skb, pktlen), p_recv, pktlen); > > > > skb->protocol = eth_type_trans(skb, dev); > > Then please use netdev_alloc_skb_ip_align(), so that you get rid of > skb_reserve() It seems there are ~50 of these in the kernel tree that could be converted. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-07-01 at 10:06 -0700, Joe Perches wrote: > On Wed, 2015-07-01 at 12:56 +0200, Eric Dumazet wrote: > > On Tue, 2015-06-30 at 20:25 +0300, Nicolae Rosia wrote: > > > diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c > [] > > > @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev) > > > while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) { > > > p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ; > > > pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl); > > > - skb = netdev_alloc_skb(dev, pktlen + 2); > > > + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN); > > > if (skb) { > > > - skb_reserve(skb, 2); > > > + skb_reserve(skb, NET_IP_ALIGN); > > > memcpy(skb_put(skb, pktlen), p_recv, pktlen); > > > > > > skb->protocol = eth_type_trans(skb, dev); > > > > Then please use netdev_alloc_skb_ip_align(), so that you get rid of > > skb_reserve() > > It seems there are ~50 of these in the kernel tree > that could be converted. > Make sure the 2 is really NET_IP_ALIGN Some hardwares need 2, even if NET_IP_ALIGN is 0 (on x86 arches for example) I would rather not touch this without testing the change on real hardware. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2015-07-02 at 00:13 +0200, Eric Dumazet wrote: > On Wed, 2015-07-01 at 10:06 -0700, Joe Perches wrote: > > On Wed, 2015-07-01 at 12:56 +0200, Eric Dumazet wrote: > > > On Tue, 2015-06-30 at 20:25 +0300, Nicolae Rosia wrote: > > > > diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c > > [] > > > > @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev) > > > > while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) { > > > > p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ; > > > > pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl); > > > > - skb = netdev_alloc_skb(dev, pktlen + 2); > > > > + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN); > > > > if (skb) { > > > > - skb_reserve(skb, 2); > > > > + skb_reserve(skb, NET_IP_ALIGN); > > > > memcpy(skb_put(skb, pktlen), p_recv, pktlen); > > > > > > > > skb->protocol = eth_type_trans(skb, dev); > > > > > > Then please use netdev_alloc_skb_ip_align(), so that you get rid of > > > skb_reserve() > > > > It seems there are ~50 of these in the kernel tree > > that could be converted. > > > > Make sure the 2 is really NET_IP_ALIGN > > Some hardwares need 2, even if NET_IP_ALIGN is 0 (on x86 arches for > example) > > I would rather not touch this without testing the change on real > hardware. Nor I really. Most all of those are in fairly old hardware drivers. I just wanted to point out that more exist. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet > Sent: 01 July 2015 23:14 > To: Joe Perches ... > > > Then please use netdev_alloc_skb_ip_align(), so that you get rid of > > > skb_reserve() > > > > It seems there are ~50 of these in the kernel tree > > that could be converted. > > > > Make sure the 2 is really NET_IP_ALIGN > > Some hardwares need 2, even if NET_IP_ALIGN is 0 (on x86 arches for > example) Even on x86 aligning the ethernet receive data on a 4n+2 boundary is likely to give marginally better performance than aligning on a 4n boundary. David
On Fri, 2015-07-03 at 16:18 +0000, David Laight wrote: > Even on x86 aligning the ethernet receive data on a 4n+2 > boundary is likely to give marginally better performance > than aligning on a 4n boundary. You are coming late to the party. Intel guys decided to change NET_IP_ALIGN to 0 (it was 2 in the past) commit ea812ca1b06113597adcd8e70c0f84a413d97544 Author: Alexander Duyck <alexander.h.duyck@intel.com> Date: Tue Jun 29 18:38:00 2010 +0000 x86: Align skb w/ start of cacheline on newer core 2/Xeon Arch x86 architectures can handle unaligned accesses in hardware, and it has been shown that unaligned DMA accesses can be expensive on Nehalem architectures. As such we should overwrite NET_IP_ALIGN to resolve this issue. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: David S. Miller <davem@davemloft.net> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet [mailto:eric.dumazet@gmail.com] > Sent: 03 July 2015 17:39 > On Fri, 2015-07-03 at 16:18 +0000, David Laight wrote: > > > Even on x86 aligning the ethernet receive data on a 4n+2 > > boundary is likely to give marginally better performance > > than aligning on a 4n boundary. > > You are coming late to the party. I've been to many parties at many different times.... Going back many years, Sun's original sbus DMA part generated a lot of single sbus transfers for 4n+2 aligned buffers - so it was necessary to do a 'realignment' copy. The later DMA+ (definitely the DMA2) part did sbus burst transfers even when the buffer was 4n+2 aligned. So with the later parts you could correctly align the buffer. > Intel guys decided to change NET_IP_ALIGN to 0 (it was 2 in the past) ... > x86: Align skb w/ start of cacheline on newer core 2/Xeon Arch > > x86 architectures can handle unaligned accesses in hardware, and it has > been shown that unaligned DMA accesses can be expensive on Nehalem > architectures. As such we should overwrite NET_IP_ALIGN to resolve > this issue. My 2 cents: I'd have thought it would depend on the nature of the 'DMA' requests generated by the hardware - so ethernet hardware dependant. The above may be correct for PCI masters - especially those that do paired 16bit accesses for every 32bit word. If the hardware generated cache line aligned PCI bursts I wouldn't have thought it would matter. I doubt it is valid for PCIe transfers - where the ethernet frame will be split into (probably) 128byte TLPs. Even if it starts on a 64n+2 boundary the splits will be on 64 byte boundaries since the first and last 32bit words of the TLP have separate byte enables. So I'd expect to see a cache line RMW for the first and last cache lines - That may, or may not, be slower than the misaligned accesses for the entire frame (1 clock data delay per access?) Of course, modern nics will write 2 bytes of 'crap' before the frame. Rounding up the transfer to the end of a cache line might also help (especially if only a few extra words are needed). David
diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c index caeb395..dbb5160 100644 --- a/drivers/net/ethernet/cadence/macb.c +++ b/drivers/net/ethernet/cadence/macb.c @@ -2554,9 +2554,9 @@ static void at91ether_rx(struct net_device *dev) while (lp->rx_ring[lp->rx_tail].addr & MACB_BIT(RX_USED)) { p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ; pktlen = MACB_BF(RX_FRMLEN, lp->rx_ring[lp->rx_tail].ctrl); - skb = netdev_alloc_skb(dev, pktlen + 2); + skb = netdev_alloc_skb(dev, pktlen + NET_IP_ALIGN); if (skb) { - skb_reserve(skb, 2); + skb_reserve(skb, NET_IP_ALIGN); memcpy(skb_put(skb, pktlen), p_recv, pktlen); skb->protocol = eth_type_trans(skb, dev);
Signed-off-by: Nicolae Rosia <nicolae.rosia@certsign.ro> --- drivers/net/ethernet/cadence/macb.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)