Message ID:    1411654669.16953.17.camel@edumazet-glaptop2.roam.corp.google.com
State:         Accepted, archived
Delegated to:  David Miller
On Thu, 25 Sep 2014 07:17:49 -0700 Eric Dumazet <eric.dumazet@gmail.com> wrote:

> From: Eric Dumazet <edumazet@google.com>
>
> Tested with a modified pktgen version, I got a 40% increase of
> throughput.

Pktgen is artificial benchmarking, you know ;-) :-P

If you really must use pktgen for this, you can use an unmodified
pktgen against the qdisc layer by adding a VLAN interface on top of the
device you want to test. I blame John, for telling me this ;-)

/me running away
On Thu, 2014-09-25 at 16:43 +0200, Jesper Dangaard Brouer wrote:
> Pktgen is artificial benchmarking, you know ;-) :-P
>
> If you really must use pktgen for this, you can use an unmodified
> pktgen against the qdisc layer by adding a VLAN interface on top of
> the device you want to test. I blame John, for telling me this ;-)
>
> /me running away

But... nothing in the qdisc layer sets xmit_more yet, unless TSO is
disabled and a GSO packet is segmented.

So you might notice some improvement with this patch, but not if you use
TSO, and a 40GB NIC is better with TSO on, you know ;)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2014-09-25 at 08:06 -0700, Eric Dumazet wrote:
> But... nothing in the qdisc layer sets xmit_more yet, unless TSO is
> disabled and a GSO packet is segmented.
>
> So you might notice some improvement with this patch, but not if you
> use TSO, and a 40GB NIC is better with TSO on, you know ;)

So playing with TSO=off (but GSO=on), we see about a 12 % increase of
throughput on a TCP_STREAM flow.

And the kernel profile clearly shows the difference between sending the
doorbell or not, as mlx4_en_xmit() shifts down. Note also that
mlx4_en_free_tx_desc() disappears; this might point to some false
sharing or something else worth investigating.

Before patch :

    26.36%  [kernel]  [k] __copy_skb_header
    16.59%  [kernel]  [k] mlx4_en_xmit
     5.62%  [kernel]  [k] __alloc_skb
     5.48%  [kernel]  [k] copy_user_enhanced_fast_string
     4.48%  [kernel]  [k] skb_segment
     2.46%  [kernel]  [k] mlx4_en_free_tx_desc.isra.27
     2.31%  [kernel]  [k] _raw_spin_lock
     2.10%  [kernel]  [k] memcpy
     2.01%  [kernel]  [k] tcp_sendmsg
     1.62%  [kernel]  [k] __iowrite64_copy

After patch :

    32.78%  [kernel]  [k] __copy_skb_header
     8.26%  [kernel]  [k] mlx4_en_xmit
     7.25%  [kernel]  [k] __alloc_skb
     7.18%  [kernel]  [k] copy_user_enhanced_fast_string
     4.39%  [kernel]  [k] skb_segment
     2.87%  [kernel]  [k] memcpy
     2.59%  [kernel]  [k] tcp_sendmsg
     2.50%  [kernel]  [k] _raw_spin_lock
     2.38%  [kernel]  [k] dev_hard_start_xmit
     1.52%  [kernel]  [k] tcp_gso_segment
     1.50%  [kernel]  [k] kmem_cache_alloc_node_trace
     1.40%  [kernel]  [k] kmem_cache_alloc_node
     1.16%  [kernel]  [k] ip_send_check
On Thu, 2014-09-25 at 08:36 -0700, Eric Dumazet wrote:
> So playing with TSO=off (but GSO=on), we see about a 12 % increase of
> throughput on a TCP_STREAM flow.

Correction, this was a 20 % increase. (849284 pps -> 1023469 pps)
On Thu, Sep 25, 2014 at 7:17 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> skb->xmit_more tells us if another skb is coming next.
>
> We need to send doorbell when : xmit_more is not set,
> or txqueue is stopped (preventing next skb to come immediately)
>
> Tested with a modified pktgen version, I got a 40% increase of
> throughput.

This is awesome!

I've been hacking pktgen as well, based on Jesper's earlier patch, but
in a slightly different way. Sounds like you already have a working
pktgen with xmit_more. Do you mind sharing it? Even a rough patch would
be great. Thanks
On Thu, 2014-09-25 at 08:53 -0700, Alexei Starovoitov wrote:
> I've been hacking pktgen as well, based on Jesper's earlier patch, but
> in a slightly different way. Sounds like you already have a working
> pktgen with xmit_more. Do you mind sharing it? Even a rough patch
> would be great.

The plan was to add a 'pburst x' parameter.

Because not only do we want to avoid sending the doorbell, we could also
avoid taking the spinlock for every start_xmit().

We also can enforce a global [p]rate (say 100000 packets per second),
but still allow pktgen to send bursts of x packets to use (and test)
this xmit_more.

Nothing ready yet ;)
On Thu, Sep 25, 2014 at 9:05 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> The plan was to add a 'pburst x' parameter.
>
> Because not only do we want to avoid sending the doorbell, we could
> also avoid taking the spinlock for every start_xmit().

Exactly! I also hacked the atomic_inc(&(pkt_dev->skb->users)) into a
single atomic_add of N packets for bursting, to amortize the cost of
both the spin_lock and the lock xadd.

> We also can enforce a global [p]rate (say 100000 packets per second),
> but still allow pktgen to send bursts of x packets to use (and test)
> this xmit_more.

That would be nice. I'm not sure how yet.

> Nothing ready yet ;)

Can't wait :)
On Thu, 2014-09-25 at 08:36 -0700, Eric Dumazet wrote:
> 26.36% [kernel] [k] __copy_skb_header
Note to myself:

Time to optimize this thing, doing a single memcpy() to copy a bunch of
consecutive fields.
On Thu, 2014-09-25 at 08:36 -0700, Eric Dumazet wrote:
> After patch :
>
>     32.78%  [kernel]  [k] __copy_skb_header

After the __copy_skb_header() optimization I get something even nicer ;)

    19.90%  [kernel]  [k] __copy_skb_header
    15.16%  [kernel]  [k] skb_segment
     7.02%  [kernel]  [k] __alloc_skb
     6.89%  [kernel]  [k] copy_user_enhanced_fast_string
     6.60%  [kernel]  [k] mlx4_en_xmit
     2.70%  [kernel]  [k] _raw_spin_lock
     2.68%  [kernel]  [k] memcpy
     2.41%  [kernel]  [k] tcp_sendmsg

I'll post this patch asap.
On Thu, 2014-09-25 at 13:47 -0700, Eric Dumazet wrote:
> After patch :
>     32.78%  [kernel]  [k] __copy_skb_header
>
> After the __copy_skb_header() optimization I get something even nicer ;)
>     19.90%  [kernel]  [k] __copy_skb_header
>
> I'll post this patch asap.

Eric, you are _killing_ it. Network ninja indeed...
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 25 Sep 2014 07:17:49 -0700

> From: Eric Dumazet <edumazet@google.com>
>
> skb->xmit_more tells us if another skb is coming next.
>
> We need to send doorbell when : xmit_more is not set,
> or txqueue is stopped (preventing next skb to come immediately)
>
> Tested with a modified pktgen version, I got a 40% increase of
> throughput.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index c44f4237b9be..adedc47e947d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -667,6 +667,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	int lso_header_size;
 	void *fragptr;
 	bool bounce = false;
+	bool send_doorbell;
 
 	if (!priv->port_up)
 		goto tx_drop;
@@ -878,12 +879,16 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	skb_tx_timestamp(skb);
 
-	if (ring->bf_enabled && desc_size <= MAX_BF && !bounce && !vlan_tx_tag_present(skb)) {
+	send_doorbell = !skb->xmit_more || netif_xmit_stopped(ring->tx_queue);
+
+	if (ring->bf_enabled && desc_size <= MAX_BF && !bounce &&
+	    !vlan_tx_tag_present(skb) && send_doorbell) {
 		tx_desc->ctrl.bf_qpn |= cpu_to_be32(ring->doorbell_qpn);
 		op_own |= htonl((bf_index & 0xffff) << 8);
-		/* Ensure new descirptor hits memory
-		 * before setting ownership of this descriptor to HW */
+		/* Ensure new descriptor hits memory
+		 * before setting ownership of this descriptor to HW
+		 */
 		wmb();
 		tx_desc->ctrl.owner_opcode = op_own;
@@ -896,12 +901,16 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 		ring->bf.offset ^= ring->bf.buf_size;
 	} else {
-		/* Ensure new descirptor hits memory
-		 * before setting ownership of this descriptor to HW */
+		/* Ensure new descriptor hits memory
+		 * before setting ownership of this descriptor to HW
+		 */
 		wmb();
 		tx_desc->ctrl.owner_opcode = op_own;
-		wmb();
-		iowrite32be(ring->doorbell_qpn, ring->bf.uar->map + MLX4_SEND_DOORBELL);
+		if (send_doorbell) {
+			wmb();
+			iowrite32be(ring->doorbell_qpn,
+				    ring->bf.uar->map + MLX4_SEND_DOORBELL);
+		}
 	}
 
 	return NETDEV_TX_OK;